CSE-6490B
Information Integration
York University
Fall 2013

Small Project
XQuery

This project is to integrate two XML sources using XQuery.
The project should be done in teams
of two or three people.

Do the following.
- Find two XML data sets
that are related topic-wise.
Each should be reasonably large;
say, 500 nodes or so.
- Compose two or three integration tasks in English
(as many tasks as there are members on your team)
that require integration of information from the two sources.
- Devise XQuery queries for each of your integration tasks.
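As a rough illustration of what a task and its query might look like (a sketch only, not a required task), consider "list each book's title with its average rating." The two inline documents below stand in for the two sources, and every element and attribute name in them is made up for the example; with real data, $books and $reviews would instead be bound with doc() on your own files.

```xquery
(: Hypothetical stand-ins for the two sources. :)
let $books :=
  <books>
    <book isbn="111"><title>First Title</title></book>
    <book isbn="222"><title>Second Title</title></book>
  </books>
let $reviews :=
  <reviews>
    <review isbn="111"><rating>4</rating></review>
    <review isbn="111"><rating>5</rating></review>
  </reviews>

(: Join the sources on the shared isbn value. :)
for $b in $books/book
let $r := $reviews/review[@isbn = $b/@isbn]/rating
where exists($r)          (: skip books that have no review in the other source :)
order by $b/title
return
  <book title="{ $b/title }" avg-rating="{ avg($r) }"/>
```

The join condition is the part that usually takes work with real sources, since two independently produced data sets rarely share a clean key.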

Deliver the following in a project report.
- The two data sources as XML files.
- For each task:
  - the integration task in English;
  - your XQuery query for the task; and
  - the evaluation of the XQuery query against the two sources.
- A brief write-up.
  - Attribution: project title, members of the team, etc.
  - Description of the two data sources,
    including where they are from and what they describe.
  - A discussion of the strengths and weaknesses
    of XQuery as an approach to integration,
    as you experienced it in this project.
    (This ought not be more than half a page,
    a couple of paragraphs, in length.)
Do all in plain text (UTF-8 is fine).
Turn it in to godfrey by email.

Here are a few leads on tracking down interesting XML data sources
(data sets).

stand-alone XQuery engine
You will likely want an XQuery engine to play with,
for validating your XQuery queries,
and for producing the results for your integration tasks.
You are welcome to use any available engine,
including an online XQuery engine,
that supports at least XQuery 1.0.
In class,
I have been using
Zorba.
Well-supported, open-source XQuery engines include:
- BaseX:
a native XML database system with XQuery.
(Requires a server to be running.)
- eXist:
a native XML database system with XQuery.
(Requires a server to be running.)
- XQilla:
an XQuery and XPath 2.0 library, with a command-line utility,
built on Xerces-C.
- Zorba:
a stand-alone, command-line XQuery evaluator.
Zorba is up and running on indigo and red!
Special thanks to Paul, Seela,
and the tech team of EECS for putting this up
on short notice for us.
(It will push out to all the PRISM machines within a day.)
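Once an engine is running, a quick first query is to check that a source meets the size guideline above. The sketch below counts nodes in a small inline document, which is a hypothetical stand-in; for a real source, one would bind $src to doc() on the file instead.

```xquery
(: Hypothetical inline document; replace with doc("your-file.xml") for a real source. :)
let $src :=
  document {
    <catalog>
      <item id="1">pen</item>
      <item id="2"/>
    </catalog>
  }
return
  <size elements="{ count($src//*) }"
        attributes="{ count($src//@*) }"
        all-nodes="{ count($src//node()) }"/>
```

Note that //node() does not count attributes, and on a real file it does count whitespace-only text nodes unless the engine strips them, so the element count is often the more meaningful figure.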

in the browser
The other way to run XQuery is to expose it through JavaScript
in a web browser,
such as Firefox,
which has the facilities hidden underneath.
While XPath is readily available through JavaScript, XQuery is not.
This is a bit messy,
but convenient,
and quite fun.
XQIB
provides a JavaScript “library”
that makes running XQuery in the browser accessible.
One grabs mxqueryjs
(which contains mxqueryjs.nocache.js)
and plants it under one's www-site,
or links to it remotely from the www-page containing the XQuery query.
(You could link to mine in the example below.)
See
titles.html
under my www home directory as an example;
look at its source.
It queries the title nodes out of the
bibliography.xml example file
(a local copy, in this case).
This approach has extra complications.
- Because of name space issues,
node tests in the XQuery query have to provide a proper name space,
or wildcard it so that any name space will match.
The default name space within the HTML document is
for the HTML document itself,
not for the source XML document you are querying.
E.g.,
$mydoc//*:title
instead of just
$mydoc//title.
- How to return the results?
Easiest is to modify the HTML page to display them.
This requires modifying the DOM of the page,
and the results should be cast as HTML rather than general XML,
so that the browser renders them correctly.
(The example shows this.)
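Both complications can be tried out in a stand-alone engine first. The sketch below mimics the browser setting by declaring XHTML as the default element name space, which is an assumption about how the page behaves rather than something taken from XQIB itself. The inline $mydoc is a hypothetical stand-in for the source document, and the step of attaching the result to the page's DOM is left out.

```xquery
(: Assumption: unprefixed element names default to XHTML, as they would inside an HTML page. :)
declare default element namespace "http://www.w3.org/1999/xhtml";

(: Hypothetical source document; xmlns="" puts its elements in no name space. :)
let $mydoc :=
  document {
    <bibliography xmlns="">
      <book><title>First Title</title></book>
      <book><title>Second Title</title></book>
    </bibliography>
  }
return
  (: $mydoc//title would look for an XHTML title element and find nothing here. :)
  <ul>{
    for $t in $mydoc//*:title
    return <li>{ string($t) }</li>
  }</ul>
```

Because of the default declaration, the constructed ul and li elements land in the XHTML name space, which is what a browser needs in order to render them as a list.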

parke godfrey