Summary
An IBM developerWorks article provides a detailed introduction to XQuery, the recently finalized XML query standard. As more database management systems and developer tools support XQuery, obtaining XML directly from the database is becoming a viable alternative to SQL-based querying.
In January 2007, XQuery 1.0 became an official W3C Recommendation, extending related XML standards, such as XML 1.0 and XPath 2.0, with constructs that help extract data from large XML documents. To reflect the latest version of the XQuery specification, Nicholas Chase recently updated his in-depth IBM developerWorks tutorial, Process XML using XQuery (free IBM developerWorks registration required).
Chase notes that,
XQuery is a good example of the interrelation of XML specifications and W3C Recommendations. The XQuery Working Group, along with the XSL Working Group, is also responsible for XPath 2.0, which includes much of the power developed for XQuery. In fact, XQuery is part of an entire group of Recommendations approved together.
According to Chase's tutorial, XQuery can query both XML files, whether on the local file system or retrieved from a remote network location, and database management systems that support an XQuery interface. The latter category includes the major enterprise DBMSs, such as SQL Server, DB2, and Oracle's database.
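As a rough sketch of the file-based case, the standard doc() function takes a URI, so the same path expression can run against a local or a remote document. The local file path below is a hypothetical example, and how a file: URI is resolved depends on the XQuery processor; the remote URL and the book and title element names are borrowed from Chase's example later in this article.

```xquery
(: Sketch only: gather book elements from two sources, then select their titles. :)
(
  doc("file:///tmp/bib.xml")//book,        (: local file; hypothetical path :)
  doc("http://www.bn.com/bib.xml")//book   (: remote document; URL from Chase's example :)
)/title
```

Database-backed sources are usually reached through vendor-specific functions or collections, which are not shown here.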
Using XQuery to retrieve information from XML files seems like a natural choice, since XQuery offers capabilities beyond XPath and similar XML data access APIs. However, according to Chase's article, XQuery also makes sense for retrieving data from a relational database, especially since the XML output can be transformed directly into, say, a final XHTML format by the XQuery API. That XHTML can then be returned to a browser. Chase notes that XQuery provides both data querying and data transformation capabilities, removing the need for an intermediary translation layer, such as XSLT:
FLWOR [FOR-LET-WHERE-ORDER-RETURN] statements are the closest thing that XQuery has to an SQL statement. With FLWOR statements, you can create very specific queries in a more natural way than XPath 1.0 statements did.
For example, consider this request: "Using bib.xml, provide bookInfo elements for the Addison-Wesley books, with the content of each consisting of the title element." Certainly, you could do it with XSLT, but it would be impossible to do using XPath 1.0 by itself. You could select the title elements, but you couldn't add them into the new bookInfo element... Using the FLWOR statement, you can create the desired results:
for $book in doc("http://www.bn.com/bib.xml")//book
let $title := $book/title
where $book/publisher = 'Addison-Wesley'
return
<bookInfo>
{ $title }
</bookInfo>
Chase's article provides more complex examples as well, where the return part of the query produces output that can be directly returned to a presentation layer.
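To illustrate that last point, the query above could be extended so that its return clause emits presentation markup directly. This is a minimal sketch, not an example taken from Chase's tutorial: the order by clause and the ul/li markup are assumptions added here, while the bib.xml URL and the book, publisher, and title element names come from the query above. It also assumes each book has exactly one title.

```xquery
(: Sketch only: build an XHTML-style list straight from the query result. :)
<ul>
{
  for $book in doc("http://www.bn.com/bib.xml")//book
  where $book/publisher = 'Addison-Wesley'
  order by $book/title                       (: sort the matching books by title :)
  return <li>{ data($book/title) }</li>      (: emit only the title text in each item :)
}
</ul>
```

The list elements are deliberately left without an XHTML namespace declaration: declaring a default namespace on the constructor would also change how the unprefixed names in //book and $book/title are resolved inside it.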
While many developers have a natural aversion to using a lot of XML in an application, the XQuery-based approach seems to simplify the business tier. In addition to needing less application-level code, most of the computation is pushed into the database, presumably an ideal location for data-intensive computation.
What do you think of the XQuery-based approach outlined in Chase's article? Where do you see XQuery fitting into your enterprise architecture?
> While many developers have a natural aversion to using a lot of XML in an application, the XQuery-based approach seems to simplify the business tier. In addition to needing less application-level code, most of the computation is pushed into the database, presumably an ideal location for data-intensive computation.
Can you elaborate on this? I can imagine there would be cases where you might want to do XQuery in the DB but why would this be advantageous in general? As was discussed in another of your recent posts, it is often the case that the database is not scaled or scalable.
Why put more work in the database that could easily be done in a server or even on the client? Unless the XML documents are huge, there seems little reason to me to put more load on the DBMS.