After lunch (which I skipped to exercise) and a nap (ad-hoc, in front of my laptop in the common room), we're on to XML database applications with Chris Wallace. He's starting off with some XQuery examples. The backing data is in an XML database (eXist).
Heh. He says that XQuery and eXist are the most fun he's had in software since Smalltalk, which he's used since 1983. The focus with these tools is on data more than functionality. He's doing all this to explore the design space (XML Databases and Documents). In terms of information systems, the focus here is on semi-structured data (RSS, anyone :) ). The problem space includes spreadsheets, documents, ad-hoc databases, and web-integrated data.
The database he's using supports XQuery, XUpdate, XSLT, XQuery extensions, and free text searching. It supports a RESTful interface (Java servlets), SOAP, and XML-RPC. One of the example applications he's working on is a Faculty Online Database - currently the data exists across Access, SQL databases, flat text files, spreadsheets, etc. The plan is to simplify all that and still support distributed data ownership. The code so far:
- 3000 lines of XQuery
- 3000 lines of XSLT
- 300 lines of XSD (one schema)
- 10 lines of PHP (not much web work done yet)
- 25 pages online thus far
When storing data, he's trying to use "real world" identifiers as much as possible (names, room numbers, etc). That reduces the gap between the real world domain and the system, but it does have issues - you can easily hit duplicates (example: if I mention "Dave Thomas", which one do I mean? pragDave, or Bedarra Dave?).
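To make the duplicate problem concrete, here's a rough sketch of my own (Python rather than XQuery, and not from the talk). The element names and room numbers are made up; the point is only that a name used as an identifier can match more than one record.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical staff records identified by real-world name (not from the talk).
STAFF_XML = """
<staff>
  <person><name>Dave Thomas</name><room>R101</room></person>
  <person><name>Dave Thomas</name><room>R202</room></person>
  <person><name>A. N. Other</name><room>R303</room></person>
</staff>
"""

def duplicate_names(xml_text):
    """Return the names that identify more than one person element."""
    root = ET.fromstring(xml_text)
    counts = Counter(p.findtext("name") for p in root.iter("person"))
    return sorted(name for name, n in counts.items() if n > 1)

print(duplicate_names(STAFF_XML))
```

A check like this only reports the clash; deciding which Dave is meant still needs a second real-world identifier (a room number, say).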
In terms of data, he decided against using attributes and just went with more elements. Integrity? Schema validation is both too weak and too restrictive, and an NXD (native XML database) stores any well-formed XML. Referential integrity? RDBMSs are "eager": integrity failures have to be repaired outside the db. An NXD stores data on demand, so integrity failures can be persisted and the repair happens inside the db. XML IDs are only checked within a document, while the NXD stores all nodes with internal ids.
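Here's how I picture the "store now, repair later" idea, again as my own Python sketch rather than anything shown in the talk. The two documents and their element names are hypothetical: nothing stops a module from naming a leader who has no person record, so the broken reference gets stored and a later query finds it.

```python
import xml.etree.ElementTree as ET

# Two separate hypothetical documents; no reference check happens at store time.
PEOPLE = ET.fromstring(
    "<people><person><name>Dave Thomas</name></person></people>"
)
MODULES = ET.fromstring(
    "<modules>"
    "<module><title>XML Databases</title><leader>Dave Thomas</leader></module>"
    "<module><title>Smalltalk</title><leader>Nobody Known</leader></module>"
    "</modules>"
)

def dangling_leaders(people, modules):
    """List (module title, leader) pairs whose leader has no person record."""
    known = {p.findtext("name") for p in people.iter("person")}
    return [
        (m.findtext("title"), m.findtext("leader"))
        for m in modules.iter("module")
        if m.findtext("leader") not in known
    ]

print(dangling_leaders(PEOPLE, MODULES))
```

In an eager RDBMS the second module would have been rejected on insert; here it sits in the data until someone runs the check and fixes it in place.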
For information systems, veracity of the model is what's important.
Functionality delivered via:
- XQuery generating HTML
- code moving to function libraries and XSLT as it matures
- XQuery for request input, sessions, selection of nodes, computation of views
- XSLT to generate the interface
- CSS for presentation style
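The split in the list above (select nodes and compute a view, then generate the interface, with styling left to CSS) can be sketched roughly as follows. This is my own stand-in in Python, not his XQuery/XSLT; the data, the function names, and the "people" class are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data; in the real system this would live in the XML database.
DATA = ET.fromstring(
    "<staff>"
    "<person><name>Dave Thomas</name><room>R101</room></person>"
    "<person><name>A. N. Other</name><room>R202</room></person>"
    "</staff>"
)

def select_people(root, room):
    """Selection step (the XQuery role): pick person nodes in a given room."""
    return [p for p in root.iter("person") if p.findtext("room") == room]

def render(people):
    """Interface step (the XSLT role): turn the selected nodes into HTML."""
    items = "".join(f"<li>{escape(p.findtext('name'))}</li>" for p in people)
    # The class attribute is the hook a stylesheet would use for presentation.
    return f'<ul class="people">{items}</ul>'

print(render(select_people(DATA, "R101")))
```

Keeping selection and rendering in separate functions mirrors the way the code is said to migrate out of page-generating queries into libraries and stylesheets as it matures.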