RDF hacking for fun and profit
Bill de hÓra
I have to say, after a few years in the wilderness, coming back to RDF to do some hacking has been both fun and instructive. So, what's changed?

The community. I'm slightly older and a lot less cynical about the whole technology after being very excited about RDF around the turn of the century. I became pretty annoyed at the direction the RDF community was taking starting in 2001; by 2002 I had lost much of my interest. During that time I moaned a lot and generally wasn't very helpful (sorry). The other thing that's changed is that the community's expectations seem to have settled to something sane, especially around the extent and value of formal approaches on the Internet. The whole DL and formal-logic gung-ho attitude seems to have eased up a lot in the last two years, thankfully. No doubt some people felt that was a necessary growing pain for the technology, but it was just as much a pain to have really smart KR people tell you you were wrong, wrong, wrong, at various levels of politeness, when you wanted to get something useful out the door and iterate. Especially tough if you knew your AI history and where the whole KR shebang could end up versus what counts for deployment on the Web.

The tools. The tools are so much better now. I've had Jena in a small-scale production environment for over 6 months, acting as the ham in an XMPP and Hibernate sandwich. It works a treat. At some point they might need to go back and clean up the APIs in a breaking way - there's some junk DNA lying about, understandable as the API has travelled through about 2.5 iterations of RDF at this point. But the core implementation seems to be solid. I find 4Suite to be stable software (tho' I'm not sure the RDF stuff is active anymore - Uche et al have been working on anobind most recently). I've been using rdflib and sparta recently and those are very neat. Sparta is in good shape for a 0.7, and the rdflib API is rather beautiful (tuplespace fans will love it - there's a small sketch of it at the end of this section). Dave Beckett's Redland is really impressive; the amount of work that has gone into it is incredible. Short version: the amount of work done by the RDF community in the last couple of years is humbling.

The web. The web is now more machine-oriented than it was a few years ago. Much more. The RDF community saw this would come to pass before anyone else, I think, but perhaps not quite in the way it has turned out - RSS, WS and REST-as-deployed, rather than intelligent software agents. Even so, those technologies are likely to start creaking on the data front - arguably WS and REST-as-deployed already are at that point. As the networking and application protocol work gets bedded down, the new low-hanging fruit becomes extensible data formats sprinkled with semantic constraint pixie dust, rather than type annotations and namespaces (media types remaining useful). RDF-Forms, some people's re-examination of description languages, and the interest in speech acts are just the beginning.

Shipping. I'm not sure how useful RDF is as an explicit data representation compared to XML and relational tables, but as an internal format for applications and machine-level chit-chat it is a decent option that you could be looking at before rolling your own configuration formats. Less code, more data. Now, people will point at how Mozilla's RDF is a millstone (and they would be right), but we are 5 years on from that - the usage idioms are a known quantity today. You can even write something approaching sane RDF/XML once you avoid that nasty striping idiom.
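To make that concrete, here's a minimal sketch of the tuple-oriented style I mean, and of the "less code, more data" angle where a graph doubles as application configuration. I'm assuming a recent rdflib release, and the vocabulary (the APP namespace and its properties) is invented purely for illustration - treat it as a sketch, not a recipe.

```python
# A minimal sketch of rdflib's tuple-oriented API, with an invented
# vocabulary standing in for application configuration/state.
from rdflib import Graph, Literal, Namespace, URIRef

APP = Namespace("http://example.org/app#")  # hypothetical app vocabulary

g = Graph()

# Statements are plain (subject, predicate, object) tuples.
window = URIRef("http://example.org/app#mainWindow")
g.add((window, APP.width, Literal(800)))
g.add((window, APP.height, Literal(600)))
g.add((window, APP.theme, Literal("dark")))

# Querying is tuple matching, much like a tuplespace read;
# None acts as a wildcard in any position.
for s, p, o in g.triples((window, None, None)):
    print(p, "=", o)

# The same graph round-trips through RDF/XML, so there's no need to
# hand-roll yet another configuration format.
g.serialize(destination="state.rdf", format="pretty-xml")
```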
Potential. My current work on a desktop client using RDF to manage application state makes me think that a simple reasoner (a la cwm) could get onto a mobile device within two years, and that such a reasoner is possible now for desktop aggregators, albeit a tough enough programming exercise. And when you're done, what's still needed there is a reporting language from which to drive the views. But if you had all that? Then that could push the kinds of things the folks at Nature have been doing right into the client (the way Nature is using RSS is extremely cool, and also well beyond the commercial state of the art). Everyone would get the equivalent of an embedded SQL engine inside their aggregators, working over their RSS data. Such reasoners, available in consumer-grade software, would turn the industry being built on RSS infrastructure on its head, as the ability to innovate with data would accelerate drastically. Imagine being able to cross-filter and repurpose data on your phone instead of waiting for Technorati, Amazon or Yahoo! to get round to providing a cool new service. Or put another way, why wait for the services when you can generate the same views locally? (And then SMS them to your mates.) There's a rough sketch of that idea at the end of the post. The market emphasis could shift from rich clients to rich data very quickly, and would, I imagine, force Web2.0 businesses to expose their data much more transparently than happens today (otherwise they don't get to participate in the user's views). If that happens, the extensibility models available today in RSS and Atom might not offer any competitive advantage - writing new code and upgrading the aggregator is going to be too slow to matter.

In this regard I think the WinFS approach was boiling the ocean. WinFS is like EAI for the desktop, when a few hacks and a webserver would get most of the way there. It would have been enough to have reporting and searching over incoming RSS data built into the desktop as a first cut. A smarter filesystem could have been done later, after the approach was proved to work and after you had proxied a My Documents feed behind an IIS daemon.

Anyway, enough analyst-speak :) All in all, I would say this RDF stuff is just about ready for a second look. The big question is whether the world can get past the Semantic Web hype and bluster from years gone by to see the value...
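For the curious, here's the rough shape of what I mean by generating those views locally. RSS 1.0 is already RDF, so an aggregator's cached feed data can be loaded into a graph and re-cut on the client. This is plain filtering rather than a cwm-style reasoner, and the details are assumptions: I'm using a recent rdflib, "feeds.rdf" stands in for whatever the aggregator has cached, and the "genomics" tag is invented for the example.

```python
# A rough sketch of a client-side "view" over locally cached RSS 1.0 data.
# RSS 1.0 documents are RDF/XML, so they parse straight into a graph.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

RSS = Namespace("http://purl.org/rss/1.0/")
DC = Namespace("http://purl.org/dc/elements/1.1/")

g = Graph()
g.parse("feeds.rdf")  # stand-in for the aggregator's local cache

# "Every item tagged 'genomics'" - the kind of view you would otherwise
# wait for a hosted service to build for you.
for item in g.subjects(RDF.type, RSS.item):
    if any(str(tag) == "genomics" for tag in g.objects(item, DC.subject)):
        print(g.value(item, RSS.title), g.value(item, RSS.link))
```

A rule engine layered over the same graph (roughly what cwm provides) is what would take this from simple filtering into the reporting-language territory described above.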