Java Buzz Forum - Extensions v Envelopes

Bill de hÓra

Bill de hÓra is a technical architect with Propylon

Extensions v Envelopes

Posted: Nov 28, 2009 8:53 AM

This post originated from an RSS feed registered with Java Buzz by Bill de hÓra.
Original Post: Extensions v Envelopes
Feed Title: Bill de hÓra
Feed URL: http://www.dehora.net/journal/atom.xml
Feed Description: FD85 1117 1888 1681 7689 B5DF E696 885C 20D8 21F8

Here's a sample activity from the OpenSocial REST protocol (v0_9):

<entry xmlns="http://www.w3.org/2005/Atom">
   <id>http://example.org/activities/example.org:87ead8dead6beef/self/af3778</id>
   <title>some activity</title>
   <updated>2008-02-20T23:35:37.266Z</updated>
   <author>
      <uri>urn:guid:example.org:34KJDCSKJN2HHF0DW20394</uri>
      <name>John Smith</name>
   </author>
   <link rel="self" type="application/atom+xml"
      href="http://api.example.org/activity/feeds/.../af3778" />
   <link rel="alternate" type="application/json"
      href="http://example.org/activities/example.org:87ead8dead6beef/self/af3778" />
   <content type="application/xml">
      <activity xmlns="http://ns.opensocial.org/2008/opensocial">
         <id>http://example.org/activities/example.org:87ead8dead6beef/self/af3778</id>
         <title type="html">&lt;a href="foo"&gt;some activity&lt;/a&gt;</title>
         <updated>2008-02-20T23:35:37.266Z</updated>
         <body>Some details for some activity</body>
         <bodyId>383777272</bodyId>
         <url>http://api.example.org/activity/feeds/.../af3778</url>
         <userId>example.org:34KJDCSKJN2HHF0DW20394</userId>
      </activity>
   </content>
</entry>

It's 1.1 kilobytes. I'll call that style "enveloping". Here's an alternative that doesn't embed the activity in the content and instead uses the Atom Entry directly, which I'll call "extending":

<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:os="http://ns.opensocial.org/2008/opensocial">
   <id>http://example.org/activities/example.org:87ead8dead6beef/self/af3778</id>
   <title type="html">&lt;a href="foo"&gt;some activity&lt;/a&gt;</title>
   <updated>2008-02-20T23:35:37.266Z</updated>
   <author>
      <uri>urn:guid:example.org:34KJDCSKJN2HHF0DW20394</uri>
      <name>John Smith</name>
   </author>
   <link rel="self" type="application/atom+xml"
      href="http://api.example.org/activity/feeds/.../af3778" />
   <link rel="alternate" type="application/json"
      href="http://example.org/activities/example.org:87ead8dead6beef/self/af3778" />
   <os:bodyId>383777272</os:bodyId>
   <content>Some details for some activity</content>
</entry>

It's 686 bytes (the activity XML by itself is 460 bytes). As far as I can tell there's no loss of meaning between the two. 545 bytes might not seem worth worrying about, but all that data adds up (very roughly 5.5KB for every 10 activities, or half a megabyte for every 1000), especially for mobile systems, and especially for activity data. I have a long-standing belief that social activity traffic will dwarf what we've seen with blogging and, eventually, email. If you're a real performance nut, the extending style should be faster to parse as well, since the tree is flatter. Extending is akin to the way microformats or RDF inline into HTML, whereas enveloping is akin to how people use SOAP.
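To make the arithmetic concrete, here is a rough sketch in Python. The 1231-byte figure is an assumption: it is the 686 bytes of the extending example plus the 545-byte difference quoted above, which is consistent with "1.1 kilobytes". The function name is made up for illustration.

```python
# Back-of-the-envelope cost of enveloping, using the sizes quoted in the post.
ENVELOPED_BYTES = 1231  # assumption: 686 + 545, i.e. roughly "1.1 kilobytes"
EXTENDED_BYTES = 686


def envelope_overhead(activities: int) -> int:
    """Extra bytes on the wire when every activity uses the enveloping style."""
    return (ENVELOPED_BYTES - EXTENDED_BYTES) * activities


for n in (10, 1000):
    print(f"{n} activities: {envelope_overhead(n)} extra bytes")
```

That works out to 5450 bytes for 10 activities and 545000 bytes for 1000, which is where the "5.5KB" and "half a megabyte" figures come from.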

OK, so that's bytes, and you might not care about the overhead. The bigger problem with using Atom as an envelope is that information gets repeated. Atom has its own required elements and is not a pure envelope format like SOAP. OpenSocial's "os:title", "os:updated", "os:id", "os:url", "os:body" and "os:userId" all have corresponding Atom elements (atom:title, atom:updated, atom:id, atom:link, atom:content, atom:uri). What's really interesting is that only one new element was needed using the extension style, "os:bodyId" (we can have an argument about os:userId; I mapped it to atom:uri because the example does as well by making it a urn). This repetition is an easy source of bugs and dissonance. The cognitive dissonance comes from having to know which "id" or "updated" to look at, but duplicated data also means fragility. What if the updated timestamps are different? Which id/updated pair should I use for sync? Which title? I'm not picking on OpenSocial here, by the way; it's a general problem with leveraging Atom.
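As a rough illustration of that fragility, the following Python sketch compares the duplicated fields of an enveloped entry and reports any that disagree. The sample is a trimmed, hypothetical variant of the enveloped example with the inner timestamp deliberately changed; `find_conflicts` and `DUPLICATED` are names invented here, not part of any Atom or OpenSocial library.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
OS = "http://ns.opensocial.org/2008/opensocial"

# Fields carried both by the Atom entry and by the embedded activity.
DUPLICATED = ("id", "updated")


def find_conflicts(entry_xml: str) -> dict:
    """Return {field: (atom_value, os_value)} for duplicated fields that differ."""
    entry = ET.fromstring(entry_xml)
    activity = entry.find(f"{{{ATOM}}}content/{{{OS}}}activity")
    if activity is None:
        return {}  # not an enveloped entry, so nothing is duplicated
    conflicts = {}
    for field in DUPLICATED:
        atom_value = entry.findtext(f"{{{ATOM}}}{field}")
        os_value = activity.findtext(f"{{{OS}}}{field}")
        if atom_value != os_value:
            conflicts[field] = (atom_value, os_value)
    return conflicts


# Hypothetical sample: same id, but the two "updated" values have drifted apart.
SAMPLE = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.org/activities/af3778</id>
  <title>some activity</title>
  <updated>2008-02-20T23:35:37.266Z</updated>
  <content type="application/xml">
    <activity xmlns="http://ns.opensocial.org/2008/opensocial">
      <id>http://example.org/activities/af3778</id>
      <updated>2008-02-21T08:00:00.000Z</updated>
    </activity>
  </content>
</entry>"""

print(find_conflicts(SAMPLE))
```

The check can tell a consumer that the two timestamps differ, but not which one to trust; that decision has to come from a spec, which is the point.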

I suspect one reason extensions get designed like this is that the format designers have their own XML (or JSON) vocabularies and their own models, and want to preserve them. Designs are more cohesive that way. As far as I can tell, you can pluck the os:activity element right out of atom:content and discard the Atom entry with no information loss, which raises the question: why bother using Atom at all?

There are a couple of reasons. One is that Atom has in the last 4 years become a platform technology as well as a format. Syndication markup now has massive global deployment, probably second only to HTML. Trying to get your pet XML format distributed today without piggybacking on syndication is nigh on impossible. OpenSocial, OpenSearch, Activity Streams, PSHB, Atom Threading, Feed History, Salmon Protocol, OCCI, OData and GData all use Atom as a platform as much as a format. So Atom provides reach.

Another is that Atom syndicates and aggregates data. "Well, duh, it's a syndication format!", you say. But if you take all the custom XML formats and mash them up, all you get is syntactic meltdown. By giving up on domain specificity, aggregation gives a better approach to data distribution. This, I think, is why Activity Streams, OpenSearch and OpenSocial beat custom social networking formats, none of which has become a de facto standard the way, say, S3 has for storage; neither Twitter's nor Facebook's API is de facto (although StatusNet does emulate Twitter). RDF, by being syntax neutral, is even better for data aggregation, but that's another topic and a bit further out into the future.

So. Would it be better to extend the Atom Entry directly? We've had a few years to watch and learn from social platforms and formats being built out on Atom, and I think that direct extension, not enveloping, is the way to go. Which is to say, I'll take a DRY specification over a cohesive domain model and syntax. It does mean having to explain the mapping rules and buying into Atom's (loose) domain model, but this only has to be done once in the extension specification, and it avoids all these "hosting" rules and armies of developers pulling the same data from different fields, which is begging for interop and semantic problems down the line.
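For a sense of what such mapping rules might look like, here is a minimal Python sketch that hoists an enveloped activity into its Atom entry. The rules are assumptions made for illustration, not anything the OpenSocial spec defines: the activity's id, title and updated overwrite the entry's, os:body becomes the atom:content text, os:url and os:userId are dropped on the assumption that the entry's link and author already carry them, and anything else survives as an os: extension element. The names `flatten`, `q`, `OVERWRITES` and `DROPPED` are hypothetical, and the sample is a trimmed version of the enveloped example.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
OS = "http://ns.opensocial.org/2008/opensocial"
ET.register_namespace("", ATOM)
ET.register_namespace("os", OS)

OVERWRITES = ("id", "title", "updated")  # assumed rule: the activity's value wins
DROPPED = ("url", "userId")              # assumed to duplicate atom:link / atom:uri


def q(ns: str, tag: str) -> str:
    """Build the '{namespace}tag' form that ElementTree uses."""
    return f"{{{ns}}}{tag}"


def flatten(entry_xml: str) -> ET.Element:
    """Turn an enveloped entry into an extended one under the rules above."""
    entry = ET.fromstring(entry_xml)
    content = entry.find(q(ATOM, "content"))
    activity = None if content is None else content.find(q(OS, "activity"))
    if activity is None:
        return entry  # nothing is enveloped
    for child in activity:
        name = child.tag.rsplit("}", 1)[-1]
        if name in OVERWRITES:
            target = entry.find(q(ATOM, name))
            if target is None:
                target = ET.SubElement(entry, q(ATOM, name))
            target.text = child.text
            target.attrib.update(child.attrib)
        elif name == "body":
            content.text = child.text
        elif name in DROPPED:
            continue
        else:
            # No Atom equivalent: keep it as an extension element on the entry.
            ET.SubElement(entry, child.tag).text = child.text
    content.remove(activity)
    content.attrib.pop("type", None)  # the content is plain text now
    return entry


ENVELOPED = """<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.org/activities/af3778</id>
  <title>some activity</title>
  <updated>2008-02-20T23:35:37.266Z</updated>
  <content type="application/xml">
    <activity xmlns="http://ns.opensocial.org/2008/opensocial">
      <id>http://example.org/activities/af3778</id>
      <title type="html">&lt;a href="foo"&gt;some activity&lt;/a&gt;</title>
      <updated>2008-02-20T23:35:37.266Z</updated>
      <body>Some details for some activity</body>
      <bodyId>383777272</bodyId>
    </activity>
  </content>
</entry>"""

flat = flatten(ENVELOPED)
print(ET.tostring(flat, encoding="unicode"))
```

Writing the rules down once, like this, is the cost of the extension style; the payoff is that every consumer reads one id, one title and one timestamp.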

I think, in hindsight, some of Atom's required elements act against people mapping into Atom, namely atom:author and atom:title. Those two really show the blogging heritage of Atom rather than the design goal of a well-formed log entry. Even though author is a "Person" construct in Atom, author is a fairly specific role that might not work semantically for people (what does it mean to "author" an activity?). As for atom:title, increasingly important data like tweets, SMS, events, notifications and activities just don't have titles, which means padding the atom:title with some text. The other required elements, atom:id and atom:updated, are generic constructs that I see as unqualified goodness being adopted in custom formats (which is great). The atom:link too is generically useful, with one snag: it can only carry one value in the rel attribute (unlike HTML). So these are problems, but not enough to make me want to use an enveloping pattern.
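The rel snag is easy to work around, at the cost of some repetition. As a small sketch in Python, an HTML-style space-separated rel list can be expanded into one atom:link per value. The helper name `atom_links` and the URL are made up for illustration.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ET.register_namespace("", ATOM)


def atom_links(href, rels, media_type=None):
    """Expand an HTML-style rel list into one atom:link element per rel value."""
    links = []
    for rel in rels.split():
        link = ET.Element(f"{{{ATOM}}}link", {"rel": rel, "href": href})
        if media_type is not None:
            link.set("type", media_type)
        links.append(link)
    return links


# HTML could say rel="alternate related" once; Atom needs two link elements.
links = atom_links("http://example.org/activities/af3778", "alternate related")
for link in links:
    print(ET.tostring(link, encoding="unicode"))
```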
