The Artima Developer Community
Python Buzz Forum
Post-parsing thoughts

Phillip Pearson

Phillip Pearson is a Python hacker from New Zealand
Post-parsing thoughts Posted: Oct 2, 2003 6:32 PM
This post originated from an RSS feed registered with Python Buzz by Phillip Pearson.
Original Post: Post-parsing thoughts
Feed Title: Second p0st
Feed URL: http://www.myelin.co.nz/post/rss.xml
Feed Description: Tech notes and web hackery from the guy that brought you bzero, Python Community Server, the Blogging Ecosystem and the Internet Topic Exchange
OK, now I've shown how to parse RSS-Data and how to parse namespaced RSS extensions in Python. I'm still not convinced that RSS-Data is a good idea. Parsing the RSS-Data was only barely easier than parsing the straight XML, and only because xmlrpclib's return values are more familiar to me than elementtree's tree objects. I would argue that the time spent hooking together the XML-RPC and RSS parsers in your language of choice would be better spent writing an XML library like elementtree.

There's a good discussion going on re my original post.

---

Roger Benningfield claims that the RSS-Data encoding is more useful by default than the straight XML one, but I'd dispute that. For example, here are two bits of XML which I assert are equally useful without any specific support from an aggregator:

1. RSS-Data

<x:container xmlns:x="http://www.myelin.co.nz/ns/x">
 <sdl:struct>
  <sdl:member>
   <sdl:name>foo</sdl:name>
   <sdl:value><sdl:string>bar</sdl:string></sdl:value>
  </sdl:member>
 </sdl:struct>
</x:container>

This deserialises to {'foo': 'bar'} in Python, and we know that instructions for understanding this information are at http://www.myelin.co.nz/ns/x.
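As a rough sketch of that deserialisation step, the struct can be handed to Python's XML-RPC unmarshaller. This assumes the sdl: prefixes have already been stripped and the fragment is wrapped in a standard methodResponse envelope; the helper name loads_struct is made up for illustration, and xmlrpc.client is the Python 3 name for xmlrpclib. It is one way to do it, not necessarily how any RSS-Data parser works.

```python
import xmlrpc.client

# The struct from the example above, with the sdl: prefixes removed
# (an assumption made so the stock XML-RPC unmarshaller accepts it).
STRUCT_XML = (
    "<struct><member><name>foo</name>"
    "<value><string>bar</string></value>"
    "</member></struct>"
)

def loads_struct(fragment):
    """Hypothetical helper: wrap an XML-RPC value fragment in a
    methodResponse envelope and let xmlrpc.client decode it."""
    payload = (
        "<methodResponse><params><param><value>%s</value></param>"
        "</params></methodResponse>" % fragment
    )
    params, _method_name = xmlrpc.client.loads(payload)
    return params[0]  # the single decoded value

data = loads_struct(STRUCT_XML)
print(data)
```

The result is an ordinary dict, which is the "familiar return value" side of the comparison.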

2. Straight XML

<x:container xmlns:x="http://www.myelin.co.nz/ns/x">
 <x:foo>bar</x:foo>
</x:container>

This parses into an elementtree which is equivalent to {'foo': 'bar'}, and we still know that instructions for understanding this information are at http://www.myelin.co.nz/ns/x.
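To make the "equivalent to a dict" claim concrete, here is a minimal sketch using the ElementTree module from the standard library (xml.etree.ElementTree, the descendant of elementtree). The function name container_to_dict is invented for this example, and it assumes every child of the container is a simple text element in the x namespace.

```python
import xml.etree.ElementTree as ET

NS = "http://www.myelin.co.nz/ns/x"

DOC = """<x:container xmlns:x="http://www.myelin.co.nz/ns/x">
 <x:foo>bar</x:foo>
</x:container>"""

def container_to_dict(xml_text, ns=NS):
    """Hypothetical helper: map each child element in namespace `ns`
    to a {local-name: text} entry."""
    root = ET.fromstring(xml_text)
    prefix = "{%s}" % ns  # ElementTree spells qualified names as {uri}local
    result = {}
    for child in root:
        if child.tag.startswith(prefix):
            result[child.tag[len(prefix):]] = child.text
    return result

print(container_to_dict(DOC))
```

A few lines of tree walking give the same dict as the RSS-Data version, without a second serialisation layer in between.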

---

Georg Bauer suggests a sensible application for RSS-Data: using the RSS feed to push generic data (that the RSS feed generator doesn't understand) back to a periodically-connected client (like Radio, or Georg's PyDS) to process.

This sounds very sensible -- in this case, you can't use a well-specified XML encoding, because the RSS writer doesn't have a clue what it's writing.

When you are talking about well-specified data (say, recipes, reviews, events, or music), however, there's no need for a general encoding like this.
