This post originated from an RSS feed registered with Python Buzz
by maxim khesin.
Original Post: delicious-py slightly broken
Feed Title: python and the web
Feed URL: http://feeds.feedburner.com/PythonAndTheWeb
Feed Description: blog dedicated to python and the networks we live in
I stayed up last night implementing my next del.icio.us experiment, only to find that a part of delicious-py essential to this experiment, get_posts_by_url(), is broken. That's not too surprising: it is one of the DeliciousNOTAPI set of functions, which are basically HTML scrapes and subject to breakage by Joshua and Co. at any time.
Still, I was feeling kind of down, as one of my favorite toy's legs was falling off.
Well, it was late enough that I said what the heck, I can do it myself. I took a look at the original code, and it was using the (in)famous SGMLParser override technique. I hate it (and wish Mark would stop teaching it!) because it forces you to keep context state inside weird member variables. Just take a look at the HtmlToPosts class in the original delicious module. I guess it's a SAX vs. DOM thing...
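To make the complaint concrete, here is a minimal sketch of the callback style. It is not the delicious-py code: it uses the standard library's `html.parser.HTMLParser` as a stand-in for SGMLParser (sgmllib is gone from Python 3), and the markup, the `post` and `user` class names, and `PostLinkParser` are all made up for illustration. Note how many flags the parser needs just to remember where it is.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for this sketch; not the real del.icio.us page.
SAMPLE = """
<ul>
  <li class="post"><a href="http://example.com/a">First</a> by <span class="user">alice</span></li>
  <li class="post"><a href="http://example.com/b">Second</a> by <span class="user">bob</span></li>
</ul>
"""


class PostLinkParser(HTMLParser):
    """Builds a list of post dicts from start/data/end callbacks."""

    def __init__(self):
        super().__init__()
        self.posts = []
        # Context state: each callback sees a single tag or text chunk,
        # so "where am I?" has to be remembered in member variables.
        self.in_post = False
        self.in_link = False
        self.in_user = False
        self.current = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "li" and attrs.get("class") == "post":
            self.in_post = True
            self.current = {"href": None, "title": "", "user": ""}
        elif self.in_post and tag == "a":
            self.in_link = True
            self.current["href"] = attrs.get("href")
        elif self.in_post and tag == "span" and attrs.get("class") == "user":
            self.in_user = True

    def handle_data(self, data):
        # Text is only meaningful relative to the flags set earlier.
        if self.in_link:
            self.current["title"] += data
        elif self.in_user:
            self.current["user"] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False
        elif tag == "span":
            self.in_user = False
        elif tag == "li" and self.in_post:
            self.posts.append(self.current)
            self.in_post = False
            self.current = None


parser = PostLinkParser()
parser.feed(SAMPLE)
parser.close()
posts = parser.posts
```

Three booleans and a scratch dict for two fields per post; every extra field scraped this way tends to add another flag.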
I much prefer the search + hierarchical navigation approach of BeautifulSoup. In case anyone needs this, here is my version of get_post_by_url.
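For anyone who has not seen the style, here is a rough sketch of what search plus hierarchical navigation looks like. This is not my get_post_by_url and not the real del.icio.us markup: the HTML, the `post` and `user` class names, and `posts_from_html` are hypothetical, and the sketch assumes the current `bs4` package (`pip install beautifulsoup4`) rather than the older BeautifulSoup module.

```python
from bs4 import BeautifulSoup  # third-party package: beautifulsoup4

# Hypothetical markup, invented for this sketch; not the real del.icio.us page.
SAMPLE = """
<ul>
  <li class="post"><a href="http://example.com/a">First</a> by <span class="user">alice</span></li>
  <li class="post"><a href="http://example.com/b">Second</a> by <span class="user">bob</span></li>
</ul>
"""


def posts_from_html(html):
    """Return one dict per <li class="post"> found in the given HTML."""
    # "html.parser" is the stdlib-backed builder, so no extra parser is needed.
    soup = BeautifulSoup(html, "html.parser")
    posts = []
    # Search: find every post container anywhere in the tree.
    for item in soup.find_all("li", class_="post"):
        # Navigate: look only inside this one container.
        link = item.find("a")
        user = item.find("span", class_="user")
        posts.append({
            "href": link["href"] if link else None,
            "title": link.get_text(strip=True) if link else "",
            "user": user.get_text(strip=True) if user else "",
        })
    return posts


posts = posts_from_html(SAMPLE)
```

The context lives in the tree itself: `item` is the current post, so there is nothing to track in member variables.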
I am kind of starting to wish BeautifulSoup would make it into the basic Python distro. That's a lot of batteries (in a small package, to boot) that should be included.