This post originated from an RSS feed registered with Python Buzz
by Jarno Virtanen.
Original Post: HTTP decency in Python
Feed Title: Python owns us
Feed URL: http://sedoparking.com/search/registrar.php?domain=&registrar=sedopark
Feed Description: A weblog about Python from the view point of Jarno Virtanen.
The little Python-related stuff I've done recently has involved
fiddling around with Mark Pilgrim's Feed Parser, the decent
behavior of HTTP clients, and the accompanying test
cases. Yeah, it's not rocket science, and it's relatively painless,
since Feed Parser already does most of the work (using urllib2).
Read through Feed Parser's source code in case you need assistance
implementing decent HTTP behavior yourself.
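For a rough idea of what "decent" can look like, here is a minimal sketch. It is not Feed Parser's code, and the list of behaviors is my assumption about what usually counts as polite: identify yourself with a User-Agent, accept gzip, and send the ETag and Last-Modified values from your previous fetch so the server can answer 304 Not Modified instead of resending the whole feed. It uses Python 3's urllib.request, the successor of urllib2. The names USER_AGENT, build_polite_request and fetch are made up for illustration, and so is the bot URL.

```python
import gzip
import urllib.error
import urllib.request

# Hypothetical client name and contact URL; replace with your own.
USER_AGENT = "ExampleFeedFetcher/0.1 (+http://example.com/bot)"


def build_polite_request(url, etag=None, last_modified=None):
    """Build a request that identifies the client and allows a 304 reply."""
    headers = {
        "User-Agent": USER_AGENT,
        "Accept-Encoding": "gzip",
    }
    # Validators saved from the previous successful fetch, if there are any.
    if etag:
        headers["If-None-Match"] = etag
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    return urllib.request.Request(url, headers=headers)


def fetch(url, etag=None, last_modified=None):
    """Return (body, etag, last_modified); body is None if nothing changed."""
    request = build_polite_request(url, etag, last_modified)
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            body = response.read()
            if response.headers.get("Content-Encoding") == "gzip":
                body = gzip.decompress(body)
            return (
                body,
                response.headers.get("ETag"),
                response.headers.get("Last-Modified"),
            )
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 304; treat it as "not modified".
        if err.code == 304:
            return None, etag, last_modified
        raise
```

Store the returned ETag and Last-Modified values next to the feed and pass them back into fetch() on the next poll. A body of None means you can keep using your cached copy.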
The thing is, nowadays every other fiddler is implementing her own
RSS/web/whatever scraper, and 'em scrapers need to behave decently in
the world of HTTP. Remember that.