This post originated from an RSS feed registered with Python Buzz
by Ng Pheng Siong.
Original Post: From Snownews To BottomFeeder
Feed Title: (render-blog Ng Pheng Siong)
Feed URL: http://sandbox.rulemaker.net/ngps/rdf10_xml
Feed Description: Just another this here thing blog.
I got the itch to try BottomFeeder out seriously. I've been using
Snownews; "wc -l"
reports that my Snownews subscription list is 229-strong. Certainly I'm not about to
resubscribe to each of these by hand in BottomFeeder.
BottomFeeder supports importing and exporting feeds. Supposedly it imports
from the following file types: BTF, RSS, XML, ini and text. I have no idea
what these formats are, so I took the simplest approach: subscribe to a
single feed, export it and examine the exported file visually... Ah, the
export file is in something called OPML (Outline Processor Markup Language).
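For anyone who would rather not eyeball the export, here is a minimal Python sketch that lists the attributes of every outline element in an OPML file. The function name outline_attributes and the SAMPLE document are made up for illustration; SAMPLE only follows the general OPML shape and is not BottomFeeder's actual export.

```python
import xml.etree.ElementTree as ET


def outline_attributes(opml_text):
    """Return the attribute dict of every <outline> element, in document order."""
    root = ET.fromstring(opml_text)
    return [dict(node.attrib) for node in root.iter("outline")]


# Hand-written sample in the general OPML shape (an assumption, not a real export).
SAMPLE = """<opml version="1.1">
<head><title>sample</title></head>
<body>
<outline text="Example feed" htmlUrl="http://example.com/" xmlUrl="http://example.com/rss.xml"/>
</body>
</opml>"""

# Print the feed URL of each subscription found in the sample.
for attrs in outline_attributes(SAMPLE):
    print(attrs["xmlUrl"])
```

Running the same function over a real export should show which attributes the importer is likely to care about.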
Next, I googled "snownews opml" and found two existing converters.
Both are Perl programs. The first uses HTML::Entities and the second
XML::LibXML and Data::Dumper. It's been a long time since I last installed
anything via CPAN and I didn't feel like trying again right then.
Went back to inspecting the exported OPML file... the XML looked simple
enough... there, cooked up a chunk of Python code to generate each outline
item:
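import sys
from xml.sax.saxutils import escape

# One OPML <outline> element per feed; only xmlUrl is filled in here.
TEMPLATE = '<outline description="..." htmlUrl="" language="en" text="" xmlUrl="%s"/>'

# Read one feed URL per line from standard input.
for line in sys.stdin:
    url = line.strip()  # drop the trailing newline so it stays out of the attribute
    if url:
        # Escape &, <, > and " so the URL is a valid XML attribute value.
        print(TEMPLATE % escape(url, {'"': "&quot;"}))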
There were two more steps before the generated output was usable by
BottomFeeder:
» Snownews' 'urls' file stores one feed per line; each line contains the
feed's URL and a bunch of meta-information, separated by "|". I cooked up a
quickie awk script to print just the URL; these URLs are the input to the
above Python code.
» After running the Python code, I filled in the OPML template information by hand.
(Incorporating these two steps into the Python code is left as an exercise for the reader. ;-)
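For the impatient, here is one way the exercise might go: a rough sketch that does the field splitting and the OPML wrapping in a single pass. Two things in it are assumptions rather than facts from the steps above: that the URL is the first "|"-separated field on each Snownews line, and that a bare opml/head/body wrapper is enough for the importer. The names feed_url and to_opml and the title text are made up for illustration.

```python
import sys
from xml.sax.saxutils import quoteattr

# Assumed minimal OPML wrapper; the real export may carry more header fields.
HEADER = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<opml version="1.1">\n'
    "<head><title>Snownews subscriptions</title></head>\n"
    "<body>"
)
FOOTER = "</body>\n</opml>"


def feed_url(line):
    """Return the URL field of one Snownews 'urls' line, or '' for a blank line."""
    # Assumption: the URL is the first "|"-separated field.
    return line.strip().split("|", 1)[0].strip()


def to_opml(lines):
    """Build a complete OPML document from an iterable of 'urls' lines."""
    out = [HEADER]
    for line in lines:
        url = feed_url(line)
        if url:
            # quoteattr escapes &, < and > and adds the surrounding quotes.
            out.append(
                '<outline description="..." htmlUrl="" language="en" text="" xmlUrl=%s/>'
                % quoteattr(url)
            )
    out.append(FOOTER)
    return "\n".join(out)


# Usage (hypothetical script name): python snow2opml.py urls > feeds.opml
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        sys.stdout.write(to_opml(handle) + "\n")
```

The text and htmlUrl attributes are still left empty, as in the hand-made version.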
And that was it. BottomFeeder imported the feed successfully and is
presently crunching thru the feeds doing whatever it is supposed to be
doing...