This post documents one way to generate an RSS feed: by iteratively building up an XML document, using the basic XML framework in VisualWorks. That's where I started 3 years ago, when I first created a feed for this blog. I have mostly left that code alone since then; it works, and I felt no compelling reason to revisit it.
Later on, I wanted to create local (i.e., file URLs) RSS feeds scraped from websites that don't support syndication. This time around, I looked into using a SAX driver. The basic class I created is RSS_SAXWriter:
Smalltalk.RSSSax defineClass: #RSS_SAXWriter
    superclass: #{XML.SAXWriter}
    indexedType: #none
    private: false
    instanceVariableNames: ''
    classInstanceVariableNames: ''
    imports: ''
    category: 'StoreRSS'
I created specific subclasses for the various RSS versions I wanted to support (although in practice I really only use RSS 2.0). The easiest way to see how I use this is to look at a script I wrote to scrape a cartoon I read.
| contentBlock out writer rest str content |
"Build one item's link, title, description and date from the scraped image path."
contentBlock := [:builder :chunk |
    | lnk |
    lnk := 'http://www.comics.com', chunk.
    builder link: lnk.
    builder title: 'Monty: ', Core.Date today printString.
    builder description: '<img src="', lnk, '">'.
    builder pubDate: Core.Timestamp now].
out := 'monty.xml' asFilename writeStream.
[writer := RSS20_SAXWriter new output: out.
    writer prolog.
    writer startRSS.
    writer startChannel.
    writer title: 'Monty'.
    writer link: 'http://www.comics.com/comics/monty/index.html'.
    writer description: 'Monty'.
    writer pubDate: Core.Timestamp now.
    writer startItem.
    writer title: 'Monty: ', Core.Date today printString.
    content := 'http://www.comics.com/comics/monty/index.html' asURI valueStream contents.
    str := content readStream.
    "Skip the first matching IMG tag, then read the SRC value of the second one."
    str throughAll: '<IMG SRC="/comics/monty'.
    str upToAll: '<IMG SRC="/comics/monty'.
    str throughAll: 'IMG SRC="'.
    rest := str upToAll: '"'.
    contentBlock value: writer value: rest.
    writer endItem.
    writer endChannel.
    writer endRSS]
        ensure: [out close].
Fairly simple to follow - and a whole lot easier than building up an XML doc from scratch. Ignoring the scraping bits, you just tell the writer what the values are for various tags, making sure to start/end various sections (like 'channel'). To get an idea what's behind some parts of that, let's look at a couple of the SAX driver methods. Here's the #startRSS method:
startRSS
    self
        startElement: 'rss'
        attributes: (Array with: (XML.Attribute name: 'version' value: self version)).
    self cr.
That is just a convenience method wrapping the general #startElement:attributes: method (there's a simpler #startElement: for cases where you don't have any attributes to worry about):
startElement: localName attributes: someAttributes
    self
        startElement: ''
        localName: ''
        qName: localName
        attributes: someAttributes.
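To make the pattern concrete, here is a rough sketch of how a text-valued tag such as #title: might be layered on top of those methods. It is a guess at the shape, not the code from the package: #startElement: is the attribute-free variant mentioned above, #characters: is assumed to be the usual SAX text callback, and the one-argument #endElement: is a hypothetical helper that mirrors #startElement:attributes:.

```smalltalk
title: aString
    "Sketch only (assumed, not the packaged code):
     write <title>aString</title> followed by a line break."
    self startElement: 'title'.
    self characters: aString.
    self endElement: 'title'.
    self cr

endElement: localName
    "Hypothetical helper: forward to the general SAX end-element event,
     passing empty namespace and local name as #startElement:attributes: does."
    self
        endElement: ''
        localName: ''
        qName: localName
```

With helpers along these lines, each tag in the script above comes down to a single message send to the writer.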
The rest looks a lot like that. To see the implementation, load package RSSScriptRunner from the Public Store; it has all the code for this. The upshot is that you can use a SAX driver to easily create an XML document, and creating a new SAX driver isn't hard: you simply start with the framework class SAXWriter and customize.
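As a sketch of how small that customization can be, a version-specific subclass may need little more than the version string that #startRSS writes into the rss element. The definition below copies the shape of the RSS_SAXWriter definition shown earlier. The superclass binding, the category, and the body of #version are assumptions made for illustration, not the packaged code.

```smalltalk
"Assumed class definition, modelled on the RSS_SAXWriter one."
Smalltalk.RSSSax defineClass: #RSS20_SAXWriter
    superclass: #{Smalltalk.RSSSax.RSS_SAXWriter}
    indexedType: #none
    private: false
    instanceVariableNames: ''
    classInstanceVariableNames: ''
    imports: ''
    category: 'StoreRSS'

"Assumed instance method on RSS20_SAXWriter."
version
    "Answer the string used for the version attribute of the rss element."
    ^'2.0'
```

A writer for another RSS version would then differ mainly in what #version answers, plus whatever tags that version adds or drops.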