This post originated from an RSS feed registered with Java Buzz
by dion.
Original Post: Hpricot is great
Feed Title: techno.blog(Dion)
Feed URL: http://feeds.feedburner.com/dion
Feed Description: blogging about life the universe and everything tech
I used to cringe at having to work with XML. These days there are nice ways to work with it... from E4X to Groovy builders, and of course with Hpricot.
I wanted to take my OPML file and grep out the site URLs so I could create a custom search engine that searches only my buddies' blogs (the ones listed in the OPML file).
It is basically a one-liner with Hpricot:
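require 'rubygems'
require 'hpricot'
# Use the file named on the command line, or fall back to a default
filename = ARGV.first || 'mysubscriptions.opml'
doc = open(filename) { |f| Hpricot(f) }
# Print the site URL of every outline element that has one
(doc/"outline[@htmlurl]").each do |outline|
  puts outline.attributes['htmlurl']
end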
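If Hpricot isn't installed, roughly the same extraction can be sketched with REXML, the XML library that ships with Ruby. This is a swapped-in alternative, not the approach above: the `html_urls` helper and the inline `SAMPLE_OPML` document are made up for illustration. Note that REXML is a strict, case-sensitive XML parser, so the attribute has to be spelled `htmlUrl` as OPML writes it; the lowercase `htmlurl` in the Hpricot selector presumably works because of Hpricot's HTML-style parsing.

```ruby
require 'rexml/document'

# Hypothetical sample standing in for mysubscriptions.opml
SAMPLE_OPML = <<~XML
  <opml version="1.1">
    <head><title>Sample subscriptions</title></head>
    <body>
      <outline text="Example Blog" xmlUrl="http://example.com/feed" htmlUrl="http://example.com/"/>
      <outline text="Folder without a site link"/>
      <outline text="Another Blog" xmlUrl="http://example.org/rss" htmlUrl="http://example.org/"/>
    </body>
  </opml>
XML

# Collect the htmlUrl attribute of every outline element that has one.
# (html_urls is an illustrative helper name, not part of any library.)
def html_urls(opml_source)
  doc = REXML::Document.new(opml_source)
  urls = []
  doc.elements.each('//outline[@htmlUrl]') do |outline|
    urls << outline.attributes['htmlUrl']
  end
  urls
end

puts html_urls(SAMPLE_OPML)
```

To run it against a real file, pass `File.read(filename)` to `html_urls` instead of the sample string.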
In my case the OPML file is just sitting on disk there, but I could easily have it grab the file from a URL: