This post originated from an RSS feed registered with Java Buzz
by Michael Cote.
Original Post: Namespace URI Lookup in Ruby
Feed Title: Cote's Weblog: Coding, Austin, etc.
Feed URL: https://cote.io/feed/
Feed Description: Using Java to get to the ideal state.
I've been slowly learning Ruby over the past month or so. Tonight I was playing around with using REXML to parse RSS feeds, and I started getting concerned about getting the correct prefix for namespaced elements -- like the Dublin Core crap.
As far as I know, you can't guarantee that the prefixes used will be the same. So, you can't reliably get Dublin Core dates with something like element.each("//dc:date"). The dc part might change from document to document.
So, I was looking for a way in XPath or Ruby (REXML, I guess) to figure out the correct XPath to use. I didn't find much (this seemed to indicate how you'd use XPath, but it didn't work for me), so I tried my dumb hand at writing some actual Ruby code to create a hash of namespace URIs to their prefixes:
require 'rexml/document'

# File.read closes the file once it has been read
xml = REXML::Document.new(File.read("bushwald.rss"))
root = xml.root

# the default namespace (plain xmlns="...") gets an empty prefix
ns_hash = { root.namespace('') => '' }

# Element#namespaces returns a prefix => URI Hash in current REXML,
# so look each declared prefix up directly instead of pairing
# two lists by index
root.prefixes.each do |prefix|
  ns_hash[root.namespace(prefix)] = prefix
end

# print it out
ns_hash.each do |uri, prefix|
  puts "#{uri} = #{prefix}"
end
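A minimal sketch of how a URI-to-prefix lookup like this could be put to use. The sample feed, its "dublin" prefix, and the prefix_for helper are all made up for illustration, and this is just one way to do it. It shows two options: building the XPath from whatever prefix the document declares, or passing REXML::XPath a prefix-to-URI hash so the query can use a prefix of your own choosing.

```ruby
require 'rexml/document'

# Hypothetical feed that binds Dublin Core to "dublin" rather than "dc"
SAMPLE_RSS = <<~XML
  <rss version="2.0" xmlns:dublin="http://purl.org/dc/elements/1.1/">
    <channel>
      <item>
        <title>First post</title>
        <dublin:date>2005-03-01T12:00:00Z</dublin:date>
      </item>
    </channel>
  </rss>
XML

DC_URI = 'http://purl.org/dc/elements/1.1/'

# Hypothetical helper: find the prefix the root element declares for a URI
def prefix_for(root, uri)
  root.prefixes.find { |prefix| root.namespace(prefix) == uri }
end

doc = REXML::Document.new(SAMPLE_RSS)
prefix = prefix_for(doc.root, DC_URI)

# Option 1: build the XPath from the prefix this document happens to use
dates_by_prefix = REXML::XPath.match(doc, "//#{prefix}:date").map(&:text)

# Option 2: bind our own "dc" prefix to the URI for this query only
dates_by_uri = REXML::XPath.match(doc, '//dc:date', { 'dc' => DC_URI }).map(&:text)

puts "document prefix: #{prefix}"
puts dates_by_prefix.inspect
puts dates_by_uri.inspect
```

Option 2 skips the lookup entirely, since the query is tied to the namespace URI rather than to whatever prefix the feed author picked. Option 1 only sees prefixes declared on the root element, which is the same limitation the script above has.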