I'm having a love/hate relationship with libxml2 right now.
I'm currently facing the job of repeatedly parsing multiple XML documents that are around 9-12 MB in size. Don't ask who's creating them - it only makes me angry.
Anyway - PyXML seems to choke on files this size. Memory usage goes through the roof (I killed Python when it hit 400+MB) and it's just too dang slow.
Enter libxml2 from the GNOME folks.
Parsing a 10 MB XML file on the PowerBook takes about 4 seconds. XPath queries are still a little wonky in Python - doc.xpathEval("//FOO") doesn't seem to work, but doc.xpathEval("//*[name()='FOO']") does.
Weird.
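One possible explanation, and it is only a guess since nothing here confirms it for these files: if the documents declare a default namespace (xmlns="..."), then in XPath 1.0 a bare //FOO only matches FOO elements in no namespace, while a name() comparison still matches because the elements carry no prefix. The sketch below reproduces that behaviour with the standard library's ElementTree rather than libxml2, so it runs anywhere. The sample document and the urn:example:feed namespace are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the default namespace is an assumption,
# not something known about the real files.
SAMPLE = """<root xmlns="urn:example:feed">
  <FOO>one</FOO>
  <BAR><FOO>two</FOO></BAR>
</root>"""

root = ET.fromstring(SAMPLE)

# Bare name: looks for FOO in *no* namespace, so nothing matches.
plain = root.findall(".//FOO")

# Namespace-qualified name: matches both elements.
qualified = root.findall(".//{urn:example:feed}FOO")

# Compare only the local part of each tag, which is roughly what the
# name() predicate ends up doing when elements carry no prefix.
by_name = [el for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "FOO"]

print(len(plain), len(qualified), len(by_name))  # prints: 0 2 2
```

If a default namespace does turn out to be the cause, the libxml2 bindings should allow registering a prefix on an XPath context (doc.xpathNewContext(), then xpathRegisterNs("f", uri)) and querying //f:FOO instead of falling back to name().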
Anyway - each bit of the libxml2 Python bindings I figure out is followed by both elation (ooh - so fast...) and irritation (why isn't this documented better?).
I'm going to have to see if there's a way to generate comprehensive API docs for this thing.
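A rough starting point, as a sketch rather than a finished tool: Python's introspection can at least list what a module exposes, along with whatever docstrings exist. The helper names dump_api and first_line are made up here, and the example runs against the standard json module as a stand-in, since how much the libxml2 bindings expose through docstrings is not something this post establishes.

```python
import importlib
import inspect


def first_line(obj):
    """Return the first line of an object's docstring, or a placeholder."""
    doc = inspect.getdoc(obj)
    return doc.splitlines()[0] if doc else "(no docstring)"


def dump_api(module_name):
    """Return a plain-text listing of a module's public classes and callables."""
    module = importlib.import_module(module_name)
    lines = []
    for name, obj in inspect.getmembers(module):
        if name.startswith("_"):
            continue  # skip private and dunder names
        if inspect.isclass(obj):
            lines.append("class " + name)
            # List the public methods of each class, indented under it.
            for meth_name, meth in inspect.getmembers(obj, callable):
                if not meth_name.startswith("_"):
                    lines.append("    " + meth_name + ": " + first_line(meth))
        elif callable(obj):
            lines.append(name + ": " + first_line(obj))
    return "\n".join(lines)


if __name__ == "__main__":
    # "json" is only a stand-in; swap in "libxml2" where it is installed.
    print(dump_api("json"))
```

For browsable output, python -m pydoc -w followed by a module name writes an HTML page for that module, which may be enough if the bindings carry useful docstrings.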