This post originated from an RSS feed registered with Java Buzz
by Paul Brown.
Original Post: Second Impressions of StAX
Feed Title: mult.ifario.us
Feed URL: http://feeds.feedburner.com/MultifariousCategoryJava
Feed Description: Software. Business. Java. XML. Web Services.
I blogged on StAX back on 2003-09-29 when the specification was up for public review, and as a follow-up Tetsu Miyamoto and I did some basic experimentation. (The overall results and a bunch more example code will appear in JavaWorld later in the year.)
Not Quite There Yet
All of the currently available StAX parsers are at an alpha stage, and that means that using them in production is not recommended.
The reference implementation (included with the specification download) works out of the box.
After a fix for a missing class (thanks to Chris Fry at BEA for the quick turnaround), the BEA preview implementation works as well.
I was not able to get the Oracle implementation to work, as some of the classes appear not to implement the required interfaces. [After talking with some people at Oracle, it turns out that their preview release is built against an earlier version of the specification, thus the java.lang.NoSuchMethodErrors when running with code built for the later version.]
Both working implementations have issues with various features and functionality, but nothing that would prevent a potential user from digging in and experimenting.
Basic Performance Test
By imitating the SAX test harness, Dennis Sosnoski's XMLBench package can be used to benchmark the StAX processors. The fundamental rule of benchmarks applies: Your mileage will vary.
In summary, the StAX stream API is roughly as fast as the SAX API. Measured against the minimum average parse time per document, StAX averaged 134% and SAX averaged 155%. This makes sense, as the stream API is essentially a set of hooks into the underlying character stream.
For the two implementations tested, the StAX event API required between 2x and 3x as much time as SAX to traverse a document in the reference implementation and between 4x and 9x as much time for the BEA preview. Both implementations are early, and both likely suffer from unnecessary object creation overhead. (For example, the javax.xml.namespace.QName objects passed out of the parser are different even for the same QName.) I expect that this proportion will decrease as the implementations mature.
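To make the comparison between the two APIs a little more concrete, here is a minimal sketch that counts start elements once with the stream API and once with the event API. It is only an illustration of the two traversal styles, not the XMLBench harness used for the numbers above; the class name, method names, and sample document are made up for this example, and checked exceptions are wrapped just to keep the sketch short.

```java
import java.io.StringReader;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;
import javax.xml.stream.events.XMLEvent;

// Hypothetical illustration class; not part of any benchmark package.
public class StaxTraversalSketch {

    // Stream API: the reader is a cursor and next() returns an int event code,
    // so no per-event object has to be handed to the caller.
    static int countWithStreamApi(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    // Event API: each unit of markup is returned as an XMLEvent object.
    static int countWithEventApi(String xml) {
        try {
            XMLEventReader reader = XMLInputFactory.newInstance()
                    .createXMLEventReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (event.isStartElement()) {
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<a><b/><c>text</c></a>";
        System.out.println(countWithStreamApi(xml)); // 3
        System.out.println(countWithEventApi(xml));  // 3
    }
}
```

Both loops visit the same markup; the difference is that the event loop receives an object per event, which is where the extra allocation in an immature implementation would show up.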
Programming and Ease-of-Use
The StAX stream and event APIs require somewhat different programming styles, and the event API could even be thought of (and will probably be implemented) as a wrapper on the stream API. The stream API is a thin wrapper on the input stream that stops after each unit of markup is parsed, and the event API is equivalent except that it encapsulates that unit of markup in an object.
Overall programming convenience is good. For example, getting the text content of an element requires only a single method call on an XMLStreamReader.
Here's a simple example that uses the event API with two streams to add the body into a SOAP envelope. The first step is all setup:
XMLOutputFactory xof = XMLOutputFactory.newInstance();
XMLEventWriter xsw = xof.createXMLEventWriter(new PrintWriter(System.out));
XMLInputFactory xif = XMLInputFactory.newInstance();
FileInputStream envStream = new FileInputStream("xml_files/soap_envelope.xml");
FileInputStream bodyStream = new FileInputStream("xml_files/soap_body.xml");
XMLEventReader env = xif.createXMLEventReader(envStream);
XMLEventReader bod = xif.createXMLEventReader(bodyStream);
The next step is to iterate through the stream that contains the events for the envelope and then merge in the second stream when the SOAP:Body element is encountered:
QName bodyQN = new QName("http://schemas.xmlsoap.org/soap/envelope/", "Body");
while (env.hasNext()) {
    XMLEvent xe = env.nextEvent();
    xsw.add(xe);
    if (xe.getEventType() == XMLEvent.START_ELEMENT &&
        ((StartElement)xe).getName().equals(bodyQN))
    {
        XMLEvent ev2 = null;
        // scan past the prologue
        do {
            ev2 = bod.nextEvent();
        } while (ev2.getEventType() != XMLEvent.START_ELEMENT);
        // copy the body document, stopping before END_DOCUMENT
        while (ev2.getEventType() != XMLEvent.END_DOCUMENT) {
            xsw.add(ev2);
            ev2 = bod.nextEvent();
        }
        bod.close();
        bodyStream.close();
    }
}
env.close();
envStream.close();
xsw.flush();
xsw.close();
This is doable but much less convenient in SAX.
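As a footnote to the earlier remark that getting the text content of an element takes a single method call on an XMLStreamReader, here is a minimal sketch of that call, getElementText(), in use. The class name, the firstText helper, and the sample document are assumptions made up for this illustration; the helper also assumes the target element contains only text, since getElementText() throws if it runs into a child element.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class for the single-call text lookup.
public class ElementTextSketch {

    // Returns the text of the first element with the given local name,
    // or null if no such element exists.
    static String firstText(String xml, String localName) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && reader.getLocalName().equals(localName)) {
                        // The single call: reads up to the matching end tag
                        // and returns the text in between.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped only to keep the sketch free of checked exceptions.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<order><id>42</id><note>rush</note></order>";
        System.out.println(firstText(xml, "note")); // rush
    }
}
```

The SAX equivalent typically means buffering characters() callbacks between startElement() and endElement(), which is the kind of bookkeeping the pull style avoids.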