The XML Instance Gamut

Wilfred Springer

Wilfred Springer is a Software Architect at Xebia
Posted: Oct 19, 2009 12:44 AM

This post originated from an RSS feed registered with Java Buzz by Wilfred Springer.
Original Post: The XML Instance Gamut
Feed Title: Xebia Blog » Wilfred Springer
Feed URL: http://blog.xebia.com/author/wspringer/feed/?category=java
Feed Description: Crazy ideas. Read this blog at your own risk.

If you happen to be in the business of writing software serving XML documents or consuming XML documents - and if you read this post, then there is a fair chance you are - then there is always one big challenge: how do you make sure your service or client is capable of dealing with all of the XML documents you could possibly expect to be passed around?

And if you happen to come from the test-driven world, the answer is obviously: by testing it. However, if you try to do that, things might be harder than you expect at first.

What about schemas?

I clearly remember having to integrate with Google's Local Search Service. We managed to get them to send us their schema, but the schema was merely illustrative, rather than normative. In fact, it didn't even 'parse' correctly. It was supposed to be a DTD, but in reality, it wasn't. In that case, you are basically lost. The only thing that you can really do is 'test by poking around': see what the web service actually replies, and then work those replies into your test harness.

If you do manage to get a schema, however, you are still not done. Sure, if it's about SOAP-based web services, then you might be able to generate stubs and skeletons, and those would give you some guarantee that you are covering most cases. But there is still a chance that you would not cover all cases, since - inside your XML document - there might be alternatives for content models, and you might - when you implement your service - only be dealing with one of them.

If the schema is small, then you can probably figure it out by careful examination. However, if the schema is huge, then the range and variety of XML document instances that you might get will make that impossible. And even if you created the schema yourself, it might sometimes cover a wider range of options than you expected. (I'm sure I am not the only one who has experienced this. ;-) )

XML Instance Generator to the rescue

So, back to test-driven. The good news is that there are tools that take a schema and generate random instances, basically walking all of the different options. Xmlgen is one of those tools. It's a little bit hard to find these days. If you follow the 'XML Instance Generator' link on Kohsuke's homepage, you will end up in no man's land. I dug a little further, and found out it's currently hosted at Sun's dev.java.net.

Xmlgen is extremely simple. It takes a schema (any schema language), and will generate any number of sample documents from it. It's exactly what you want, except… it doesn't support all datatypes defined by the XML Schema Datatypes specification. And that's something I have run into more than once.

In fact, I tried to use xmlgen on a couple of occasions, and each time it broke on missing support for xs:dateTime or xs:pattern restrictions. And there doesn't seem to be an awful lot of work going into xmlgen to fix that.

Fixing XML Instance Generator

So I figured I'd fix this myself. It turned out adding support for dateTime wasn't all that hard, even though xmlgen does not really have extension points to implement. You're basically left with a) hacking the source code big time, or b) hacking it just a little, in order to add plug points and then have something else implement those plug points - which is what I did.
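
To make the plug point idea concrete, here is a minimal sketch of what such an extension point could look like. None of these names come from xmlgen; `ValueGenerator` and `DatatypeGenerators` are hypothetical, and the real hooks in the modified version may look quite different. The sketch registers a generator per datatype name and uses it to produce a random xs:dateTime value in UTC.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical plug point: this interface is NOT part of xmlgen.
interface ValueGenerator {
    String generate(Random random);
}

// Hypothetical registry mapping a datatype name to its generator.
class DatatypeGenerators {
    // Lexical form of xs:dateTime in UTC, e.g. 2009-10-19T00:44:00Z
    private static final DateTimeFormatter XS_DATE_TIME =
            DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss'Z'");

    private final Map<String, ValueGenerator> byTypeName = new HashMap<>();

    DatatypeGenerators() {
        // Picks a random second between 1970-01-01 and early 2038.
        register("dateTime", random -> {
            long epochSecond = random.nextInt(Integer.MAX_VALUE);
            return LocalDateTime.ofEpochSecond(epochSecond, 0, ZoneOffset.UTC)
                    .format(XS_DATE_TIME);
        });
    }

    void register(String typeName, ValueGenerator generator) {
        byTypeName.put(typeName, generator);
    }

    String generate(String typeName, Random random) {
        ValueGenerator generator = byTypeName.get(typeName);
        if (generator == null) {
            throw new IllegalArgumentException("No generator registered for " + typeName);
        }
        return generator.generate(random);
    }
}
```

With something like this in place, supporting another datatype is a matter of calling `register` with a new generator rather than touching the core of the tool.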

Whoops, xs:pattern

Adding support for xs:pattern turned out to be a little tricky. If you are new to this type of restriction, then you should know that it is about restricting content to fit a certain regular expression, as illustrated below.

<simpleType name='better-us-zipcode'>
  <restriction base='string'>
    <pattern value='[0-9]{5}(-[0-9]{4})?'/>
  </restriction>
</simpleType>

Now, if you want to generate valid data for this restriction, then you need to be able to generate text from that regular expression. It turns out there are quite a few Java libraries out there capable of matching text, but there is nothing at all for generating text. So I implemented my own. I blogged about it here, and it is hosted here.
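
To give an idea of what "generating text from a regular expression" involves, here is a small sketch. It is not the library mentioned above; `PatternSampler` is a hypothetical name, and it only understands a small subset of the syntax: literals, escaped punctuation, character classes with ranges, groups, `?` and `{n}` / `{n,m}`. Alternation, `*`, `+`, `.` and negated classes are deliberately rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: samples random strings matching a small regex subset.
class PatternSampler {
    interface Node { void emit(Random random, StringBuilder out); }

    private final String pattern;
    private int pos;

    private PatternSampler(String pattern) { this.pattern = pattern; }

    static String sample(String pattern, Random random) {
        PatternSampler parser = new PatternSampler(pattern);
        Node root = parser.parseSequence();
        if (parser.pos != pattern.length()) {
            throw new IllegalArgumentException("Unexpected ')' at " + parser.pos);
        }
        StringBuilder out = new StringBuilder();
        root.emit(random, out);
        return out.toString();
    }

    // A sequence is a run of (optionally quantified) atoms, up to ')' or the end.
    private Node parseSequence() {
        List<Node> parts = new ArrayList<>();
        while (pos < pattern.length() && pattern.charAt(pos) != ')') {
            parts.add(parseQuantified(parseAtom()));
        }
        return (random, out) -> { for (Node part : parts) part.emit(random, out); };
    }

    private Node parseAtom() {
        char c = pattern.charAt(pos++);
        if (c == '(') {
            Node group = parseSequence();
            if (pos >= pattern.length() || pattern.charAt(pos) != ')') {
                throw new IllegalArgumentException("Missing ')' at " + pos);
            }
            pos++;
            return group;
        }
        if (c == '[') {
            String chars = parseClass();
            return (random, out) -> out.append(chars.charAt(random.nextInt(chars.length())));
        }
        if (c == '\\') {
            c = pattern.charAt(pos++);
            if (Character.isLetterOrDigit(c)) {   // \d, \w, ... are out of scope here
                throw new UnsupportedOperationException("Escape \\" + c);
            }
        } else if ("|*+.^$".indexOf(c) >= 0) {
            throw new UnsupportedOperationException("Operator " + c);
        }
        final char literal = c;
        return (random, out) -> out.append(literal);
    }

    // Expands a class such as [0-9a-f] into the string of characters it allows.
    private String parseClass() {
        if (pattern.charAt(pos) == '^') throw new UnsupportedOperationException("Negated class");
        StringBuilder chars = new StringBuilder();
        while (pattern.charAt(pos) != ']') {
            char first = pattern.charAt(pos++);
            if (pattern.charAt(pos) == '-' && pattern.charAt(pos + 1) != ']') {
                char last = pattern.charAt(pos + 1);
                pos += 2;
                for (char ch = first; ch <= last; ch++) chars.append(ch);
            } else {
                chars.append(first);
            }
        }
        pos++; // skip ']'
        return chars.toString();
    }

    // Handles '?', '{n}' and '{n,m}'; an open-ended '{n,}' is treated as exactly n.
    private Node parseQuantified(Node atom) {
        if (pos >= pattern.length()) return atom;
        int min, max;
        if (pattern.charAt(pos) == '?') {
            pos++;
            min = 0;
            max = 1;
        } else if (pattern.charAt(pos) == '{') {
            int close = pattern.indexOf('}', pos);
            String[] bounds = pattern.substring(pos + 1, close).split(",");
            min = Integer.parseInt(bounds[0]);
            max = Integer.parseInt(bounds[bounds.length - 1]);
            pos = close + 1;
        } else {
            return atom;
        }
        final int low = min, high = max;
        return (random, out) -> {
            int count = low + random.nextInt(high - low + 1);
            for (int i = 0; i < count; i++) atom.emit(random, out);
        };
    }
}
```

Calling `PatternSampler.sample("[0-9]{5}(-[0-9]{4})?", new Random())` returns a five-digit zip code, with or without the four-digit extension. A real implementation has to cover far more of the regular expression language than this.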

Once that was done, extending xmlgen to have support for xs:pattern restrictions was easy. That means that - with just a few changes - I am now able to generate a test set for a fairly complicated schema. And I'm pretty sure that it will cover all cases, as long as I make the number of instance documents big enough.
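
If you want more than "pretty sure", you can feed generated instances back through a validating parser. The sketch below uses the standard javax.xml.validation API with the zip code type shown earlier. The `zip` element declaration and the `InstanceCheck` class are assumptions added for illustration; they are not part of xmlgen or of any schema discussed here.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical helper: validates small XML documents against an inline schema.
class InstanceCheck {
    // The 'zip' element is an assumption; only the simpleType comes from the post.
    static final String SCHEMA =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='zip' type='better-us-zipcode'/>"
        + "  <xs:simpleType name='better-us-zipcode'>"
        + "    <xs:restriction base='xs:string'>"
        + "      <xs:pattern value='[0-9]{5}(-[0-9]{4})?'/>"
        + "    </xs:restriction>"
        + "  </xs:simpleType>"
        + "</xs:schema>";

    static Schema loadSchema() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(SCHEMA)));
        } catch (SAXException e) {
            throw new IllegalStateException("Schema itself is broken", e);
        }
    }

    // Returns true when the document is valid according to SCHEMA.
    static boolean isValid(String xml) {
        try {
            // Without a custom ErrorHandler, validation errors surface as SAXException.
            loadSchema().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<zip>12345-6789</zip>"));
        System.out.println(isValid("<zip>1234</zip>"));
    }
}
```

Running every generated document through a check like this also catches the opposite problem: a generator that produces something the schema does not actually allow.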

So, now for a restriction like this:

<xsd:simpleType name = "TimeValue">
  <xsd:restriction base = "xsd:string">
    <xsd:pattern value = "[0-2][0-9]\:[0-5][0-9](\:[0-5][0-9])?"/>
  </xsd:restriction>
</xsd:simpleType>

… it will generate instances like this:

  • 07:36
  • 10:16:26
  • etc.

You can download the modified version of xmlgen here.


