Summary
Should interfaces between network-bound software components be designed with human readability in mind? That's a question eBay architect Dan Pritchett ponders in a recent blog post, inspired by frustrations with WSDL.
Interface definition languages, or IDLs, are designed to facilitate communication between software components across the network. IDL code is frequently generated from a full programming language by automated tools, and the generated code is often hard for humans to read.
Some developers and tools vendors argue that hard-to-read IDL code is fine, since IDLs were never meant for consumption by developers in the first place. For example, most Java IDEs support the generation of Java code from a WSDL-defined Web service as well as the generation of WSDL from a Java interface or class.
If tools can understand WSDL, is human readability of IDL code, such as a WSDL file, still important?
In a recent blog post, "WSDL - Why Services Don't Launch," Dan Pritchett argues that it is. Drawing on the frustrations of a WSDL project, Pritchett notes that even tools have a hard time deciphering and correctly implementing WSDL constructs:
We have four entities, CRUD on 3 of the entities and two operations on the fourth. Conceptually the interface can be described in five minutes. It can be expressed rigorously in any OOP language in about an hour...
Enter WSDL. The conceptual interface was translated into WSDL which took more than a day. Once the WSDL was finally validating, we were able to generate code in Axis. Then we moved on to C# and GSOAP. Neither of them would work without further modifications to the WSDL. Another day lost on compatibility. Once the servers were deployed, we ran into issues where GSOAP generated code that compiled but didn't work. There were name space challenges. What took the engineers an hour to express in Java was taking days to express in WSDL. And I want to reiterate that this is a relatively simple interface.
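To give a sense of scale, here is a minimal Java sketch of an interface of roughly the shape Pritchett describes: CRUD on three entities and two operations on a fourth. The post does not name the entities or operations, so `Customer`, `Product`, `Order`, `ReportService`, and every method name below are hypothetical, and the in-memory implementation exists only to show that the contract is usable.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class ServiceSketch {

    // Generic CRUD contract shared by three of the four entities.
    interface CrudService<T> {
        long create(T entity);
        Optional<T> read(long id);
        boolean update(long id, T entity);
        boolean delete(long id);
    }

    // Hypothetical entities; the original post does not name them.
    static final class Customer {
        final String name;
        Customer(String name) { this.name = name; }
    }
    static final class Product {
        final String sku;
        Product(String sku) { this.sku = sku; }
    }
    static final class Order {
        final long customerId;
        Order(long customerId) { this.customerId = customerId; }
    }

    interface CustomerService extends CrudService<Customer> {}
    interface ProductService extends CrudService<Product> {}
    interface OrderService extends CrudService<Order> {}

    // The fourth entity exposes two operations rather than CRUD (names assumed).
    interface ReportService {
        int countOrders(long customerId);
        String summarize(long orderId);
    }

    // Toy in-memory implementation, for illustration only.
    static class InMemoryCrud<T> implements CrudService<T> {
        private final Map<Long, T> store = new HashMap<>();
        private long nextId = 1;

        public long create(T entity) {
            long id = nextId++;
            store.put(id, entity);
            return id;
        }
        public Optional<T> read(long id) {
            return Optional.ofNullable(store.get(id));
        }
        public boolean update(long id, T entity) {
            if (!store.containsKey(id)) return false;
            store.put(id, entity);
            return true;
        }
        public boolean delete(long id) {
            return store.remove(id) != null;
        }
    }

    public static void main(String[] args) {
        CrudService<Customer> customers = new InMemoryCrud<>();
        long id = customers.create(new Customer("Ada"));
        System.out.println("stored: " + customers.read(id).isPresent());
        System.out.println("deleted: " + customers.delete(id));
    }
}
```

The whole contract fits on one screen, which is the contrast with the multi-day WSDL effort that the quote draws.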
Latent in Pritchett's post is the observation that translating a conceptual service into WSDL is harder than defining that conceptual interface in a programming language. One reason is that WSDL code was not meant for human consumption:
On the subject of human readability, WSDL fails miserably... It is similar to reading XML schemas, only harder. Some of you may decide at this point that I'm not much of a software engineer if I find XSD and WSDL difficult to read, and so be it. But I can read a DTD and RELAX NG specifications with ease. I would expect any specification that purports to be a mechanism to allow developers to communicate interface semantics to be clearly understood by developers and not force them to rely upon tools to translate into languages they know.
"So what", some will argue. WSDL is about allowing tools to generate interfaces and is not intended for human consumption anyway. I'll argue that it lacks sufficient constraints to allow that to work well either...
I dare to boldly state that WSDL is an impediment to building services. I'm sure the intentions were good and honorable but the result misses the mark. There is a next generation WSDL in the works but it doesn't appear the primary goal is to improve either of these issues. If anything, I am concerned that 2.0 will get further from human readability than 1.1.
So what makes a good IDL? Pritchett has a few suggestions:
Provide syntax and semantics that are more readily understood by developers. I think XML is a reasonable tool for such an IDL but the focus should be on readability. Think RELAX NG vs XSD and you have the idea.
Support a variety of interactions... It should also be possible to describe REST interactions with this IDL.
Provide more control over the transport binding. The primary transport in use today is HTTP. My IDL would allow me to leverage all HTTP operations, specify headers that are meaningful as well as the content-type of the body...
Don't preclude code generation. There is nothing wrong with code generation, I just feel that it should go from the IDL to the implementation and not vice versa...
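One way to picture the transport-binding suggestion is as a small, readable record of how an operation maps onto HTTP. The Java sketch below is only an illustration of that idea, not anything Pritchett proposes: the `HttpBinding` class, the `updateCustomer` operation, and the path and header values are all assumptions made for the example, and the list of HTTP methods is illustrative rather than complete.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BindingSketch {

    // Illustrative subset of HTTP methods.
    enum HttpMethod { GET, HEAD, POST, PUT, DELETE, OPTIONS }

    // Hypothetical description of how one operation binds to HTTP.
    static final class HttpBinding {
        final String operation;
        final HttpMethod method;
        final String path;
        final String contentType;
        final Map<String, String> headers = new LinkedHashMap<>();

        HttpBinding(String operation, HttpMethod method, String path, String contentType) {
            this.operation = operation;
            this.method = method;
            this.path = path;
            this.contentType = contentType;
        }

        // Adds a header and returns this binding so calls can be chained.
        HttpBinding header(String name, String value) {
            headers.put(name, value);
            return this;
        }

        // Renders the binding in a form a developer can read at a glance.
        String describe() {
            StringBuilder sb = new StringBuilder();
            sb.append(operation).append(": ").append(method).append(' ').append(path).append('\n');
            sb.append("  Content-Type: ").append(contentType).append('\n');
            for (Map.Entry<String, String> h : headers.entrySet()) {
                sb.append("  ").append(h.getKey()).append(": ").append(h.getValue()).append('\n');
            }
            return sb.toString();
        }
    }

    public static void main(String[] args) {
        HttpBinding binding =
            new HttpBinding("updateCustomer", HttpMethod.PUT, "/customers/{id}", "application/xml")
                .header("Accept", "application/xml");
        System.out.print(binding.describe());
    }
}
```

The point of the sketch is that the method, headers, and body content type are stated explicitly and can be read directly, without a tool to translate them.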
What's on your list of requirements for a good IDL?
Dan Pritchett is quite right to question the effectiveness of WSDL as an IDL. Interfaces are not just for code generators: they form the focus of discussion and design. They play a role in communication between developers, so readability is not just some optional afterthought: it is essential.
A textual notation that is inconvenient for people defeats many of the benefits of having a textual notation in the first place. If machine readability is all that counts, there are far better and more compact approaches. If human readability and writability matter, I wouldn't start from XML. Although in theory applications of XML are supposed to be human readable, it appears that in practice this is true for a relatively small class of humans. WSDL is, of course, not the only offender, and many XML-based DSLs just demonstrate how easy it is to make simple tasks look more complicated than they really are.
CORBA IDL is not perfect, but it is less imperfect than WSDL. The design of CORBA IDL was based on familiarity of syntax for its primary class of users (C and C++ programmers) and on previous experience with the less readable DCE IDL. Applying the same principles now, over a decade and a half on and with more curly-bracket languages in widespread use, suggests that something of a similar form would still be a good starting point for many IDLs.