Summary
SableCC is a compiler generator written in Java. Like ANTLR, it provides an object-oriented framework for parsing source files and building an abstract syntax tree from them. In a blog post, Martin Fowler introduces SableCC and gives a few examples of writing grammars for it.
Like other compiler-compiler frameworks such as ANTLR and JavaCC, SableCC provides an object-oriented interface to a parser and compiler generator. SableCC is based on the master's thesis of its author, Etienne Gagnon, and is available under the LGPL.
According to the project's documentation, SableCC's design rests on two fundamental decisions:
Firstly, the framework uses object-oriented techniques to automatically build a strictly-typed abstract syntax tree. Secondly, the framework generates tree-walker classes using an extended version of the visitor design pattern which enables the implementation of actions on the nodes of the abstract syntax tree using inheritance.
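To make the second point concrete, here is a minimal hand-written sketch of the idea in Java. It is not SableCC's generated output: every class and method name below (`Node`, `ConfigNode`, `ItemNode`, `Visitor`, `DepthFirstWalker`, `ItemCollector`) is an assumption chosen for illustration. It only shows the general shape: typed tree nodes, a default walker that visits every node, and an action added by subclassing the walker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for generated node classes; names are illustrative only.
abstract class Node {
    abstract void apply(Visitor v);
}

final class ItemNode extends Node {
    final String name;
    ItemNode(String name) { this.name = name; }
    @Override void apply(Visitor v) { v.caseItem(this); }
}

final class ConfigNode extends Node {
    final String name;
    final List<ItemNode> items;
    ConfigNode(String name, List<ItemNode> items) {
        this.name = name;
        this.items = items;
    }
    @Override void apply(Visitor v) { v.caseConfig(this); }
}

// One method per node type, so each callback receives a strictly typed node.
interface Visitor {
    void caseConfig(ConfigNode node);
    void caseItem(ItemNode node);
}

// Default walker: visits children depth-first and otherwise does nothing.
class DepthFirstWalker implements Visitor {
    @Override public void caseConfig(ConfigNode node) {
        for (ItemNode item : node.items) {
            item.apply(this);
        }
    }
    @Override public void caseItem(ItemNode node) { }
}

// The "action" lives in a subclass; the tree classes and the walker are untouched.
class ItemCollector extends DepthFirstWalker {
    final List<String> names = new ArrayList<>();
    @Override public void caseItem(ItemNode node) {
        names.add(node.name);
    }
}

class VisitorDemo {
    // Walks the tree with an ItemCollector and returns the item names in visit order.
    static List<String> collect(ConfigNode root) {
        ItemCollector collector = new ItemCollector();
        root.apply(collector);
        return collector.names;
    }
}
```

In this sketch, changing what happens at an item node means editing only `ItemCollector`; the node classes and the default walker, which a generator would produce, do not need to be regenerated.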
Martin Fowler recently took SableCC for a test drive and reported his findings in a blog entry titled HelloSablecc:
SableCC is a bit awkward to use. There's little documentation, other than the author's thesis. Fortunately the thesis is much more understandable than many others I've come across so I was able to figure out how to get things going. During my work I made a mistake in the grammar and found it tricky to get diagnostics. The error messages weren't too informative and I resorted to debuggers and print statements inside the generated parser code... ANTLR scores rather better on this front. Recursive descent parsers are easier to follow and there is an ANTLR book in the works which should help me a lot as I explore that.
Fowler notes the manner in which SableCC creates parse trees, one of the framework's distinguishing characteristics:
[Defining the grammar] doesn't say how to get from the input to my configuration and item objects. In order to do this I need to write some code to map between what I've parsed and the objects I want to create. In most compiler-compilers I do this by embedding actions into the grammar. SableCC, however, works another way. It automatically creates a parse tree and then gives me a visitor to walk this parse tree. I can then subclass the visitor to do interesting things. In this case, as I walk the parse tree, I take each item node on the parse tree and turn it into the real items in my model.
This sort of extended visitor pattern keeps the parser code separate from the code that operates on parse tree nodes, but Fowler was not convinced that it is better than the more conventional approach of embedding actions in the grammar:
So far I'm not convinced by the approach of removing parser actions and automatically generating a parse tree. Since it's a parse tree, you have to walk it to do anything useful. Actions can create a more abstract representation than a parse tree, which are easier to work with. ANTLR has some interesting looking features for that which I'll be exploring. One plus in SableCC's approach is that if I'm making changes to the tree walker I don't need to re-generate—which keeps me in IntelliJ. However ANTLR has ANTLRWorks which plugs into IntelliJ and looks very nice.
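For comparison, the conventional style Fowler refers to, in which actions build the model directly while parsing, might look like the following sketch. The tiny grammar, the class names (`Configuration`, `Item`, `ActionParser`), and the input format are invented for illustration and do not come from Fowler's post or from any particular tool. A generator such as ANTLR or JavaCC would normally produce the parsing code from a grammar with embedded actions; it is written by hand here only to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model classes: the objects the application actually wants.
final class Item {
    final String name;
    Item(String name) { this.name = name; }
}

final class Configuration {
    final String name;
    final List<Item> items = new ArrayList<>();
    Configuration(String name) { this.name = name; }
}

// Hand-written recursive-descent parser for a made-up grammar:
//   config := "config" NAME item*
//   item   := "item" NAME
// Model objects are created inside the parsing methods; no parse tree is built.
final class ActionParser {
    private final String[] tokens;
    private int pos = 0;

    ActionParser(String input) {
        // Whitespace-separated tokens are enough for this sketch.
        this.tokens = input.trim().split("\\s+");
    }

    Configuration parseConfig() {
        expect("config");
        Configuration config = new Configuration(next()); // action: create the model object
        while (pos < tokens.length) {
            config.items.add(parseItem());                // action: attach each item as it is parsed
        }
        return config;
    }

    private Item parseItem() {
        expect("item");
        return new Item(next());
    }

    private String next() {
        if (pos >= tokens.length) {
            throw new IllegalArgumentException("unexpected end of input");
        }
        return tokens[pos++];
    }

    private void expect(String keyword) {
        String token = next();
        if (!token.equals(keyword)) {
            throw new IllegalArgumentException(
                "expected '" + keyword + "' but found '" + token + "'");
        }
    }
}
```

Calling `new ActionParser("config demo item a item b").parseConfig()` yields a `Configuration` with two items and no intermediate tree. The trade-off is that any change to what happens at an item means touching the parser, or regenerating it when the actions live in a grammar file.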
What Java parser have you found the easiest to use for code generation?
Hi, I'm writing an interpreter and I chose which compiler generator to use just a few weeks ago. I evaluated SableCC and JavaCC and chose the latter.
Note: I have no problem using LR or LL parsing.
Here are the reasons for my choice at the time I made it:
-JavaCC can produce an abstract syntax tree using jjtree, but it doesn't force you to do so. You can write your own actions quite simply.
-SableCC produces code for JDK 1.5 (it uses annotations), while JavaCC lets you choose the JDK version. This is important for me because I plan to use gcj, which doesn't support 1.5.
-JavaCC comes with nice, very useful examples and a lot of valuable LL grammars. SableCC also has some grammars, but I couldn't find examples.
-There is little information about the speed of the generated code, but it seems JavaCC could produce faster code. I'm not actually sure about this.
My experience with JavaCC after a few weeks is quite good. JavaCC and jjtree integrate nicely with Eclipse: it can compile the grammar with one keystroke, and it is quite smart about generating files. Using the Eclipse debugger on them is straightforward. Also, IMHO a compiler-compiler that allows me to write actions is a handier and more valuable tool. I started writing the grammar with actions, but it soon became difficult to maintain (I was still developing the grammar). Then I switched to jjtree, without using the visitor pattern (which is optional). A jjtree grammar is cleaner: it generates classes for productions, but you can choose whether or not to generate a class for a given production, and what to name it (so the name need not match the production's). Also, all generated AST nodes inherit from SimpleNode.java. By editing it, you can add methods to dump the tree.
IMHO the point is this: if you want to write a simple language, you should prefer writing your own actions. If you want to write something complex, use an AST generator; you will probably need to do type-checking on the tree, then build another (final?) AST with your own node classes.