Summary
Last week I released a new version of ScalaTest (0.9.5) that includes a "matchers DSL" for writing more expressive assertions in tests. In this post I show differences between ScalaTest matchers and those in Ruby's RSpec tool, and discuss some of the general differences in DSL creation in Ruby and Scala.
Dynamic languages such as Ruby and Groovy have a reputation for enabling "internal" domain specific language (DSL) creation, but internal DSLs are not only a feature of dynamic languages. Although Scala is statically typed, its flexible syntax is quite accommodating to internal DSLs. However, the different languages place different constraints on DSL design.
Invoking methods without parentheses (or dots)
One feature of Ruby that helps in DSL creation is that you can leave off parentheses when you invoke a method. For example, given a string in Ruby:
>> s = "hello"
=> "hello"
You can determine whether it contains a substring like this:
>> s.include?("el")
=> true
Or, alternatively, by leaving off the parentheses, like this:
>> s.include? "el"
=> true
Scala does not let you leave off the parentheses in the same way:
scala> val s = "hello"
s: java.lang.String = hello
scala> s.contains("el")
res5: Boolean = true
scala> s.contains "el"
<console>:1: error: ';' expected but string literal found.
s.contains "el"
^
However, Scala supports an "operator notation," which allows you to leave off both the dot and the parentheses:
scala> s contains "el"
res6: Boolean = true
By contrast, Ruby does not support this kind of operator notation:
>> s include? "el"
(irb):21: warning: parenthesize argument(s) for future version
NoMethodError: undefined method `include?' for main:Object
from (irb):21
Adding methods to existing classes
Another feature of Ruby that facilitates DSL creation is its open classes, which among other things allow you to add new methods to existing classes. For example, class String in Ruby has no method named should:
>> "".should
NoMethodError: undefined method `should' for "":String
from (irb):1
Nevertheless, here's how you could, using open classes, add a method named should to class String in Ruby:
>> class String
>> def should
>> "should was invoked!"
>> end
>> end
=> nil
Now you can invoke should on a Ruby String:
>> puts "".should
should was invoked!
=> nil
Scala, being statically typed, doesn't support open classes. The methods supported by a class are fixed at compile time. However, Scala's implicit conversion feature provides much the same benefit, allowing you to write code in which it appears you are invoking new methods on existing classes. For example, because Scala's string is java.lang.String, you can't invoke should on it:
scala> "".should
<console>:5: error: value should is not a member of java.lang.String
"".should
^
Nevertheless, you can define an implicit conversion from String to a type that does have a should method. The Scala compiler will apply the implicit conversion to solve a type error. Here's how you could define the implicit conversion:
scala> class ShouldWrapper(s: String) {
| def should = "should was invoked on " + s
| }
defined class ShouldWrapper
scala> implicit def convert(s: String) = new ShouldWrapper(s)
convert: (String)ShouldWrapper
Given this implicit conversion, you can now write code that appears to invoke should on a string:
scala> "howdy".should
res10: java.lang.String = should was invoked on howdy
Behind the scenes, the Scala compiler will implicitly convert the String to a ShouldWrapper, and then invoke should on the ShouldWrapper, like this:
scala> convert("howdy").should
res11: java.lang.String = should was invoked on howdy
Comparing Matchers DSLs
Ruby's RSpec tool includes a matchers DSL that allows you to write assertions in tests that look like this:
result.should be_true # this is RSpec
result.should_not be_nil
num.should eql(5)
map.should_not have_key("a")
One thing to note is that Ruby's convention of separating words with underscores helps make these expressions read more like English. Between each word is either a space, underscore, or dot. In Scala, you could use operator notation to get rid of the dot, yielding expressions like:
result should be_true // Could do this in Scala
result should_not be_null
num should eql(5)
map should_not have_key("a")
The problem is that this use of the underscore is not idiomatic in Scala. Like Java, Scala style suggests using camel case, which would yield expressions like:
result should beTrue // Could do this in Scala
result shouldNot beNull
num should eql(5)
map shouldNot haveKey("a")
This works, but is not quite as satisfying, because the words do not separate as nicely in camel case compared to underscores. When designing a matchers DSL for ScalaTest, I decided to try and see how far I could go with operator notation. The corresponding expressions in ScalaTest are:
result should be (true) // This is ScalaTest
result should not be (null)
num should equal (5)
map should not contain key ("a")
The parentheses on the rightmost value are not always required, but the rule is subtle, so I recommend you always use them. The parentheses also serve to emphasize what is usually the expected value. Here's how one of these expressions will be rewritten by the Scala compiler, when it desugars the operator notation back into normal method call notation during compilation:
result.should(not).be(null)
The should method is invoked on result (via an implicit conversion), passing in the object referred to by a variable named not. Then be is invoked on that return value, passing in null. In other words, in this expression, operator notation is used twice in a row.
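To make that desugaring concrete, here is a minimal sketch of how such an expression could be made to compile. It is not ScalaTest's actual implementation, and it handles only the negated form. The names MiniMatchers, NotWord, ShouldWrapper, and convertToShouldWrapper are hypothetical, chosen for illustration; the sketch also assumes a Scala 2 compiler recent enough to support string interpolation and the scala.language import.

```scala
import scala.language.implicitConversions

object MiniMatchers {
  // Hypothetical marker object: the value passed to should in `x should not be (y)`.
  object not

  // Hypothetical type returned by should(not); its be method performs the negated check.
  class NotWord(left: Any) {
    def be(right: Any): Unit =
      if (left == right) throw new AssertionError(s"$left was $right")
  }

  // Wrapper that carries the should method, in the spirit of the ShouldWrapper shown earlier.
  class ShouldWrapper(left: Any) {
    def should(word: not.type): NotWord = new NotWord(left)
  }

  // Implicit conversion that lets should appear to be a method on any value.
  implicit def convertToShouldWrapper(left: Any): ShouldWrapper =
    new ShouldWrapper(left)
}

object Demo {
  import MiniMatchers._

  def main(args: Array[String]): Unit = {
    val result = "howdy"
    result should not be (null)                          // operator notation, used twice
    result.should(not).be(null)                          // the same call in method call notation
    convertToShouldWrapper(result).should(not).be(null)  // with the implicit conversion written out
  }
}
```

All three lines in main perform the same check. In this sketch the check throws an AssertionError when the left-hand value equals the argument passed to be, and otherwise returns normally.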
Conclusion
When designing an internal DSL, you don't have as much freedom as when you design an external DSL—i.e., a new language from scratch. With an internal DSL you need to work within the confines of the host language, and so will your users. In RSpec's matchers, for example, users need to keep track of where to put dots, underscores, and spaces. Similarly, in ScalaTest matchers, users need to keep track of where to put parentheses. In both cases, the syntax is nevertheless quite easy to learn, and the resulting code is quite readable.
It is interesting to notice that Python has operator overloading, and so writing DSLs is perfectly possible, just as in Ruby and Scala. However, the community is not fond of magic, and usually prefers to use plain old Python syntax for everything. I must cite this spectacular recipe about infix operators, though: http://code.activestate.com/recipes/384122/
> It is interesting to notice that Python has operator
> overloading, and so writing DSLs is perfectly possible, just
> as in Ruby and Scala. However, the community is not fond
> of magic, and usually prefers to use plain old Python syntax
> for everything.
There is, I think, a readability tradeoff involved with internal DSLs. I find that they can make code more readable in the sense of quickly understanding the intent of the programmer. But they can make code less readable in the sense of understanding how the code actually implements that intent, i.e., how it works. So long as the code is working, and you don't need to know the implementation details, I think it helps productivity by making the intent of the programmer easier to see. But if something goes wrong and you have to dig into how it works, it could slow you down. Usually these things tend to work as advertised, so on the whole they are a net productivity win.
People complain about Java being verbose, and I think one part of that verbosity is that Java always spells out how things are working. The first matcher DSL I remember hearing about was Hamcrest in Java, which lets you say things like:
assertThat(result, is(not(equalTo(null))));
This is also readable as an English statement, kind of, but it is clearer at the same time what is a method invocation and what are parameters being passed.
I don't want to sound like a jerk, but this comparison seems to be more about how to make Ruby and Scala look like each other, not a comparison about how suitable either language is for making a clear, understandable DSL. Though I would argue that comparing them in this manner leads me to believe that both do the job quite nicely.
It looks like you and Ola Bini have had conversations in the past. Perhaps you could write a similar article about the suitability of Ioke for DSLs. Its run-time malleability and message-based execution seem to make it as suitable a candidate as Ruby.
> I don't want to sound like a jerk, but this comparison
> seems to be more about how to make Ruby and Scala look
> like each other, not a comparison about how suitable
> either language is for making a clear, understandable DSL.
> Though I would argue that comparing them in this manner
> leads me to believe that both do the job quite nicely.
Yes, I think both do the job nicely. My main goals here were, in general, to demonstrate that internal DSL creation does not require dynamic typing and, in particular, to show you can do this kind of thing in the statically typed language Scala. That's all.
> There is I think a readability tradeoff involved with
> internal DSLs. I find that they can make code more
> readable in the sense of quickly understanding the intent
> of the programmer. But they can make code less readable in
> the sense of understanding how the code actually
> implements that intent, i.e., how it works. So long as the
> code is working, and you don't need to know the
> implementation details, I think it helps productivity by
> making the intent of the programmer easier to see. But if
> something goes wrong and you have to dig into how it
> works, it could slow you down. Usually these things tend
> to work as advertised, so on the whole they are a net
> productivity win.
I was talking to a programmer at Twitter last night (where they use Scala for some stuff), and he had another take on high-level code. He basically said that the tradeoff with high-level code is that although it is easy to gather the intent of the programmer, it is not as easy to guess its performance. So I was thinking you'd really only need to know the how if things don't work, but he pointed out you also want to know the how if things work but are too slow. And I think he meant it more in terms of writing the code: as he writes it, he's not as sure of how it will perform.
Bill Venners wrote:
> I was talking to a programmer at Twitter... said that the tradeoff with
> high level code is that although it is easy to gather the
> intent of the programmer, it is not as easy to guess the
> performance of it.
I've had a similar experience with highly-layered and abstracted C++ code using templates. I saw 18 months of work thrown away because the performance was inadequate and the guy responsible had no idea how to take his layers apart to work out how to fix it.
I can't remember why his work didn't come under any review but it was the kind of organisation where if people didn't put their work forward for discussion, they got left to pursue their own approach.