A recent post on the Scala mailing lists stated that (as a rule of thumb) one in every ten lines of freshly written code contains a defect.
The "industry standard" is somewhere closer to 15-50 defects for every 1000 lines of code in production.
Between being written and released, code passes through pair programming, rich IDEs, static analysis, unit testing,
continuous integration, code reviews, integration testing and release candidates.
At every step more errors are detected, accounting for the difference between defects in fresh code and defects in production.
All of these techniques are now widely used in the industry
(with the possible exceptions of pair programming and code reviews),
but they are running into a wall of diminishing returns.
It's becoming increasingly costly to add new classes of error to static analysis,
or to move code coverage from 80% to 90%. Going from 10% to 20% was much easier.
One idea runs through all of these numbers:
the defect rate seems unaffected by the choice of programming language.
So 1 in every 10 lines of Scala will contain a defect, as will 1 in every 10 lines of assembly.
This matters when considering how many lines of code each of
these languages needs in order to implement the same feature.
Given that the rate of defect creation remains constant
and techniques for detection are slowing down,
it makes sense that defects should be tackled from the other direction;
instead of increasing the rate of detection we can reduce the rate of creation.
Specifically, by reducing the number of lines of code.
Modern languages have already made great progress in this area.
For example, studies comparing Scala and Java report that the same feature requires anywhere from 3x to 10x fewer lines in the Scala implementation.
The extra lines needed in Java are just accidental complexity;
as well as providing more places for defects to appear they create more noise.
This is bad news for the developer who later has to come back to read and maintain the code, and it is exactly what we want to avoid: longer code, with more bugs, that is harder to maintain.
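As a back-of-envelope illustration of the argument so far, the sketch below applies the 1-in-10 rule of thumb to the same feature at different sizes. It assumes, as the post does, that the rate is the same in every language; the 3000-line starting point and the names DefectEstimate and expectedDefects are made up for the example.

```scala
// Rough arithmetic only: the 1-in-10 rate and the 3x/10x ratios are the
// rules of thumb quoted in the text, not measured data.
object DefectEstimate {
  val linesPerDefect = 10 // "one in every ten lines" of fresh code

  // Expected defects in freshly written code of the given size.
  def expectedDefects(lines: Int): Int = lines / linesPerDefect

  def main(args: Array[String]): Unit = {
    val javaLines = 3000 // hypothetical size of a feature written in Java
    for (ratio <- List(1, 3, 10)) {
      val lines = javaLines / ratio
      println(lines + " lines -> about " + expectedDefects(lines) + " defects")
    }
  }
}
```

With a constant rate, the expected number of defects falls in direct proportion to the line count.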
There are a number of features in Scala that help keep the daemon boilerplate under control.
Many of these have already been documented elsewhere, but they include:
Automatic creation of getter/setter methods for properties:
So no need to litter code with getXXX and setXXX methods
Functional constructs:
Including closures, for-comprehensions and pattern matching.
This:
List(2,3,4).foreach(println)
is shorter than this:
for (x <- List(2, 3, 4)) {
  println(x)
}
Type inference:
Java's type system encourages duplication, which leaves open the possibility that
a type might be modified in one place but forgotten in another,
thus introducing a defect.
Instead of:
Map<Integer, String> theMap = new HashMap<Integer, String>();
a Scala developer can write:
val theMap = HashMap[Int, String]()
There's no need to duplicate the <Integer, String> construct.
Traits and Mixins:
Fragments of concrete code can be gathered together in a trait, which can then be mixed in when constructing another object.
This avoids a lot of code duplication when compared to implementing an interface in Java. Code duplication usually involves
copying and pasting, long known to be a cause of errors.
No checked exceptions:
You don't need to declare that you throw an IOException simply because you call another method that declares it.
All too often, a developer will catch such exceptions and silently swallow them to avoid having to explicitly add
the same declaration to callers, and callers of the callers, etc. This is a bit of a hack and usually only done
with the intention of doing the right thing at a later date, but - all too often - tomorrow never comes.
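A few of the features listed above can be seen together in one small sketch. All of the names here (FeatureSketch, Person, describe and so on) are invented for illustration and don't come from any library.

```scala
object FeatureSketch {
  // Constructor parameters marked var/val get accessors generated for them:
  // `name` can be read and written, `age` can only be read.
  class Person(var name: String, val age: Int)

  // Pattern matching replaces a chain of instanceof checks and casts.
  def describe(x: Any): String = x match {
    case p: Person => p.name + " (" + p.age + ")"
    case n: Int    => "the number " + n
    case _         => "something else"
  }

  // A for-comprehension that builds a new list from an existing one.
  val doubled: List[Int] = for (x <- List(2, 3, 4)) yield x * 2

  // Type inference: Map[Int, String] is worked out from the right-hand side.
  val theMap = Map(1 -> "one", 2 -> "two")

  def main(args: Array[String]): Unit = {
    val p = new Person("Ann", 30)
    p.name = "Anne" // calls the generated setter
    println(describe(p))
    println(doubled)
    println(theMap(2))
  }
}
```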
Scala code also tends to produce better quality errors.
By using Option correctly, NullPointerExceptions can be avoided.
Without checked exceptions, it's less likely that the true cause of a problem will become obscured in stack traces.
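To show what using Option correctly might look like, here is a minimal sketch built on a lookup that can fail. The ports table and the portFor and describe helpers are hypothetical; the only library behaviour relied on is that Map#get returns an Option.

```scala
object OptionSketch {
  // Hypothetical lookup table, for illustration only.
  val ports: Map[String, Int] = Map("http" -> 80, "https" -> 443)

  // Map#get returns Option[Int] rather than null when the key is absent.
  def portFor(scheme: String): Option[Int] = ports.get(scheme)

  // The caller is forced to decide what a missing value means.
  def describe(scheme: String): String = portFor(scheme) match {
    case Some(port) => scheme + " uses port " + port
    case None       => "no known port for " + scheme
  }

  def main(args: Array[String]): Unit = {
    println(describe("http"))
    println(describe("ftp"))
  }
}
```

Because the absence of a value is part of the return type, forgetting to handle it shows up at compile time (or as an incomplete match) instead of as a NullPointerException later on.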
So less is more, right?
Well, sometimes... if you're reducing accidental complexity then less is definitely a good thing.
But not always; many design patterns recognised as good practice also tend to increase the line count of a system.
For example:
introducing an interface
splitting a class into several classes or a function into several functions
unit tests
using composition instead of inheritance
many of the GoF patterns
lots of Good Things(tm) will add code...
Of course, this isn't without benefit. Many patterns help to express intent more effectively and make the code more testable.
Good refactoring and splitting up large blocks of functionality will make the code easier to maintain,
and the newly-named fragments also help to document the code.
It might seem a contradiction that fewer lines are good and more lines are good,
but the increased line count here isn't just adding accidental complexity, it's adding structure and intent and documentation.
This all helps maintainers and testers to keep defects down, so the main goal is still being achieved!
With one exception... The use of forwarders in object composition, decorators, etc.
Take the following example (adapted from an article in Wikipedia):
abstract class I {
  def foo(): Unit
  def bar(): Unit
  def baz(): Unit
}
class A extends I {
  def foo() = println("a.foo")
  def bar() = println("a.bar")
  def baz() = println("a.baz")
}
class B(a : A) extends I {
  def foo() = a.foo() // call foo() on the a-instance
  def bar() = println("b.bar")
  def baz() = a.baz()
}
val a = new A
val b = new B(a)
Here, class B implements the contract of I by delegating some of the work to an instance of A.
In a worst-case scenario this could lead to class B containing tens of forwarder methods that do nothing
but call through to an instance of A, with hundreds of lines of code just to state that:
For any functionality not implemented in this class, delegate to the member variable "a"
It's almost as bad as JavaBean properties...
If "A" doesn't need any additional logic to create an instance, and it's always constructed alongside an instance of B (or some other class)
for purposes of delegation, then it can be made a trait - and the problem is solved:
trait A {
  def foo() = println("a.foo")
  def bar() = println("a.bar")
  def baz() = println("a.baz")
}
class B extends A {
  override def bar() = println("b.bar")
}
val b = new B
The trait A contains both the contract and default implementation for the methods
(although it could also leave some definitions abstract if desired).
B overrides bar() and inherits foo() and baz() unchanged.
Traits can also be mixed in when constructing a value, as in "new X with Y",
which yields a value of type "X with Y".
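Mixing several traits in at construction time can be sketched as follows. The class and trait names (Service, Started, Stopped) are invented for the example, and they are chosen so that no two of them define the same concrete method, since that would need an explicit override to resolve.

```scala
object MixinSketch {
  class Service {
    def name: String = "service"
  }

  // Two independent fragments of concrete behaviour.
  trait Started { def start(): String = "started" }
  trait Stopped { def stop(): String = "stopped" }

  // Both traits are mixed in at the point where the value is created,
  // so s has the members of Service, Started and Stopped.
  val s = new Service with Started with Stopped

  def main(args: Array[String]): Unit = {
    println(s.name)
    println(s.start())
    println(s.stop())
  }
}
```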
Mix-ins help, a lot! But if "A" has to be looked up via JNDI, or needs a factory method
to construct, or already exists at the time we need to use it, then mix-ins are powerless to help.
Autoproxy is a Scala compiler plugin created to help with exactly this situation.
By using a simple annotation, the compiler can be instructed to generate
delegates in situations where mix-ins just don't help.
Returning to the original example:
abstract class I {
  def foo(): Unit
  def bar(): Unit
  def baz(): Unit
}
class A extends I {
  def foo() = println("a.foo")
  def bar() = println("a.bar")
  def baz() = println("a.baz")
}
class B(@proxy a : A) extends I {
  def bar() = println("b.bar")
}
val a = new A
val b = new B(a)
The @proxy annotation will generate the foo() and baz() methods in class B,
identical to the hand-written versions shown previously.
Using @proxy with a trait, things become even easier:
trait A {
  def foo() = println("a.foo")
  def bar() = println("a.bar")
  def baz() = println("a.baz")
}
class B(@proxy a : A) {
  def bar() = println("b.bar")
}
val a = new A {} // a trait can't be instantiated directly, so use an empty anonymous subclass
val b = new B(a)
Behind the scenes, traits are implemented as interfaces plus a separate class containing any concrete implementation.
This means that @proxy can add A (the interface) to superclasses of B, allowing B to be used as an instance of A.
There is no need to explicitly break out I as an interface.
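To make the trait case concrete, here is a hand-written sketch of roughly what the description above implies: B gains A as a supertype and forwards foo() and baz() to the wrapped instance. This is an assumption based on that description, not the plugin's actual output. The methods return strings instead of printing so the result is easy to check, and callAll is a hypothetical helper.

```scala
object ProxySketch {
  trait A {
    def foo(): String = "a.foo"
    def bar(): String = "a.bar"
    def baz(): String = "a.baz"
  }

  // Hand-written stand-in for `class B(@proxy a: A)`:
  // A is added as a supertype and the missing methods forward to `a`.
  class B(a: A) extends A {
    override def foo(): String = a.foo() // forwarder
    override def bar(): String = "b.bar" // B's own implementation
    override def baz(): String = a.baz() // forwarder
  }

  // Accepts anything that is an A, which now includes B.
  def callAll(x: A): List[String] = List(x.foo(), x.bar(), x.baz())

  val a: A = new A {}
  val b = new B(a)

  def main(args: Array[String]): Unit =
    println(callAll(b))
}
```

The two forwarders are exactly the kind of code that carries no information beyond "delegate to a", which is what the annotation is meant to save you from writing.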
The wiki and source for the autoproxy plugin can be found on github.
While I appreciate Scala's ability to cut down boilerplate, arguing that not explicitly writing setters/getters (or putting the foreach on a single line, or not duplicating the type parameters between variable declaration and instantiation) leads to fewer defects is kind of far-fetched. How many bugs have you seen in Java getters lately anyway? I can't recall seeing one in years!
I've seen errors get missed in review before due to this.
I'll agree that getters and setters very rarely contain defects. More commonly, a method will be buried inside a mountain of tool-generated getters and setters, which multiply the size of the source file and obscure the really important stuff.
Nice post and great idea! I think (and this has been discussed on the mailing list already) that this delegation feature is an important feature to design systems based on components.
I hope this will make it to the main language one day.
This is really awesome! I do have one question, though. Why is it called Autoproxy and why the @proxy annotation? I think Autodelegate and @delegate would be much clearer and more appropriate names for the feature and annotation.
Either way, I will definitely be using this in the future. I always try to use composition over inheritance but always have to write all those delegation methods by hand. In Java IDE support is good enough that I can generate the delegate methods easily enough, but it does tend to obscure the bits of the class that are actually of interest to the reader. Unfortunately, IDE support for Scala isn't quite there yet (at least, not in Eclipse).
Just wondering, where can I find out more about this data? "The "industry standard" is somewhere closer to 15-50 defects for every 1000 lines of code in production."
For the "industry standard" I did a few google searches. This was the range that most references seemed to agree is accurate. I haven't cited any specific research as these statistics weren't really the core focus of the article.
I have Dave Griffith to thank for the original 1-in-10 number, and for being the first to post on the Scala lists that reducing code size is a way to reduce defects.
This thinking was so close to my own when I conceived the plugin that I couldn't resist quoting the numbers...
This is one of my most-asked questions. Basically, I wanted to use a verb and not a noun, so my options were:
- with
- proxy
- compose
- forwardto
- delegateto
@with was my favourite, because the term was already used in Scala, but it's not allowed as it's a keyword. @proxy was the next shortest and I wanted to keep things concise.
I'm working here with the definition: "B proxies member a to create the delegates (or forwarders) foo and baz"
So the annotated member is not a delegate; that title is more accurately applied to the synthesized members.
Yes, this was written against 2.8. Autoproxy is a compiler plugin, an extension that adds the @proxy annotation to regular Scala syntax. Is this the notation that you don't understand?
> why @proxy, why not @delegate? > > This is one of my most ask questions. Basically, I wanted > to use a verb and not a noun, so my options were: > > - with > - proxy > - compose > - forwardto > - delegateto
What's funny about this answer is that proxy is not a (transitive) verb, and delegate is. And why would "proxy" and "compose" get to stand by themselves but "forward" and "delegate" have to have "to" tacked on? They're all verbs. Why not "proxyfor" and/or "composewith" ?
> I'm working here with the definition: > "B proxies member ...
While programmers do use proxy that way, it's incorrect, at least prescriptively. Maybe programmers have been doing it long enough that we can act like it's a word. Delegate is certainly a transitive verb of the sort you want in both real life and programmerland: "B delegates to member."
The only thing I will say is that the rules for what is a verb and what isn't aren't quite the same in American and British English. After years of having to use Colors instead of Colours, this is one area where I'm very defensive :)