Summary
Dick Wall's talk turns out to be a treasure trove of useful tidbits, and a great introduction to Scala that whets my appetite, big time.
Dick Wall's talk was a revelation. The title was "Funky Java,
Objective Scala". It sounded intriguing, but I didn't really
know what to expect. His talk turned out to be a treasure trove of useful
tidbits, and a great introduction to Scala.
Fortunately, I had recently read John Hughes' seminal paper, Why Functional Programming Matters, originally published in
1990.
That paper did a really nice job of presenting two
significant benefits of functional programming, in a slow, easy-to-follow way:
Lazy Evaluation:
Let's say that function e() operates on entries produced by
function g(). Take a game, for example. The function g()
generates moves, while function e() evaluates them. But the
game allows for an infinite number of moves. Pretty darn hard
to generate them all and then evaluate them. But the evaluation
function e() has cutoff thresholds. It knows when a move is "good enough". Lazy evaluation says to run e() whenever
there is something to run on, and run g() one more step only
when e() needs something else to process. Then you don't care
if g() runs to infinite depth, because e() will find something
acceptable and get back to you sometime before the end of the
century.
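To make that concrete, here's a minimal sketch of my own (not from the paper or the talk), using Scala's lazily evaluated LazyList (called Stream in older Scala versions). The generator g() is infinite, but only as many moves as the evaluator e() demands ever get created:

object LazyMoves extends App {
  // g(): an infinite, lazily generated stream of candidate "moves"
  // (plain integers standing in for game positions).
  def moves(from: Int): LazyList[Int] = from #:: moves(from + 1)

  // e(): evaluates candidates and stops at the first one that clears
  // the cutoff threshold. Generation proceeds only as far as e() asks.
  def evaluate(candidates: LazyList[Int], threshold: Int): Option[Int] =
    candidates.find(move => move * move > threshold)

  println(evaluate(moves(1), 1000)) // Some(32) -- only 32 moves were ever generated
}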
Higher-Order Functions:
We know that modularity is a good thing. And we know that
functions are useful. But when functions are "first class
citizens" (which means you can pass one as an argument), then
whole new realms of modularity open up. For example, having
written one function that operates on entries in a list, you
can pass other functions to it in order to do different things
with the entries in that list. The wrapper function is now a "higher order" function that can be configured with worker-functions to do different kinds of stuff.
The result is a tribute to the inductive hypothesis. Having
tested that a simple function passed to the list is properly
applied to the members, and having tested that the function
you pass is correct, you can be sure that the result of processing
will be correct. (That's the inductive hypothesis. If we can
prove that proposition P is true for N+1 whenever P is true for N,
and we can prove that P is true for 1, then we know P is
true for all numbers. It's the same principle, applied to a list.)
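Here's a sketch of the idea in Scala (my example, not one of Hughes'): a single list-processing function that gets configured with different worker functions:

object HigherOrder extends App {
  // The "higher order" wrapper: walks the list, combining entries
  // with whatever worker function you hand it (essentially a fold).
  def reduceList(items: List[Int], seed: Int)(combine: (Int, Int) => Int): Int =
    items match {
      case Nil          => seed
      case head :: tail => combine(head, reduceList(tail, seed)(combine))
    }

  val xs = List(1, 2, 3, 4)
  println(reduceList(xs, 0)(_ + _)) // 10 -- configured with a "sum" worker
  println(reduceList(xs, 1)(_ * _)) // 24 -- configured with a "product" worker
}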
Dick Wall's Talk
That paper turned out to be just the background I needed to fully
appreciate Dick's talk. He started out by describing the advantages
of functional programming and the situations in which it's helpful.
Then he showed how to do functional programming in Java. That was
educational in itself. But then he went a step further. He showed
how much simpler the code was in Scala, and rattled off a list of
Scala features that I typically associate with dynamic languages
like Ruby.
To say I was intrigued is to put it mildly. The list of features
missed a couple of incredibly powerful things that stem from
fully dynamic classes, but at the same time those features can
be dangerous, and they can make programs harder to understand. So overall, it began to look as though Scala figured out how to strike a really useful balance between flexibility and readability.
Situations that Call for Functional Programming
When I was in school, I loved the fact that programming "recipes" (algorithms) gave me ways to solve problems that weren't well-suited to a mathematical approach. Years later, object-oriented programming turned out to be a terrific tool for dealing with even more complex problems.
But as Dick pointed out, there are still many mathematical/scientific problems that are more easily expressed using combinations of mathematical functions, rather than algorithms and objects. So while objects and algorithms have been terrific for one class of problems typical of ecommerce, GUIs, and even development tools, they have been less than stellar in other arenas--arenas in which "functional decomposition" is an appropriate solution mechanism.
Dick's list of situations that benefit from the functional approach included:
Parallelism: One of the hallmarks of the functional approach is that functions have no side effects. They return a value, but internal state never changes. Because of that invariance, making things operate sequentially simply isn't as critical as it is with an algorithmic approach. And since order is close to inconsequential, many things can happen in parallel. (See the sketch after this list.)
Science and Math Applications: As Dick pointed out, "functional composition" is second nature to mathematicians, and even to scientists. So while it seems weird to programmers at first, it's their natural way to do things. So when you are getting information on how to solve a problem, you're likely to get it in functional form. And when you're describing how your program works, you're likely to be better understood if you're talking in functional terms.
Lots of nested if's, Massive for-loops, Monster methods, Flags for tracking state: These "code smells" suggest a level of complexity that doesn't break down well using the conventional object-oriented, algorithmic approach. Functional decomposition might be just the tool you need to pry the problem apart into manageable pieces.
Anything other than e-commerce: When you're doing network traffic analysis, for example, it might just be that mathematical functions are the ideal way to size up the situation and make real time routing decisions.
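On that parallelism point, here's a minimal sketch of mine (not from the talk). It assumes the scala-parallel-collections module is on the classpath (in older Scala versions, .par shipped with the standard library):

import scala.collection.parallel.CollectionConverters._

object PurityAndParallelism extends App {
  // A pure function: no internal state, no side effects.
  def risk(x: Int): Double = math.sqrt(x.toDouble) * 1.5

  val inputs     = (1 to 1000).toList
  val sequential = inputs.map(risk)
  val parallel   = inputs.par.map(risk).toList // execution order doesn't matter

  println(sequential == parallel) // true: purity makes ordering irrelevant
}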
Functional Programming in Java
The next set of slides showed how to do functional programming in Java. The slides were too far away to read easily, but here's the gist of things:
A class that holds state has private final variables. Once assigned, they're fixed. So once created, the state of that object never changes.
To make a "change", you make a copy of the object with the state variables modified. (For how little code the same pattern takes in Scala, see the sketch at the end of this section.)
To get an instance of the class, you use a Builder. The builder takes the state you want to set as arguments and returns an object for it.
[Note: For reasons that weren't quite clear, the builder had a chain of constructors, such that the first called the second, and so on, with a different state value passed as an argument in each call.]
Take advantage of a variety of Java extensions:
Guava - a Java collection library that provides predicates, functions, and constraints. (Although it's missing the map/reduce operation, which does sophisticated list processing, Lisp-style.)
Typesafe Pair<> and Triple<> tuples that let you pass back multiple results from a function, which is frequently desirable with a functional approach.
Project Lombok - Which automatically adds getters and setters, as well as the equals and hashcode methods needed for comparisons and collections ordering.
LambdaJ and JSR166y extras - which have map/reduce.
Parallel Array - A function that does forks and joins under the covers to kick off a plethora of parallel processes.
The point here is simply that it's possible to do functional programming in Java. (In the talk, Dick showed a simplified version of a chromosome-analyzing risk-assessment algorithm, and showed how it fell apart rather nicely using functional decomposition.)
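For contrast, here's a sketch of mine (not from Dick's slides) showing how Scala collapses the final-fields-plus-builder pattern described above: a case class provides immutable fields and a copy method for free:

case class GameState(score: Int, level: Int) // fields are final; no setters

object CopyDemo extends App {
  val s1 = GameState(score = 0, level = 1)
  val s2 = s1.copy(score = 100) // a "change" is really a modified copy
  println(s1) // GameState(0,1) -- the original never mutates
  println(s2) // GameState(100,1)
}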
Advantages of Functional Programming
When you're used to it, according to Dick, and using it where it's appropriate, functional programs are:
Easier to write
Easier to read
Easier to debug
Easier to test
That last one really caught my attention, because I am a testing maniac. I've coded without tests, and I've coded with them. Coding with a comprehensive collection of tests is way more fun. And it's safer. Programs stay up longer, and when they do crash, they tend to avoid the ultimate sin of losing user data.
When Dick said that functional programs are easier to test, it really rang a bell. Because the first thing I need to do to make code testable is typically to break it up into teeny modules that can be independently tested. Therein lies the path to coding nirvana.
So right about this point in the talk, several things came together: My experience with testing, Dick's comments, and the description of higher order functions in John Hughes' paper. The paper has a marvelous series of examples where you create a list-processing function that takes other functions as an argument. One takes a "sum" function as an argument, and sums the list. Another takes a "product" function, and multiplies the list elements together. He goes on to give several more examples--all of which are variations on the map reduce theme.
The important thing, to circle back to a point made earlier, is that once you know the meta-function works properly on lists of arbitrary size, you only need to add a test for the new function you're adding (sum or product, for example). And all you have to prove with those tests is that the "seed" of the induction hypothesis holds true. You already know that the hypothesis holds for N+1 if it holds for N. All you have to prove is that it does indeed hold for N.
For example, you only need to prove that the sum function produces the correct result for two numbers. It's not hard to do that in a fairly comprehensive manner, given that the task is so small. You care what happens when there is a 1 and a null, a null and a 1, or a null and a null. Maybe half a dozen cases in all. You don't care what happens when there is a null at the beginning of a list, a null at the end, or nulls scattered throughout the list. You don't have to test the list-processing code at the same time as the value-summing code. You divide and conquer.
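A sketch of that divide-and-conquer idea (my example, not from the talk): once the list-processing code is trusted, only the tiny combining function needs new tests, and only for its two-argument seed cases:

object SumSeedTests extends App {
  // The worker under test: combines two possibly missing values.
  def add(a: Option[Int], b: Option[Int]): Option[Int] = (a, b) match {
    case (Some(x), Some(y)) => Some(x + y)
    case (Some(x), None)    => Some(x)
    case (None,    Some(y)) => Some(y)
    case (None,    None)    => None
  }

  // The half-dozen seed cases -- no list-level tests required.
  assert(add(Some(1), Some(2)) == Some(3))
  assert(add(Some(1), None)    == Some(1))
  assert(add(None, Some(1))    == Some(1))
  assert(add(None, None)       == None)
  println("all seed cases pass")
}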
As an additional benefit, debugging becomes easier, too, because you have a large collection of teensy functions. When any function has a bug, there is only a small amount of code to inspect. If function A combines functions B & C, and B and C are both known to work, then the bug has to be in the way A combines them. At any point in time, you're only looking at a small amount of code, so debugging is drastically simplified.
Testing Java with Scala
Highly modular programs are easier to test, and Scala's capacity to create DSLs makes it possible to define a testing language that suits the problem. That's the same sort of thing that the Ruby community has been so successful with. But since Scala works so well with Java, Dick pointed out that Bill Venners was able to make a tidy little DSL wrapper for JUnit and TestNG called ScalaTest.
So while highly functional Scala programs would be the ultimate in testability, it's possible to enjoy the goodness of Scala to make it easier to write tests for new and existing Java code--and to make those tests easier to read, as well.
(see Inset: ScalaTest Your Java Code)
ScalaTest Your Java Code
ScalaTest is modeled after my favorite Ruby test harness, RSpec.
The ability to do testing in the “Behavior Driven Development”
(BDD) style that RSpec allows is important, in my view, because the
resulting “specs” can be easily read and reviewed--a highly
significant feature that is often overlooked when the benefits
of testing are enumerated.
http://www.scalatest.org/
--ScalaTest can be used to test Java classes
--Shows a readable, runnable test/spec in the BDD style
http://www.scalatest.org/quick_start
--Mentions that it works with JUnit and TestNG
--Mentions that the BDD style shown on the first page is a "trait" that can be enabled or disabled
--Mentions the ability to do functional and integration testing, as well
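Here's a minimal sketch of what such a spec can look like (my example, using the ScalaTest 3.x class names; earlier versions spelled them differently), exercising a plain Java class:

import org.scalatest.flatspec.AnyFlatSpec
import org.scalatest.matchers.should.Matchers

class ArrayListSpec extends AnyFlatSpec with Matchers {
  "A java.util.ArrayList" should "start out empty" in {
    val list = new java.util.ArrayList[String]()
    list.isEmpty shouldBe true
  }

  it should "report its size after adds" in {
    val list = new java.util.ArrayList[String]()
    list.add("one")
    list.add("two")
    list.size shouldBe 2
  }
}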
Scala Features
Having shown that functional programming is a valuable tool for a variety of problems, and having shown that it can be accomplished in Java, Dick Wall went on to show how much easier and simpler the code was in Scala. (It was too far away to read, but there was less of it!)
He cited these advantages of Scala, in particular. (He didn't have time to go into them, so the comments that follow are mine.)
Both Object-Oriented and Functional
It's not Lisp. And it's not Java. It's a really nice blend that lets you choose the approach you need for the problem at hand.
Fully integrated with Java.
Import Java libraries with virtually identical syntax. Use them the same way. Compile to Java classes.
Tuples
Tuples are a highly useful construct, most frequently used to return multiple values from a function. Apparently they are so useful that NetBeans has no fewer than 13 different varieties of typesafe Tuples.
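A quick sketch (mine) of a tuple carrying two results out of one function:

def minMax(xs: List[Int]): (Int, Int) = (xs.min, xs.max)

val (lo, hi) = minMax(List(3, 9, 1, 7))
// lo == 1, hi == 9 -- both results come back in one typesafe pair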
Pattern Matching
Pattern matching is built in. That's handy, because it's something you use constantly. And having it built in keeps the syntax clean.
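A tiny sketch (my example) of what the built-in matching looks like:

def describe(x: Any): String = x match {
  case 0               => "zero"
  case n: Int if n < 0 => "a negative int"
  case s: String       => "a string of length " + s.length
  case (a, b)          => "a pair: " + a + ", " + b
  case _               => "something else"
}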
Closures
Closures are a marvelous thing. I got used to them in Ruby. You pass a block of code to a list, and it gets executed on every member of the list. For example: someList.each {|item| your code here}. Forget loops. You focus on the logic you want to execute for each item in the list--not the loop-processing logic.
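The Scala equivalent of that Ruby snippet (my sketch) is nearly identical:

List(1, 2, 3).foreach { item => println(item * 2) } // your code here, run per item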
DSLs
Scala makes it easy to write DSLs--both internal and external, using slightly different approaches. Any single-argument method can be used as an infix operator, methods can be named using symbols, and parentheses and dot-operators are optional. So you can write object.>=('a'), or object >= 'a', where >= is defined in whatever way makes sense, using the same pattern you use for other methods: def >=(arg: Char): Boolean = {...}
Operator overloading
The >= function can be defined using speed for sports cars, hauling capacity for trucks, and weight for Sumo wrestlers.
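A sketch (mine, with made-up classes) of >= meaning different things on different types:

class SportsCar(val topSpeed: Int) {
  def >=(other: SportsCar): Boolean = topSpeed >= other.topSpeed // compares speed
}

class Truck(val capacityKg: Int) {
  def >=(other: Truck): Boolean = capacityKg >= other.capacityKg // compares hauling capacity
}

// new SportsCar(300) >= new SportsCar(250) // true
// new Truck(5000) >= new Truck(8000)       // false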
Traits
A trait is like a class. It has fields and methods. But while it can't be instantiated, it can be mixed into any class that needs those operations.
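A minimal sketch (my example) of one trait mixed into two unrelated classes:

trait Honker {
  def honk(): Unit = println("Honk!") // shared fields and methods live here
}

class Car extends Honker
class Boat extends Honker

// new Car().honk() and new Boat().honk() both print "Honk!"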
Case Classes
This is a fascinating capability that I need to explore more, but early indications are that I'm going to love it. A "case class" is a class the compiler equips for pattern matching. Define a family of case classes extending a common base type, and the processing code can ask, "what kind of myself am I?" and do the right thing, depending on the answer. To make it concrete, it would be possible to define a Vehicle type with case classes for car, boat, motorcycle, bicycle, and truck. Common methods like slow down and accelerate could be shared, but they could behave differently at times, too. (That is highly unlikely to be the best possible example. But it seems reasonable at the moment.)
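Here's a sketch of that Vehicle idea (mine, with hypothetical names): a family of case classes whose processing code asks "what kind of myself am I?" via pattern matching:

sealed trait Vehicle
case class Car(topSpeed: Int)     extends Vehicle
case class Boat(draftM: Double)   extends Vehicle
case class Bicycle(gears: Int)    extends Vehicle

def accelerate(v: Vehicle): String = v match {
  case Car(speed) => "zoom toward " + speed
  case Boat(_)    => "throttle up slowly"
  case Bicycle(_) => "pedal harder"
}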
Implicits
Scala's answer to Ruby's open classes. The ability to specify a wrapper type for a class, and automatically convert references to the original type into instantiations of your wrapper type. So if you create SuperString, all references to String s are autoconverted to SuperString(s). Too cool. Apparently, it also works when a String argument is passed to a method that specifies SuperString as the parameter type.
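A sketch of that SuperString idea (the names are mine, not from the talk):

object SuperStringDemo extends App {
  import scala.language.implicitConversions

  class SuperString(val s: String) {
    def shout: String = s.toUpperCase + "!" // the "extra" behavior
  }
  implicit def stringToSuper(s: String): SuperString = new SuperString(s)

  println("hello".shout) // HELLO! -- the String was silently wrapped
}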
Also: Default values for missing arguments. Another useful feature that comes in surprisingly handy when constructing DSLs.
And those are just the features he mentioned! Once I started looking around, I came across quite a few more:
Built-in XML handling
Given the prevalence of XML for data storage and interchange, that is a way good thing.
Built-in BNF handling
Great for building external DSLs.
All the cool types: List, HashMap, TreeMap, LinkedHashMap, Queue, Set, Stack, Tuple.
Rich Tree Processing mechanisms.
TreeMap and Case Classes, for example, which turn out to be terrific for syntax trees.
Function "currying"
If x(a,b) is a function, then x(a,b,c) must mean that x(a,b) returns a function, so apply the returned function to c.
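A quick currying sketch (mine, in Scala 2 syntax):

def add(a: Int)(b: Int): Int = a + b // two parameter lists

val addTen = add(10) _ // partially applied: a function still awaiting b
// addTen(5) == 15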
List comprehensions
(Something for the Pythonistas--algorithmically generated lists. You specify the algorithm, you get a list.)
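Scala's version is the for-comprehension; a sketch of mine:

val evenSquares = for (n <- 1 to 10 if n % 2 == 0) yield n * n
// Vector(4, 16, 36, 64, 100) -- you specify the algorithm, you get a list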
So What's Missing in Scala?
With all of that functionality in place, what exactly is Scala missing? So far, I've
only found a few things:
Missing-method Handlers
Many Ruby DSLs do a lot of their processing when you invoke a
method that isn't there. You add a handler and "do the right
thing," using the name of the method as your guide--along with
any block of code that was passed as an argument, a block that
can easily contain calls to other methods that aren't there.
For example, that kind of handler makes it easy to create a "language" that generates XML. You called foo() and there is
no method by that name? That must mean you should start an XML element
by that name, so output <foo>. Now execute the code-block
that was passed to foo(), if any, and generate anything you find
in there. When you're done doing all of that, generate </foo>.
Voila! Instant XML generation without having to pre-define
a method for every tag you might want to generate.
Open Classes
(Not a missing feature. Covered by Scala implicits. My thanks to the
folks who clued me in.) Open classes let you add features at runtime.
That feature is kind of like a chain saw, actually. It's really
powerful. But it can also rip your leg off. And it's tricky to
maintain. (What is that class doing *now*?) It's the kind of
thing that would be driving programmers nuts by the end
of a 20-year maintenance cycle. But it's a godsend when the
developers of the original library weren't prescient enough
(or didn't have the time) to develop everything you need.
Pattern-Matching Operator
Scala has a cool match/case mechanism that looks like the Java
switch statement, only with regular expressions. But so far, I
haven't seen the simple match operator common to Icon and Ruby.
In Ruby, the expression syntax is: someString =~ /regularExpression/
The expression is truthy if the string matches the regular expression
(it actually returns the match position, or nil otherwise). It's not
as powerful as the match/case statement, but it's convenient and concise.
Of course, Scala makes it pretty easy to define such an operator,
and use it the way you would expect to. (Any single-argument method
can be used as an infix operator, method names can be symbolic, and parentheses and dot-operators are optional. So the syntax comes out exactly the way you want to see it.)
Definition:
import scala.util.matching.Regex

class RegexTest(s: String) {
  // true if the regex matches anywhere in the string (like Ruby's =~)
  def @=(re: Regex): Boolean = re.findFirstIn(s) match {
    case Some(_) => true
    case None    => false
  }
}
Like a true functional language, Scala returns the last expression evaluated. So if the match succeeds, true is returned. Otherwise, false.
Usage:
new RegexTest(someString) @= "regularExpression".r
(Scala has no /regex/ literals; calling .r on a string produces a Regex.)
You then use an implicit conversion (like the "implicit SuperString" definition described earlier) to make the operator available on every String:
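A sketch (mine) of that implicit definition:

import scala.language.implicitConversions
implicit def stringToRegexTest(s: String): RegexTest = new RegexTest(s)

// "hello world" @= "wor.d".r // true -- every String now has the operator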
Bottom Line
Ok. Scala is missing a couple of Ruby's cool features. But it's not missing all that much. And it has a lot going for it. (For other takes on the subject, see the Inset.)
"10-100 times faster...(with) built-in syntactic support for XML"
"Scala makes it easy (and intuitive) to write functional style code."
That's important because "Much of Ruby's beauty and productivity comes from its functional side"
"Much better facilities for writing DSLs. Scala's syntax is more flexible and expressive that Ruby's"
"Scala's parser library lets you have BNF-style expressions
right in your code"
"Implicits are perhaps the biggest powerhouse towards designing
user friendly DSLs. (They) offer a safe way to extend existing
abstractions. Implicits in Scala is perhaps one of the best
examples of a meaningful compromise in language design between
uncontrolled open classes of Ruby and the sealed abstractions
of Java...."
"...And the best part is that the entire extension...is lexically
scoped and will only be available within the scope of the
implicit definition function."
http://escalatesoft.com
A venture partnered by Dick Wall and Bill Venners,
devoted to teaching Scala. They have free intro sessions
from time to time. (Subscribe to their RSS feed to find out
about them.)
In a nutshell, IMO, the beauty and features of the actual language will not matter that much. Effective and simple integration with affordable GPUs will. If, for example, Fantom does CUDA better than Scala, Fantom will win. Geez, if FORTRAN does CUDA better than Scala, FORTRAN may make a comeback. :-)
> The problem then becomes the lack of open classes. If you go to the trouble of defining the operator, you want to apply it to *every* string, because you don't know in advance when you're going to need it. And you don't want to retrofit every string to a subclass when you find out that you do.
Use implicits - specifically, read about how to "pimp" classes.
case class PimpedString(s: String) {
  def =~(regex: String): Boolean = ...
}
implicit def string2pimpedString(s: String) = PimpedString(s)
Well... "Missing-method Handlers" and "Open Classes" can hardly be labeled as "omissions" in the Scala language -- Scala is a statically-typed language, and both features you mentioned are possible only in dynamically-typed languages.
> Dick's list of situations that benefit from the functional approach included: Parallelism... Science and Math Applications... Lots of nested if's, Massive for-loops, Monster methods, Flags for tracking state... Anything other than e-commerce.
Parallelism can be achieved just as easily in object oriented languages by using the Active Object pattern.
Functional decomposition can exist in procedural and object oriented code as well. It can even exist in assembly.
Tracking state (Lots of nested if's, Massive for-loops, Monster methods, Flags for tracking state) is bad programming and can happen in functional code as much as in procedural code. It has nothing to do with whether the code is functional or not.
All the mainstream imperative languages have functions, so they are suitable for encoding mathematical functions in them.
Let's not get curried away with what functional programming can do. It has its advantages, but let's not twist reality here.
> If I understand correctly, most of the use cases for Open Classes can be achieved using implicits in Scala.

That definitely appears to be the case. My whirlwind introduction to Scala missed that. I misunderstood the term to mean default parameter values. The example posted earlier in this thread made me realize that they are WAY more useful than that. So I'm pretty ecstatic. (I also need to correct the article.)
[Comment on another thread: Of course Scala is static and Ruby is dynamic, and that's why Ruby can do a couple of additional things that are interesting. My point was about available functionality, not implementation methodology. I'm certain that Ruby programmers will be interested to learn that (a) there isn't much missing and (b) what those things are. Combined with the earlier discussions on readability, maintainability, and tooling, the tradeoff looks pretty favorable.]
> I misunderstood (implicits) to mean default parameter values.

Searching on the term, I find the first several listings are all about implicit parameters. So I didn't misunderstand. But a few items down, along comes this beautiful post that shows Ruby hackers how to do implicit type conversions in Scala. Sweet! I am impressed. http://paulbarry.com/articles/2009/04/17/implicit-conversions-scalas-type-safe-answer-to-rubys-open-class
> Use implicits - specifically, read about how to "pimp" classes.
Implicits != open classes at all. You can create a new view of an existing object, but you can't extend its behavior or state using implicits. Scala has (or had) other ways of encoding open classes, but implicits are not the way.