Summary
Java has many traditional control structures, like if and for. There are also proposals, BGGA, JCA, and ARM, to add more. This post examines what new structures might be of interest and suggests that more traditional control structures would add little and instead parallel-functional structures would be more useful.
Advertisement
This post is primarily about the semantics of control structures and inner classes/closures, but also discusses some syntax aspects. Traditional control structures are constructs like if and for that are found in many languages, for example Java's are almost identical to C's. Traditional control structures are characterised by:
Not having a return value; they are statements rather than expressions.
The code blocks to be executed do not have arguments; instead new or existing variables are used (e.g. in "for ( int i = ... )" i is a new variable)
Sequential in nature; for evaluates each loop in turn.
The only non traditional control structure that Java has is ?:, which does return a value. Proposals to extend Java like BGGA and FCM + JCA suggest adding the ability to write more traditional control structures. In fact this is a major driver for the BGGA proposal which ends up limiting the power of a BGGA closure itself so that it can be used in a traditional control structure and also means that two types of closure are required, a restricted closure and an unrestricted closure. There may be a case for some extra traditional control structures, e.g. a for each loop with an index or Automatic Resource Management (ARM), but I say that traditional structures are already well covered.
Other newer languages emphasise parallel control structures and control structures that return a value, e.g. Fortress. Newer languages emphasise parallel-functional programming because this is easier on multi-core machines that are now commonplace. What I would like is the ability to write these new control structures. The inner class construct provides the ideal basis for writing parallel-functional structures. Consider a forEach loop that has an index, executes each loop in parallel, and returns the results as a map associating index with returned value.
public static abstract class ForEach<O, K, I> {
protected static final Exception CONTINUE = new Exception( "forEach Continue" );
static { CONTINUE.setStackTrace( new StackTraceElement[ 0 ] ); } // Delete irrelevant stack tracepublic abstract O block( K index, I in ) throws Exception;
Callable<O> aLoop( final K index, final I in ) {
return new Callable<O>() {
public O call() throws Exception { return block( index, in ); }
};
}
}
public static <O, K, I> LinkedHashMap<K, O> forEach( final ExecutorService pool,
final Map<K, I> in,
final ForEach<O, K, I> block ) {
final int size = in.size();
final Map<K, Future<O>> futures = new LinkedHashMap<K, Future<O>>( size );
for ( final Map.Entry<K, I> entry : in.entrySet() ) { // Execute in parallelfinal K index = entry.getKey();
final Callable<O> oneLoop = block.aLoop( index, entry.getValue() );
final Future<O> future = pool.submit( oneLoop );
futures.put( index, future );
}
final LinkedHashMap<K, O> results = new LinkedHashMap<K, O>( size );
for ( final Map.Entry<K, Future<O>> entry : futures.entrySet() ) { // Collect resultsfinal K index = entry.getKey();
try {
results.put( index, entry.getValue().get() );
} catch ( ExecutionException e ) {
final Throwable cause = e.getCause();
if ( cause != ForEach.CONTINUE ) { // Ignore CONTINUE otherwise abortfor ( final Future>O> f : futures.values() ) { f.cancel( true ); }
throw new IllegalStateException( "Exception thrown when evaluating forEach block index " +
index + " of value " +
in.get( index ), cause );
}
} catch ( CancellationException e ) { // Ignore cancelled result
} catch ( InterruptedException e ) {
Thread.currentThread().interrupt(); // Restore the interrupted status
}
}
return results;
}
Which is executed like this:
final int numProc = 2 * Runtime.getRuntime().availableProcessors();
final ExecutorService pool = Executors.newFixedThreadPool( numProc );
final String text = "*Hello*";
final int size = text.length();
final Map<Integer, Character> input = new LinkedHashMap<Integer, Character>( size );
for ( int i = 0; i < size; i++ ) { input.put( i, text.charAt(i) ); }
finalForEach<Character, Integer, Character> trim = newForEach<Character, Integer, Character>() {
public Character block( final Integer index, final Character in ) throws Exception {
if ( index <= 0 || index >= size - 1 ) { throwCONTINUE; }
return in;
}
};
final Map<Integer, Character> output = forEach( pool, input, trim );
out.println( output );
pool.shutdownNow();
and gives the expected result:
{1=H, 2=e, 3=l, 4=l, 5=o}
The important thing demonstrated is that inner classes have the ideal semantics for parallel-functional structures, but not the ideal syntax (see below). The semantics of the BGGA style closures are not ideal for this type of control structure because you need to use a restricted closure (not an unrestricted) and you have to be careful not to inadvertently box this closure inappropriately (e.g. into a function type). The scope of variables etc. is not ideal with a BGGA style closure either. This is not to say that something similar cannot be written in BGGA, just to say that it will be worse than what we can currently write. This begs the question of why add a BGGA style closure when the existing inner classes are better. The BGGA and JCA proposals also have further syntax support for calling control structures, but unfortunately this can only be used with traditional control structures.
As mentioned above, there are some syntax improvements that can be made in terms of calling the new control structure, forEach. The declaration of the forEach method itself could also be shorter; but since the method is called more often than it is written, this is less important. The inner class trim, that is the block of code executed in parallel by forEach, is the main area where syntax improvements can be made. All the inner class/closure proposals, C3S (this is my suggestion!), CICE, BGGA, and FCM, all provide shorter syntax for inner classes/closures, e.g. using C3S syntax the call to forEach and the inner class trim could be written in one line as:
final Map<Integer, Character> output = forEach( pool, input, method( index, in ) {
if ( index <= 0 || index >= size - 1 ) { throw CONTINUE; }
return in;
} );
The primary syntax support provided by C3S is the keyword method that makes an instance of an anonymous class and overrides the only abstract method in its super class and also infers the super-class type, method to be overridden, argument types, return type, and exception types. This short syntax for defining an inner-class instance helps a great deal with the readability of the code and I think some form of short syntax for inner classes would be a useful addition to Java.
C3S also provides further, secondary, syntax support and the line can be written as:
final output = forEach pool, input, method( index, in ) {
if ( index <= 0 || index >= size - 1 ) { throw CONTINUE }
in
};
This secondary syntax support provided by C3S is:
Infers the type of output
Brackets, (), are not needed for method calls provided that it is not ambiguous.
The semicolon before a close brace, }, is not needed.
return is optional at the end of a method.
I think this extra secondary support for inner classes and method calling is worth having, but the gain is not as great as that provided by short syntax for inner-class instances.
This post has demonstrated that the ideal semantics for new control structures are those of the inner class and that the BGGA and JCA proposals do not give us a useful new construct because they are biased both in terms of their closure semantics and syntactic support to traditional control structures. Java has a good selection of tradition control structures and I contend that what is really needed is the ability to write parallel-functional control structures; further I suggest that inner classes with shorter syntax are ideal for this purpose, perhaps with further short syntax for calling these structures. What do you think?
If java just had a decent closure syntax, there would be no need for parallel control structures. The operations could simply be expressed as library functions:
The only new control structure that java needs is a RAII-like 'using' statement (http://en.wikipedia.org/wiki/Resource_Acquisition_Is_Initialization) which is useful in a GC'd language for ensuring that resources with a stack-delimited lifespan are cleaned up properly. That would eliminate a lot of try/finally boilerplate.
I don't think that a normal closure is ideal. The lexical scope is too limited. In particular you cannot inherit; in the example I inherits from ForEach. Therefore inner classes are a superior construct.
That would be a map function that executes in parallel and retains the order of the original map handles exceptions and implements continue. I suggest the amount of code is quite modest for that functionality.
> @Carson, > > I don't think that a normal closure is ideal. The lexical > scope is too limited. In particular you cannot inherit; in > the example I inherits from ForEach. Therefore inner > classes are a superior construct.
_shrug_
I don't think there is much value to inheritance in most algorithmic decomposition: you just have a little flit of logic somewhere in a loop that you want to generalize (map, sortBy, filter, etc.) and there is no point in having much lexical structure around it beyond the normal semantics closures provide (variable capture from the local scope + parameters of the closure.)
Therefore closures are a superior construct. (<-- Am I doing this right?)
Additional control structures add complexity to a language and need one hell of a justification to get in. I don't see much above, given that closures can perform the same job, but tastes may vary.
What ever you can do with a closure you can do with an inner class, they both capture the enclosing scope. The inner class also captures the inherited scope, therefore you can do more with an inner class. In a strictly mathematical sense, rather than in a in my opinion sense, therefore an inner class is more powerful than a closure.
You see this in the example were ForEach has members CONTINUE and aLoop that are used by the inner class (block trim) and by the forEach method. Sure you could make CONTINUE and aLoop public top level constructs in a name space of their own, e.g. in a class called ForEachExtras, but then you have to refer to them as ForEachExtras.CONTINUE, etc. Yes it can be done, but it isn't as good!
What is more, Java already has inner classes, so why add something that is less powerful and behaves differently than the existing construct. At best adding closures will be confusing and at worst it will be an endless source of hard to find bugs.
Maybe I am wrong. Maybe you can do better with closures. Can you suggest some code? The example was trivially easy to write, it took about 1/2 an hour, so it is not too much to ask for a closures version.
> Maybe I am wrong. Maybe you can do better with closures. > Can you suggest some code? The example was trivially easy > to write, it took about 1/2 an hour, so it is not too much > to ask for a closures version.
Well, I don't find your example particularly compelling, so let's take an example I do: you want to map an ArrayList of employees to their names. Here's the closure version:
return _employees.map( \ emp -> emp.Name )
And, if you really want this to be parallelized, here's the parallel version:
That took me about 10 seconds to write, and I'm pretty stupid.
(I'm, of course, cheating and using GScript again, which is our internal JVM language and I'm inventing the asParallelList() method, which should and will exist as soon as Doug Lea's parallel data structures make it into the JVM.)
But you haven't shown anything! Your examples would be exactly the same with inner classes (assuming inner classes had the same syntax as your closures and the classes had the same methods).
How about coding the example I gave which included a continue and exception handling. You can code against the same libraries that I used - they are already in J6. You got my code as a starting point!
This style using an outer class, ParallelMapper, is a good idea for a BGGA solution since it lets you use inner classes to control the scope. But I would contend that this is more complicated than just using inner classes and not as general as the original since it only supports whereKey and not a general continue mechanism.
So having wrote a very good BGGA approach, do you think it has advantages over the inner class approach?
I think that closures are good where you need to supply small pieces of code to specialize/parametrize bigger algorithms written in the form of libraries. They take the burden out of the library user and put it on the library writer. I agree that writing a good library, where it's public API is expecting closures, is a demanding task. Using such a library is where closures shine!
As far as the original example goes, I would prefer something like this which eliminates the static method and uses composition instead of inheritance. I think it makes the code a lot cleaner and easier to follow / use.
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
publicclass ForEach<O, K, I>
{
publicfinalstatic Exception CONTINUE = new Exception("forEach Continue");
static
{
CONTINUE.setStackTrace(new StackTraceElement[0]);
}
privatefinal ExecutorService pool;
privatefinal Block<O, K, I> block;
public ForEach(ExecutorService pool, Block<O, K, I> block)
{
this.pool = pool;
this.block = block;
}
public ForEach(Block<O, K, I> block)
{
finalint numProc = 2 * Runtime.getRuntime().availableProcessors();
this.pool = Executors.newFixedThreadPool(numProc);
this.block = block;
}
private Callable<O> aLoop(final K index, final I in)
{
returnnew Callable<O>()
{
public O call() throws Exception
{
return block.get(index, in);
}
};
}
publicvoid cleanUp()
{
pool.shutdownNow();
}
public LinkedHashMap<K, O> execute(final Map<K, I> in)
{
finalint size = in.size();
final Map<K, Future<O>> futures = new LinkedHashMap<K, Future<O>>(size);
for (final Map.Entry<K, I> entry : in.entrySet())
{
final K index = entry.getKey();
final Callable<O> oneLoop = aLoop(index, entry.getValue());
final Future<O> future = pool.submit(oneLoop);
futures.put(index, future);
}
final LinkedHashMap<K, O> results = new LinkedHashMap<K, O>(size);
for (final Map.Entry<K, Future<O>> entry : futures.entrySet())
{
final K index = entry.getKey();
try
{
results.put(index, entry.getValue().get());
}
catch (ExecutionException e)
{
final Throwable cause = e.getCause();
if (cause != CONTINUE)
{
for (final Future<O> f : futures.values())
{
f.cancel(true);
}
thrownew IllegalStateException("Exception thrown when evaluating forEach block index "
+ index + " of value " + in.get(index), cause);
}
}
catch (CancellationException e)
{
/* Ignore cancelled result */
}
catch (InterruptedException e)
{
/* Restore the interrupted status */
Thread.currentThread().interrupt();
}
}
return results;
}
publicstaticvoid main(String[] args)
{
final String text = "*Hello*";
finalint size = text.length();
final Map<Integer, Character> input = new LinkedHashMap<Integer, Character>(size);
for (int i = 0; i < size; i++)
{
input.put(i, text.charAt(i));
}
final Block<Character, Integer, Character> trim = new Block<Character, Integer, Character>()
{
public Character get(final Integer index, final Character in) throws Exception
{
if (index <= 0 || index >= size - 1)
{
throw ForEach.CONTINUE;
}
return in;
}
};
ForEach<Character, Integer, Character> forEach = new ForEach<Character, Integer, Character>(trim);
final Map<Integer, Character> output = forEach.execute(input);
forEach.cleanUp();
System.out.println(output);
}
}
interface Block<O, K, I>
{
O get(K index, I in) throws Exception;
}
We are talking past one another a bit right now. You are talking primarily about the implementation side. I was looking at your function from the user side, and it looks pretty awful to me.
I'm pretty indifferent towards the implementation side: smart kids like Doug Lea can make due with the tools available to them today to implement parallel data structures. This goes back to the original point: there isn't any need for syntactic support to help them.
What is necessary is decent closure support to make accessing the parallel behavior of these data structures painless for the end users. Average java developers aren't going to need a slew of new parallel control structures if they have easy to use parallel map, sort, filter, etc methods easily accessible off of their data structures. By composing these parallel methods together, average developers can take advantage of parallelism in a syntactically painless manner when it makes sense.
You want to write parallel algorithms. I want to take advantage of already written parallel algorithms because I'll screw writing them up.
> Howard, > > We are talking past one another a bit right now. You are > talking primarily about the implementation side. I was > looking at your function from the user side, and it looks > pretty awful to me.
I tried to address that in my revision of the initial code. It's equivalent other than moving to a more OO approach.
> You want to write parallel algorithms. I want to take > advantage of already written parallel algorithms because > I'll screw writing them up.
I don't see that in his example but maybe you are getting that from somewhere else. All the tough parts handled by the thread pool and other classes from the JDK.
Flat View: This topic has 32 replies
on 3 pages
[
123
|
»
]