Summary
In a previous weblog a suggestion for reifying lambda types was given. The previous weblog give examples but no formal description of the process. This weblog gives a more formal description.
In a previous weblog a proposal for reified lambdas was given, http://www.artima.com/weblogs/viewpost.jsp?thread=277879. This weblog builds on the previous weblog and assumes familiarity with the previous weblog. Where the details present in this weblog contradict details in the previous weblog, this one supersedes the previous. In effect this weblog is an improved version of the previous.
Expanding on the previous weblog was suggested by Neal Gafter (http://mail.openjdk.java.net/pipermail/lambda-dev/2010-January/000218.html). This weblog is deliberately written in a 'dry' style and doesn't introduce lambdas or justify reification of them, other than section 1.1 (See previous weblog for an 'easier' read and discussions of these topics.)
The writing style is not saying 'we could do X', instead it says 'do X', this is purely a writing convenience and is not meant to imply that this is in any way an accepted proposal. The proposal is simply a proposal and has no formal backing from Sun (or anyone else for that matter).
The text below is not strictly a formal description of the processes, but it is hoped that it is sufficiently detailed to tease out any problems. Since the aim of this post is to tease out problems, feedback is most welcome.
At this point in time the following issues are not addressed, they are open issues, and a brief note about there status given. These open issues are put into three groups of ones that are specific to this specification, ones that might influence the choice of one implementation over another, and ones that are largely orthogonal to the implementation. This categorization is open to interpretation and you could argue about the placement of an issue within in a particular group.
In this weblog simple type names are used, e.g. Integer, in practice a fully
qualified
name is required, e.g. java.lang.Integer. Unfortunately you cannot have '.'
in a type name, therefore java\,lang\,Integer could be used instead.
Similarly array types need to be specified and '[' and ']' are also
problematic in type names and therefore int[] could become int\{\}.
These character translations, and more, are suggested by John Rose
(http://blogs.sun.com/jrose/entry/symbolic_freedom_in_the_vm).
'_', which I have used to separate parts of type names, is not
appropriate because it is already used for inner classes; in practice you could
use '+' instead (to indicate an automatically generated/added type).
Thanks to Maurizio Cimadamore
(http://mail.openjdk.java.net/pipermail/lambda-dev/2010-January/000258.html)
for reminding me of the NextGen compiler that uses a similar
solution and to Alex Blewitt (private email) for specifically pin pointing the problem.
Contravariantly-overriding wrapping
To support type safe operations like #Object(String)sO=#String(Objects)(s.toString());
lambdas are automatically wrapped (also see below but one for a separate use of contravariantly-overriding wrapping).
However Java doesn't ordinarily support contravariantly-overriding, e.g.
interfaceOS{Objectm(Strings);}...interfaceSOextendsI{@OverrideStringm(Objecto);}
is illegal. Should contravariant-overriding be supported?
Contravariantly-overriding arrays of lambdas
As the specification stands you can write
#Object(String)[]sOs=new#String()[]{#String(Objects)(s.toString())};.
However in Java you can't say long[]ls=newint[]{1};. Should this be allowed for lambdas?
Lambda cast
The contravariantly-overriding wrapping is also used in a similar manner to promoting-conversions applied automatically
to primitives, e.g. promoting int to long. For primitives there is also a reverse operation via a down-cast, e.g.
i=(int)l;. Currently there is no equivalent down-cast, there should be!
Cannot serialize a lambda, use a named-inner class.
(Serialization is hard to do for something anonymous and therefore best not supported.)
Class implementing a lambda or extending a lambda
Cannot write classCIimplements#int(int) or classCEextends#int(int).
(This is closely related to the next point, SAMs, and can probably be handled
by the same mechanism.)
Single abstract method (SAM)
Cannot write Comparable<Integer>cI=#int(intother)(...);.
(I think the From mechanism, described below, can probably be adapted for SAMs.
It is desirable that a lambda can be wrapped as a SAM and it is desirable that it can
be unwrapped, not supporting unwrapping is like boxing without unboxing!)
It would be nice if a lambda could be called like a method, e.g. lambda(...);
in this proposal no short-syntax is discussed and the example would be
lambda.call(...) for a non-primitive lambda and lambda.callPrimitive(...)
for a primitive-lambda.
(This may well be the final solution, since a method of calling lambdas
with short syntax has yet to be found that does not have undesirable side-effects.)
Lambda syntax
The strawman syntax without type inference is used throughout this specification.
Optimizing lambda creation
If a lambda is used in an inner scope it is natural to define it in an inner scope,
however this can be sub-optimal. For example if the lambda does not capture local
variables or instance fields it could be defined statically. The compiler could,
'promote', the lambda to an outer-scope static automatically.
(Probably OK to leave as is at the moment, since the same comments apply to
inner classes and it is up to the programmer to do the 'promoting'.)
Shared writable variables
The strawman discusses the possibility of having shared writable variables,
this proposal does not address this issue since it is orthogonal to the implementation.
(Could use the Wrapper mechanism described below to allow the variables
to be shared, which would require the Wrapper interface to be generic (which
is not a problem).)
Exception transparency
Cannot be used, must wrap in an unchecked exception.
(Probably OK to leave this out - i.e. it is OK to wrap checked exceptions in
an unchecked exception.)
The following types are added by the class loader to java.lang as needed (the compiler internally also generates these classes but does not emit these classes). The reason these classes are dynamically generated by the class loader is twofold:
To prevent an 'explosion' of types that results from reifying lambdas. (If lambdas are erased then only the generic interfaces are needed and up to 5 arguments, say, could be supported directly as public interfaces java.lang and hence dynamic generation would be unnecessary.)
To ensure that all lambdas share the same underlying classes and are hence interoperable. (If the compiler emitted the types then each compilation unit would have its own lambda types and hence not interoperate.)
I haven't spelled out how the loader (or compiler) generates the required classes in detail, but two short paragraphs are given in the next two sub-sections on this topic. The reasons for not spelling this out is to save space (it is reasonably obvious what to do). The processes is to generate the type from its name, the name encodes all the information needed to generate the type.
Generic interfaces are used when any types involved are generic and as a base type for the reified types. These interfaces are interfaces that define a generic-abstract call method with n arguments and follow the naming convention _Callable_n, e.g.:
public interface _Callable_0<R> { R call(); }
public interface _Callable_1<R, A1> { R call(A1 a1); }
When the loader sees in a class file the interface _Callable_n it first checks to see if it already has this interface and if not generates the appropriate interface. The name _Callable_n gives the loader all the information it requires (which is simply n, the number of arguments). The name _Callable_n is unique, since other _Callable_ types (described below) do not contain the integer n since an integer is not a valid type name.
The following description of dynamic classes uses '...' to indicate part of a name that is dependent upon the type of the lambda, e.g. #Object(String) has types Object and String and '...' is replaced by Object_String.
When the loader sees in a class file the class _Callable_... it first checks to see if it already has this class and if not generates the appropriate abstract class. The name _Callable_... gives the loader all the information it requires (in particular the '...' part encodes the reified types needed). The reified _Callable_... names are distinguished from generic _Callable_n because n (an integer) is not a type name.
Reified, non-primitive, abstract classes are used when there are no generic types and no primitive types. These classes are abstract classes that define a reified-abstract call method via implementing the equivalent generic _Callable_n classes with the following naming convention _Callable_..., e.g.:
public abstract class _Callable_String extends AbstractCallable implements _Callable_0<String> {}
public abstract class _Callable_Object_String extends AbstractCallable implements _Callable_1<Object, String> {}
Note the extension of AbstractCallable class which defines an equals method, the reason for this is that some of the lambdas get wrapped for type conversion purposes (see below for more details).
Reified, primitive, abstract classes are used when there are no generic types and some primitive types. These classes are abstract classes that define an reified-abstract callPrimitive method via extending the equivalent non-primitive _Callable_... classes with the same naming convention as non-primitive (_Callable_...), e.g.:
public abstract class _Callable_int extends _Callable_Integer {
@Override public final Integer call() { return callPrimitive(); }
public abstract int callPrimitive();
}
public abstract class _Callable_int_String extends _Callable_Integer_String {
@Override public final Integer call(String a1) { return callPrimitive(a1); }
public abstract int callPrimitive(String a1);
}
public abstract class _Callable_int_int extends _Callable_Integer_Integer {
@Override public final Integer call(Integer a1) { return callPrimitive(a1); }
public abstract int callPrimitive(int a1);
}
Note that the call method is defined to call the callPrimitive method.
There are two variations of From class and the purpose of both these From classes is to 'convert' one lambda into another. Typically this conversion is achieved by wrapping the first lambda inside itself, but it could in the case of a wrapping and then unwrapping just give back the original lambda. Whether a From... class or a _Callable_... class (the original lambda) results, it is always reified lambda that results from a From conversion.
The reified _Callable_... abstract classes implement the equivalent generic _Callable_n interface and therefore no 'cast' is needed from a reified lambda to a generic lambda.
The first of the two variations of the From classes are those that go in the reverse direction to their inheritance hierarchy, i.e. from a generic _Callable_n interface to a the equivalent reified _Callable_... class (reifying From). The second variation of From class is for contravariant overriding of lambdas.
All From classes follow a naming convention: _From_...__To_... where the first '...' are as used in the 'from' _Callable_... and the second '...' are as used in the 'to' _Callable_... names. If the 'from' _Callable_... is a generic interface then the '...' include the _n as well as the type names.
These classes convert a generic _Callable_n interface to a the equivalent reified _Callable_... class, i.e. they reify the generic class, e.g.:
public final class _From_0_Integer__To_int extends _Callable_int implements Wrapper {
private final _Callable_0<? extends Integer> f;
private _From_0_Integer__To_int(final _Callable_0<? extends Integer> f) { this.f = f; }
public static _Callable_int instance(final _Callable_0<? extends Integer> from) {
if (from == null) { return null; }
final _Callable_0<? extends Integer> f = (_Callable_0<? extends Integer>)Wrappers.unWrap(from);
if (f instanceof _Callable_int) { return (_Callable_int)f; }
return new _From_0_Integer__To_int(f);
}
@Override public int callPrimitive() { return f.call(); }
@Override public Object getWrappee() { return f; }
@Override public int hashCode() { return f.hashCode(); }
}
Note use of callPrimitive; because the resultant, 'to', class is a primitive _Callable_..., if 'to' is a non-primitive _Callable_... then method call is defined instead.
These classes convert a reified _Callable_... class to another, contravariantly overriding, reified _Callable_... class.
Primitive types have an implicit conversion mechanism in Java, e.g. an int is automatically converted to a long. It is necessary to simulate this implicit conversion for lambdas that have primitive and/or primitive wrappers. The term "implicit conversion" is used below to indicate when such a non-inheritance based type conversion is required.
If going from the 'from' to the 'to' lambda types does not involve "implicit conversion" then the dynamically created class is for example:
public final class _From_String_Object__To_Object_String extends _Callable_Object_String implements Wrapper {
private final _Callable_String_Object f;
private _From_String_Object__To_Object_String(final _Callable_String_Object f) { this.f = f; }
public _From_String_Object__To_Object_String instance(final _Callable_String_Object from) {
if (from == null) { return null; }
final _Callable_String_Object f = (_Callable_String_Object)Wrappers.unWrap(from);
return new _From_String_Object__To_Object_String(f);
}
@Override public Object call(final String a1) { return f.call(a1); }
@Override public Object getWrappee() { return f; }
@Override public int hashCode() { return f.hashCode(); }
}
Note if the 'from' lambda is primitive then use f.callPrimitive(a1) and if the 'to' lambda is primitive define method callPrimitive.
If "implicit conversion" is required then there are a number of cases to be covered.
Primitive-wrapper to primitive-wrapper for example:
public final class _From_Integer_Long__To_Long_Integer extends _Callable_Long_Integer implements Wrapper {
private final _Callable_Integer_Long f;
private _From_Integer_Long__To_Long_Integer(final _Callable_Integer_Long f) { this.f = f; }
public _From_Integer_Long__To_Long_Integer instance(final _Callable_Integer_Long from) {
if (from == null) { return null; }
final _Callable_Integer_Long f = (_Callable_Integer_Long)Wrappers.unWrap(from);
return new _From_Integer_Long__To_Long_Integer(f);
}
@Override public Long call(final Integer a1) { return f.call(a1.longValue()).longValue(); }
@Override public Object getWrappee() { return f; }
@Override public int hashCode() { return f.hashCode(); }
}
Primitive-wrapper to primitive for example:
public final class _From_Integer_Long__To_long_int extends _Callable_long_int implements Wrapper {
private final _Callable_Integer_Long f;
private _From_Integer_Long__To_long_int(final _Callable_Integer_Long f) { this.f = f; }
public _From_Integer_Long__To_long_int instance(final _Callable_Integer_Long from) {
if (from == null) { return null; }
final _Callable_Integer_Long f = (_Callable_Integer_Long)Wrappers.unWrap(from);
return new _From_Integer_Long__To_long_int(f);
}
@Override public long callPrimitive(final int a1) { return f.call(Long.valueOf(a1)); }
@Override public Object getWrappee() { return f; }
@Override public int hashCode() { return f.hashCode(); }
}
Primitive to primitive-wrapper for example:
public final static class _From_int_long__To_Long_Integer extends _Callable_Long_Integer implements Wrapper {
private final _Callable_int_long f;
private _From_int_long__To_Long_Integer(final _Callable_int_long f) { this.f = f; }
public _From_int_long__To_Long_Integer instance(final _Callable_int_long from) {
if (from == null) { return null; }
final _Callable_int_long f = (_Callable_int_long)Wrappers.unWrap(from);
return new _From_int_long__To_Long_Integer(f);
}
@Override public Long call(final Integer a1) { return Long.valueOf(f.callPrimitive(a1)); }
@Override public Object getWrappee() { return f; }
@Override public int hashCode() { return f.hashCode(); }
}
Primitive to primitive (which is the same as the no "implicit conversion" case since Java automatically converts primitives) for example:
public final static class _From_int_long__To_long_int extends _Callable_long_int implements Wrapper {
private final _Callable_int_long f;
private _From_int_long__To_long_int(final _Callable_int_long f) { this.f = f; }
public _From_int_long__To_long_int instance(final _Callable_int_long from) {
if (from == null) { return null; }
final _Callable_int_long f = (_Callable_int_long)Wrappers.unWrap(from);
return new _From_int_long__To_long_int(f);
}
@Override public long callPrimitive(final int a1) { return f.callPrimitive(a1); }
@Override public Object getWrappee() { return f; }
@Override public int hashCode() { return f.hashCode(); }
}
These classes provide the same functionality as the _From classes above do for individual lambdas but instead operate on arrays of lambdas. All the _Array_From classes create a new array, for the 'to', and convert each element of 'from'. The name of the _Array_From class is the same as the name of the appropriate _From class with '_Array' prepended, e.g:
public final class _Array_From_0_String__To_String {
public static _Callable_String[] instance(final _Callable_0<? extends String>[] from) {
if (from == null) { return null; }
final _Callable_String[] to = new _Callable_String[from.length];
for (int i = 0; i < from.length; i++) { to[i] = _From_0_String__To_String.instance(from[i]); }
return to;
}
}
To keep the weblog shorter I will limit the explanation to lambdas that have a return value and one argument, it is straightforward to extrapolate to a void lambda and down to zero or up to n arguments. The strawman proposal allows the return type of the lambda to be implied. Inferring the return type is a separate issue to how to implement lambdas and therefore the explanations assume the types are specified (similarly the proposed diamond operator is ignored). There are two basic constructs in lambdas:
#$R$($A$)
Which declares a lambda type
#$R$($A$$N$)($E$)
Which creates a new lambda instance
There is a second new-lambda-instance construct in the strawman that allows statements, however it is easy to extrapolate to multiple statements from a single expression and therefore this form is not described.
In the above:
$R$
The return type of the lambda which can be any type: primitive, class, interface,
or generic type. $R$ may not be a wildcard-generic type.
$A$
The argument type of the lambda which can be any type: primitive, class, interface,
or generic type. $A$ may not be a wildcard-generic type.
$A$ may include a final modifier in addition to the type.
$N$
Is the name of the argument.
$E$
The expression that is executed when the lambda is called.
Generic types in lambdas do not include wildcards, i.e. no ?, ?extends$type$, or ?super$type$. If wildards are used as part of the lambda syntax, then it is an error.
From time to time, when explaining the expansion of lambda syntax, other textual place markers are introduced between dollars, e.g. $type$ as used above. If more than one $A$, for example, is needed then an additional letter is added to clarify which $A$ is referred to, e.g. $AL$ might be used to indicate the $A$ on the left hand side of an assignment.
New lambdas are created by the expression #$R$($A$$N$)($E$).
If either $R$ or $A$ are generic types then if either $R$ or $A$ are primitive they are converted to the equivalent primitive-wrapper classes and the expansion is:
new _Callable_1<$R$, $A$>() {
@Overload public final $R$ call($A$ $N$) { return $E$; }
}
Else the expansion is:
new _Callable_$R$_$A$() {
@Overload public final $R$ call($A$ $N$) { return $E$; }
}
EG #$R$($A$)[] $name$ = new #$R$($A$)[$size$];, the LHS is treated like a definition and the RHS like an instance, except that the RHS has special treatment for the generics case.
If either $R$ or $A$ are generic types then an unchecked warning is given, because you cannot create type-safe-generic arrays. The expansion is:
_Callable_1<? extends $R$, ? super $A$>[] $name$ = (_Callable_1<? extends $R$, ? super $A$>[]) new _Callable_1[$size$];
(There is no syntax in the strawman for a raw-type lambda, hence the need for this special expansion.)
Else the expansion is:
_Callable_$R$_$A$[] $name$ = new _Callable_$R$_$A$[$size$];
If either $R$ or $A$ are generic types then an unchecked cast warning is given, because you cannot check generic casts at compile time, but the expansion proceeds and is:
EG $genericType$<$optionalWildcardSyntax$#$R$($A$)> is treated like the LHS of a variable declaration and is always generic and is therefore expanded to:
$genericType$<$optionalWildcardSyntax$ Callable$1<? extends $R$, ? super $A$>>
To facilitate reifying generic lambdas and contravariant-overriding, 'safe' automatic-conversion via loader-generated From classes are supported for assignment and initialization of variables, method parameters, and array elements (for arrays see next section).
A running example is used below:
$lValue$ = $rExpression$
Where the type, after expansion, of $lValue$ is _Callable_$RL$_$AL$ and the expanded-type of $rExpression$ is either _Callable_1<?extends$RR$,?super$AR$> or _Callable_$RR$_$AR$ depending on if it is reified or not. Note $lValue$ is reified else no 'conversion' is added by the compiler (it is not an error or a warning for $lValue$ to be generic, it just means that the compiler does not need to add a 'conversion').
In the description below the phrase $R$ "is convertible to" $L$ is used to indicate that $R$ has is the same as $L$, $R$ is a subtype of $L$, or $R$ is a primitive or primitive-wrapper that can be converted to $L$ which is also a primitive or primitive-wrapper. The automatic conversion of primitives and primitive-wrappers is usual in Java, e.g. int to long.
In summary, the possible actions are:
If the type of $lValue$ is generic, do nothing.
If the type of $rExpression$ is exactly the same as the type of $lValue$,
do nothing.
If the type of $rExpression$ is generic, possibly insert a reifying conversion
(see section 5.3.1).
If the type of $rExpression$ is reified, possibly insert a
contravariant-overriding conversion (see section 5.3.2).
If the the expanded type of $rExpression$ is _Callable_1<?extends$RR$,?super$AR$> (i.e. it is generic) then a reifying cast is applicable (and as described above lValue should be reified).
If $RR$ or $AR$ are generic-parameter names rather than specific-type names then report an unchecked cast warning and substitute the limits of the parameters into $RR$ and $AR$ respectively, instead of the parameter names, and continue as below.
If $RR$ "is convertible to" $RL$ and if $AL$ "is convertible to" $AR$ then the expansion is:
If the the expanded type of $rExpression$ is _Callable_$RR$_$AR$ (i.e. it is reified) then a reified-to-reified cast is applicable (and as described above lValue should be reified).
If $RR$ is the same type as $RL$ and if $AL$ is the same type as $AR$ then do nothing (no cast necessary).
If $RR$ "is convertible to" $RL$ and if $AL$ "is convertible to" $AR$ then the expansion is:
To facilitate reifying arrays of generic-lambdas and contravariant-overriding of lambda arrays, 'safe' automatic-conversion via loader-generated Array_From classes are supported for assignment and initialization of array variables, method parameters, and array-array elements (for none arrays see previous section).
In the following:
$lValue$ = $rExpression$
The expanded type of:
$lValue$
Is _Callable_1_$RL$_$AL$[] (generic) or _Callable_$RL$_$AL$[] (reified)
$rExpression$
Is _Callable_1_$RR$_$AR$[] (generic) or _Callable_$RR$_$AR$[] (reified)
If $RR$ is $RL$ and if $AR$ is $AL$ then no expansion is necessary.
Else if $lValue$ generic and $RL$ "is convertible to" $RR$ and if $AR$ "is convertible to" $AL$ then no expansion is necessary.
Else if ($RR$ "is convertible to" $RL$ and if $AL$ "is convertible to" $AR$) or
($RL$ "is convertible to" $RR$ and if $AR$ "is convertible to" $AL$) then the expansion is one of:
Reifying lambdas is not particularly straightforward, however it offers improved performance over some erased lambdas schemes and a smoother joining of lambdas to the rest of Java. A high-performance lambda implementation, e.g. reified lambdas, makes ParallelArray worth while. Without supporting high-performance lambdas you might well argue that provide a series of generic Callable interfaces and a short inner-class syntax is sufficient.
Are there any use cases were the above proposal is not ideal? (Other than the already justified, in previous weblog, wrapping for contravariant-overriding conversions.)
I dunno man. I vote we just have a single interface per arity (0 through 255, which is the java limit, IIRC) that takes all object args and returns an object value. Then we box/unbox in the primitive case and work out some compiler optimizations to inline/remove the boxing/unboxing in tight-loop cases.
That makes the closure assignability problem a compiler-time issue rather than a runtime issue, doesn't cause an explosion of types, and lets you pass closures that return primitives to things that expect, say, Comparables relatively trivially.
Giving up some tight-loop runtime speed in the name of simplicity of implementation seems like the right thing to me.
Just having some generic interfaces, I suggested Callable0 to Callable5 (0 to 5 arguments) in the original weblog, and some short inner class syntax is definitely an option. Some of the proposals for the implementation of lambdas are complex (probibly not as complex as reifying, but complex anyway) and yet don't achieve the performance of a reified implementation.
-- Howard.
> I dunno man. I vote we just have a single interface per > arity (0 through 255, which is the java limit, IIRC) that > takes all object args and returns an object value. Then > we box/unbox in the primitive case and work out some > compiler optimizations to inline/remove the > boxing/unboxing in tight-loop cases. > > That makes the closure assignability problem a > compiler-time issue rather than a runtime issue, doesn't > cause an explosion of types, and lets you pass closures > that return primitives to things that expect, say, > Comparables relatively trivially. > > Giving up some tight-loop runtime speed in the name of > simplicity of implementation seems like the right thing to > me. > > Worse is better, > Carson
> Giving up some tight-loop runtime speed in the name of > simplicity of implementation seems like the right thing to > me. That wouldn't be consistent with the stated reason for revisiting closures --- to aid use of parallel APIs. If performance wasn't critical you wouldn't bother with parallel code at all.
Yes, no point in parallizing your code if erasure and hence boxing looses all your performance. I wrote the original post to address this problem since my experiance with the ParallelArray API was that boxing made it very slow (significantly slower than serial code!).
In the Lambda-Dev group Neal Gafter said deified lambdas were hard, so I thought I would see if they were practical and hence the two weblogs. Anyway I think they are practical, but as Neal said hard.
It will be interesting to see if people think the difficult worth the difficulty.
> > Giving up some tight-loop runtime speed in the name of > > simplicity of implementation seems like the right thing > to > > me. > That wouldn't be consistent with the stated reason for > revisiting closures --- to aid use of parallel APIs. If > performance wasn't critical you wouldn't bother with > parallel code at all.