Summary
Recently, at Devoxx, it was announced that Java would get lambda functions (a.k.a. anonymous functions or closures). There are many ways of implementing these on the JVM. This post proposes that reifying lambdas is a good choice.
There are many ways to implement Lambdas on the JVM, one possibility is to erase their actual types and use signatures in class files (much like generics at present). Neal Gafter has given an example (http://mail.openjdk.java.net/pipermail/lambda-dev/2009-December/000192.html) that is difficult to do without erasing the types:
<T> #T() constant(T t) { return #() (t); }
The syntax used is from the referenced strawman:
#T()
Means a lambda that returns a T when called and has no arguments.
#()(t)
Means create a new lambda that returns t, the type T is inferred and the
expression could be written #T()(t).
The above 'constant' example will be used as a running example below.
Like generics the erasure option has problems, e.g.:
1. Can't, with some erasures proposals, have two methods of the same name only
distinguished by lambda
type, e.g. filter(#boolean(int)) and filter(#boolean(float)) illegal
if both in same class
2. Can't have arrays of lambdas that are type safe, e.g.
new#int()[n];//Illegal
3. Can't have instanceof tests or class references, e.g.
#int().class;//Illegal
This post proposes an alternative technique that doesn't erase type information except when used in conjunction with generics.
You could overcome the limitations of erasure given above in many cases if you had reified lambdas (i.e. lambdas that didn't erase the type). For example supose that:
#String() constant(String t) { return #() (t); }
Was translated to:
Callable$String constant(String t) {
return new Callable$String() {
@Override public String call() { return t; }
};
}
Where:
public interface Callable$0<R> { R call(); }
public interface Callable$String extends Callable$0<String> {}
The types Callable$... are created as needed by the class loader (they are not emitted by the compiler, but for typing purposes they are known internally to the compiler).
The case of primitives is more complicated because you want to avoid boxing, e.g:
#int() constant(int t) { return #() (t); }
Becomes:
Callable$int constant(int t) {
return new Callable$int() {
@Override public int call_int() { return t; }
};
}
Where:
public interface Callable$Integer extends Callable$0<Integer> {}
public abstract class Callable$int implements Callable$Integer {
@Override public final Integer call() { return call_int(); }
public abstract int call_int();
}
And in the generic's case (the original example) you would erase the type (because it is already erased):
<T> #T() constant(T t) { return #() (t); }
Is:
<T> Callable$0<T> constant(T t) {
return new Callable$0<T>() {
@Override public T call() { return t; }
};
}
You would use the most specific type when a function type is given, e.g.:
#Object(String) oS;
#String(Object) sO = #String(Object o) (o.toString());
oS = sO;
#Object(String)[] oSs = new #Object(String)[] {oS, sO};
#O(S) gOS = oS; // S = String, O = Object
boolean test = gOS instanceof #Object(String)
List<? extends #Object(String)> lOS;
List<? extends #String(Object)> lSO = new ArrayList<#String(Object)>();
lOS = lSO;
List<#int()> lI = new ArrayList<#int()>();
for (int ii = 0; ii < 3; ii++) { lI.add(constant(ii)); }
#int() i = lI.get(0);
Are translate to:
Callable$Object$String oS;
Callable$String$Object sO = new Callable$String$Object() {
@Override public String call(final Object o) { return o.toString(); }
};
oS = From$1$String$Object$$To$Object$String.cast(sO);
Callable$Object$String[] oSs =
new Callable$Object$String[] {oS, From$1$String$Object$$To$Object$String.cast(sO)};
Callable$1<Object,String> gOS = oS;
boolean test = gOS instanceof Callable$Object$String;
List<? extends Callable$1<? extends Object, ? super String>> lOS;
List<? extends Callable$1<? extends String, ? super Object>> lSO =
new ArrayList<Callable$1<? extends String, ? super Object>>();
lOS = lSO;
List<Callable$0<? extends Integer>> lI = new ArrayList<Callable$0<? extends Integer>>();
for (int ii = 0; ii < 3; ii++) { lI.add(constant(ii)); }
Callable$int i = From$0$Integer$$To$int.cast(lI.get(0));
Where:
public interface Callable$Object$String extends Callable$1<Object, String> {}
public interface Wrapper { Object getWrapee(); }
public static final class Wrappers {
private Wrappers() {}
public static Object unWrap(final Wrapper from) {
Object wrapee = from.getWrapee();
while (wrapee instanceof Wrapper) { wrapee = ((Wrapper)wrapee).getWrapee(); }
return wrapee;
}
public static boolean equals(final Wrapper self, final Object other) {
if (self == other) { return true; }
if (other == null) { return false; }
final Object o = (other instanceof Wrapper) ? unWrap((Wrapper)other) : other;
final Object s = unWrap(self);
return o == s;
}
}
public interface Wrapper$Object$String extends Callable$Object$String, Wrapper {}
public interface Callable$String$Object extends Callable$1<String, Object> {}
public final static class From$1$String$Object$$To$Object$String {
private From$1$String$Object$$To$Object$String() {}
public static Callable$Object$String cast(final Callable$1<? extends String, ? super Object> from) {
return new Wrapper$Object$String() {
@Override public Object call(final String s) { return from.call(s); }
@Override public Object getWrapee() { return from; }
@Override public boolean equals(final Object other) { return Wrappers.equals(this, other); }
@Override public int hashCode() { return Wrappers.unWrap(this).hashCode(); }
};
}
}
public static abstract class Wrapper$int extends Callable$int implements Wrapper {}
public final static class From$0$Integer$$To$int {
private From$0$Integer$$To$int() {}
public static Callable$int cast(final Callable$0<? extends Integer> from) {
if (from instanceof Callable$int) { return (Callable$int)from; }
if (from instanceof Wrapper) {
final Object wrapee = Wrappers.unWrap((Wrapper)from);
if (wrapee instanceof Callable$int) { return (Callable$int)wrapee; }
}
return new Wrapper$int() {
@Override public int call_int() { return from.call(); }
@Override public Object getWrapee() { return from; }
@Override public boolean equals(final Object other) { return Wrappers.equals(this, other); }
@Override public int hashCode() { return Wrappers.unWrap(this).hashCode(); }
};
}
}
The cast methods create a new object; which is wasteful, but on the other hand they aren't needed often (see next section). The erasure schemes don't require 'cast' objects and therefore avoid this overhead, but on the other hand the erasure versions have similar limitations to erased generics. For the generic cases the use of wild-cards avoid the need for the cast methods (since the generics have already erased the types). The From$...$$To$... classes, like the Callable$... classes, are generated by the class loader on demand and not written by the compiler (but for typing purposes are known to the compiler).
The cast methods above are used 'mainly' for contravariant overriding. Most OO languages do not support contravariant overriding (except for generics), but for a moment suppose that Java did allow ontravariant overriding:
abstract class Base { abstract Object m(String s); }
class Derived extends Base {
@Override String m(Object o) { return o.toString(); } // Illegal arg should be String
}
The above is type safe because the arguments to m vary contravariantly and the return types vary covariantly. The argument of m varies contravariantly in the derived class, DerivedextendsBase, but m's argument is Object and Object is the super type of String (which is m's argument type in Base). As m's enclosing classes go down the hierarchy, Base to Derived, its arguments go in the contra direction (up), from String to Object.
The above example shows that this can be useful, and that is the main purpose of providing
'cast' methods. However the above ability is not really missed in Java and is not provided by most of the common OO languages; therefore I infer that contravariance is not that common a requirement and hence the method suggested above is the best solution. In particular, since contravariance is not common, it makes sense to make that the expensive operation and make more common operations fast.
Reifying lambdas makes them easier to use them since they behave like normal Java objects, but on the other hand the implementation is harder. The implementation is harder because you essentially have an erased and a reified version.
How valuable is reifying lambdas?
(Updated 2 Jan 10 in response to comments from Neal Gafter and Zdenek Tronicek and updated 3 Jan 10 in response to comments from Zdenek Tronicek - Thanks.)
If primitive types were to be erased in lambdas does not one of the main justifications for including them in Java, the parallel array api, then disappear? The proliferation of interfaces in that API is entirely a result of wanting to avoid erasure of primitives (and consequent performance loss).
There are a number of fatal problems with this approach. The worst is that you can't use any explicit code to implement subtype conversions because they don't work through generics. For example, your scheme does not provide any way to convert a value of type
List<? extends #String(Object)>
to a value of type
List<? extends #Object(String)>
This conversion is just as important as the conversion on the underlying function types.
Also, there need to be subtype relations in both directions between your Callable$0<String> and Callable$String but that is impossible because they are distinct interfaces. If you try to provide the subtype relations in just one direction, things quicky get out of hand (i.e. an exponential explosion in direct superinterfaces of the generated interfaces) when there are multiple lambda arguments.
> Also, there need to be subtype relations in both > directions between your Callable$0<String> and > Callable$String but that is impossible because they are > distinct interfaces. If you try to provide the subtype > relations in just one direction, things quicky get out of > hand (i.e. an exponential explosion in direct > superinterfaces of the generated interfaces) when there > are multiple lambda arguments.
The above modification to implement Callable$... solves this problem in one direction and the other is via 'cast' methods. The idea was that the 'cast' methods are loader generated, though re-reading my post I didn't make this clear (sorry).
I am making the assumption that these 'casts' are rare and therefore in practice you won't get an explosion of 'cast' methods.
> Howard: Your "answer" is nonresponsive to the issues I've > raised. > > This conversation is on the lambda-dev mailing list, and > I'm tired of copying my responses onto two separate > forums. You can read my full response here: > http://mail.openjdk.java.net/pipermail/lambda-dev/2009-Dece > mber/000212.html
Neal,
I have updated the blog to provide more information - thanks for the input.