Summary
A minor annoyance with Java generics is that you often end up repeating type parameters in a variable declaration and in the constructor invocation that initializes that variable. Defining a trivial static method allows you to avoid this.
Cédric Beust, in an interesting review of new Java language features, notes something that most people who have used Generics will have seen. You very often end up declaring variables something like this:
Map<String, List<Account>> accounts =
    new HashMap<String, List<Account>>();
The repetition is annoying and error-prone. Cédric writes:
Not only is the code significantly harder to read, but it fails to obey the DRY principle ("Don't repeat yourself"). What if I need to change the value type of this map from List<Account> to Collection<Account>? I need to replace all these statements everywhere in my code. While IDE refactoring will help, it's still an awful lot of code for a modification of this kind that hardly impacts the semantics of this code.
Fortunately, there is a simple way to define Util.newMap() such that you can write:
Map<String, List<Account>> accounts = Util.newMap();
Here is the definition:
public static <K,V> Map<K,V> newMap() {
    return new HashMap<K,V>();
}
An additional advantage is that you don't have to write the ugly concrete type HashMap in your code.
The compiler can figure out what kind of Map you want if you assign to a Map variable, though not if you pass Util.newMap() as a parameter to a method. Then you might need a temporary variable.
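To make the two cases concrete, here is a minimal sketch. The Account placeholder and the countKeys method are made up for illustration; only Util.newMap() comes from the text. It shows inference from the assignment, and an explicit type witness as an alternative to the temporary variable when the call is a method argument. (Newer compilers, Java 8 and later as far as I know, can usually infer the argument case as well.)

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; the name Util follows the article.
final class Util {
    private Util() {}

    // K and V are inferred from the type the result is assigned to.
    public static <K, V> Map<K, V> newMap() {
        return new HashMap<K, V>();
    }

    // Placeholder for the Account type used in the article's example.
    static final class Account {}

    // Hypothetical method that takes the map as a parameter.
    static int countKeys(Map<String, List<Account>> accounts) {
        return accounts.size();
    }

    public static void main(String[] args) {
        // The type parameters appear once, on the left-hand side only.
        Map<String, List<Account>> accounts = Util.newMap();
        accounts.put("alice", new ArrayList<Account>());

        // As a method argument: an explicit type witness spells out K and V,
        // so no temporary variable is needed.
        int empty = countKeys(Util.<String, List<Account>>newMap());

        System.out.println(accounts.size() + " " + empty); // prints "1 0"
    }
}
```

The type witness brings the repetition back, of course, so in the argument case the temporary variable is often the more readable choice.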
Obviously you can do the same thing for other generic types like List and Set.
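A sketch of what those might look like. The class name CollectionUtil and the choice of ArrayList and HashSet as the backing implementations are assumptions for the example, not anything the JDK provides.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class, analogous to Util.newMap() above.
final class CollectionUtil {
    private CollectionUtil() {}

    // E is inferred from the declared type of the variable being assigned.
    public static <E> List<E> newList() {
        return new ArrayList<E>();
    }

    public static <E> Set<E> newSet() {
        return new HashSet<E>();
    }

    public static void main(String[] args) {
        List<String> names = CollectionUtil.newList();
        Set<Integer> ids = CollectionUtil.newSet();

        names.add("alice");
        ids.add(1);
        ids.add(1); // duplicate, ignored by the Set

        System.out.println(names.size() + " " + ids.size()); // prints "1 1"
    }
}
```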
Basically, when defining a very complex type got too laborious, one might write a template function that "inferred" the correct return type from the type of the argument. Placing a call to this function in place of a type declaration would save you from having to manually build very complex parameterized types.
It's sort of a crutch though. Dynamic languages make all of this madness go away.
Keith Lea wrote:
> I think a better RFE would be one which allowed type inference for constructors, so we wouldn't need these messy methods.
I agree, but I think you're going to see new static methods in java.util.Collections sooner than you're going to see more language changes!
Unfortunately, the obvious syntax to ask for type inference, just leaving out the type parameters (as you can for method invocations), doesn't work because it already means constructing an instance of the raw type. I.e.:
List<String> stringList = new ArrayList();
is already allowed, but it means to construct a raw ArrayList, and will draw a warning from the compiler.
The next most obvious syntax would be to put in the angle brackets for the type parameters, but leave the parameters themselves out, like this:
List<String> stringList = new ArrayList<>();
This would satisfy DRY, and solve lots more cases than just collections. But I'm not sure when you might hope to see it in the language.
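For what it's worth, this empty-angle-bracket form is the one Java eventually adopted, as the "diamond" in Java 7. A minimal sketch, assuming a Java 7 or later compiler (the class and method names are made up for the example):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; requires Java 7 or later for the <> syntax.
final class DiamondDemo {
    private DiamondDemo() {}

    static Map<String, List<Integer>> build() {
        // The compiler fills in <String, List<Integer>> from the declared type.
        Map<String, List<Integer>> scores = new HashMap<>();

        // Likewise <Integer> here. Unlike "new ArrayList()", this is not
        // a raw type and draws no unchecked warning.
        List<Integer> values = new ArrayList<>();
        values.add(42);

        scores.put("a", values);
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(build()); // prints "{a=[42]}"
    }
}
```

Unlike the static factory methods, this works for any generic class, not just the collections someone has written a helper for.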
In any case, there is another advantage to the proposed RFE, which is that you no longer need to know the concrete collection types. If you want a Map, it's Collections.newMap(); if you want a List, it's Collections.newList(); etc. None of this ArrayMap or HashList nonsense to carry around with you.
Well, the problem with that is that you can easily create an instance to assign to the first declaration:
List<Person> team = new ArrayList<Person>();
but if you want to initialize the People variable then you need to either write a class that implements People or do some hairy tricks with dynamic proxies. This is not to say that it's not a good idea anyway.
In practice I've found that I very often create a non-public class (not an interface), something like:
class People extends ArrayList<Person> {}
As soon as data structures start getting at all complicated you are probably going to want to start adding methods to them or refining existing methods -- just parameterizing an existing Collections type doesn't get you very far, and perhaps isn't very Objectly Correct.
I'd guess that if you asked the Java language keepers about typedefs they would make this argument, too. An alias for an existing parameterized type is rarely sufficient for long.
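To illustrate the point, here is a rough sketch of how such a class tends to grow. The Person fields and the adults() method are invented for the example, and the types are nested in a demo class only to keep the file self-contained.

```java
import java.util.ArrayList;

final class PeopleClassDemo {
    private PeopleClassDemo() {}

    // Hypothetical element type.
    static final class Person {
        final String name;
        final int age;

        Person(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Starts life as a bare alias for ArrayList<Person>...
    static class People extends ArrayList<Person> {
        private static final long serialVersionUID = 1L;

        // ...and soon picks up methods of its own (made up for illustration).
        People adults() {
            People result = new People();
            for (Person p : this) {
                if (p.age >= 18) {
                    result.add(p);
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        People team = new People();
        team.add(new Person("Ann", 30));
        team.add(new Person("Bob", 12));
        System.out.println(team.adults().size()); // prints "1"
    }
}
```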
Oh yes, my 5-hours-of-coding, sleep deprived brain would really deal with much less madness in a dynamically typed language where I'm writing a function that takes in a parameter. What's the parameter? Oh - get this, it's a Map-of-a-string-to-list-of-account-objects. Yeeeeeaaaahhhh. Even better - I need to pass some parameter in, but the api doesn't quite document it. Just how long do you think it's going to take to figure out that it's Map<String, List<Account>>?
To everyone reading this, I'm writing this at 4am so I'm probably just tired and bitter after a long week of work. But I'm so *sick* of the repeated insinuation that dynamically typed languages make type issues go away. It's a stupid argument, but one that people on message boards repeat over and over and over.
> Oh yes, my 5-hours-of-coding, sleep deprived brain would really deal with much less madness in a dynamically typed language where I'm writing a function that takes in a parameter. What's the parameter? Oh - get this, it's a Map-of-a-string-to-list-of-account-objects. Yeeeeeaaaahhhh. Even better - I need to pass some parameter in, but the api doesn't quite document it. Just how long do you think it's going to take to figure out that it's Map<String, List<Account>>?
Dynamic languages are no panacea, but one of the subtle ways that they help is that people are less inclined to pass around a "map of a string to a list of accounts" than they are to make a class with a simple protocol that happens to hold one and pass that around instead... for the reason you name, it's just clearer. It's kinda neat how what you have to do to survive in those languages often ends up making the code nicer too. But, like anything else, people can screw it up.
> but if you want to initialize the People variable then you need to either write a class that implements People..
Yes, indeed. You would need, say, a PeopleImpl that implemented People and in many cases extended a concrete collection type. That's the price you pay for that extra layer of abstraction. As usual, there are tradeoffs with this. For code that doesn't need to know the specifics of how the collection is realized but just needs to traverse and use that collection, it produces very clean code. The negative side is that it obscures the 'listiness' of the abstraction, which even in an abstract model is a useful detail.
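A minimal sketch of that arrangement, assuming People is declared as an interface extending List<Person>. That declaration, the Person class and the names() method are my assumptions for the example, and the types are nested in a demo class only to keep the file self-contained.

```java
import java.util.ArrayList;
import java.util.List;

final class PeopleInterfaceDemo {
    private PeopleInterfaceDemo() {}

    // Hypothetical element type.
    static final class Person {
        final String name;

        Person(String name) {
            this.name = name;
        }
    }

    // Assumed shape of the abstraction: a named interface over List<Person>.
    interface People extends List<Person> {}

    // The extra class you pay for; ArrayList supplies the actual storage.
    static final class PeopleImpl extends ArrayList<Person> implements People {
        private static final long serialVersionUID = 1L;
    }

    // Client code sees only People, never the concrete collection type.
    static String names(People people) {
        StringBuilder sb = new StringBuilder();
        for (Person p : people) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(p.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        People team = new PeopleImpl();
        team.add(new Person("Ann"));
        team.add(new Person("Bob"));
        System.out.println(names(team)); // prints "Ann, Bob"
    }
}
```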
> In any case, there is another advantage to the proposed RFE, which is that you no longer need to know the concrete collection types. If you want a Map, it's Collections.newMap(); if you want a List, it's Collections.newList(); etc. None of this ArrayMap or HashList nonsense to carry around with you.
Is it really nonsense? There are currently three full-blown concrete implementations of List: ArrayList, Vector and LinkedList.
Mostly everybody uses ArrayList over LinkedList (and I must admit to only ever thinking of using LinkedList once or twice). Why? My guesses are that: a) most people used to use Vector and ArrayList is essentially the new Vector (and they've chosen to switch due to the perceived increase in speed of ArrayList over Vector), or b) people are (probably subconsciously) 'premature optimizing' and assuming that ArrayList is faster. (Which may or may not be based on their knowledge of frequency of inserts/deletes/sorts/adds versus random element access.) Why should ArrayList be the default implementation for Collections.newList()?
I think the static factory method technique you've described is useful (and could be expanded as required to solve a variety of requirements) but I question whether the addition of such methods into Collections is appropriate.
Brendan Boesen writes:
> Mostly everybody uses ArrayList over LinkedList (and I must admit to only ever thinking of using LinkedList once or twice). Why? My guesses are that: a) most people used to use Vector and ArrayList is essentially the new Vector (and they've chosen to switch due to the perceived increase in speed of ArrayList over Vector), or b) people are (probably subconsciously) 'premature optimizing' and assuming that ArrayList is faster. (Which may or may not be based on their knowledge of frequency of inserts/deletes/sorts/adds versus random element access.) Why should ArrayList be the default implementation for Collections.newList()?
I'd say that 95% of the time it doesn't make the slightest difference what List implementation you use, either because the number of elements in the list is very small, or because all you ever do is add elements to the end and iterate over the list. The 5% of the time it does make a difference, by all means write ArrayList or LinkedList explicitly. This is much better than always having to choose one of these concrete types. Not only do beginning users not have to learn a bunch of concrete types initially, but the implication is that if you do write "new ArrayList<String>()" rather than "Collections.newList()", it is because you really do want the properties of ArrayList. It would make good sense for Collections.newList not to specify what kind of List it returns, so that if it matters to your code you say so explicitly. An implementation could even choose to have Collections.newList return some sort of adaptive List that switches from an ArrayListoid to a LinkedListoid according as the usage is more insert/remove or get/set, provided it serialized as some standard List.