Summary:
In this installment of a series of articles on the latest Scala release, Scala 2.8, Martin Odersky and Lex Spoon explain how the collections library was redesigned in 2.8, and how to extend the library with new collection types.
The ability to add new comments in this discussion is temporarily disabled.
Most recent reply: December 28, 2010 7:48 PM by
Cameron
|
In this installment of a series of articles on the latest Scala release, Scala 2.8, Martin Odersky and Lex Spoon explain how the collections library was redesigned in 2.8, and how to extend the library with new collection types. http://www.artima.com/scalazine/articles/scala_collections_architecture.html
|
|
|
I use collections a ton in Java. And I've done years of programming for DNA sequencing. (BTW, you probably want 1 bit per possible base to cover heterozygotes, at which point it's probably simpler to just waste a factor of 2 in space and use 1 byte per base instead of the packing.)
I've tried several times to understand the article. And I'll try again. Between Scala's syntax, which I admit I don't know much about but doesn't resemble any language I know and seems deliberately terse and "weird", (e.g., what is a -Elem and a +To?) and the complexities and layering of the Scala Collection Architecture, I'm lost.
If I'm lost, I suspect many others are too.
|
|
|
If you don't know Scala, this article is not for you. You are much better off reading the previous article I published which describes collections from a user's point of view: http://www.artima.com/scalazine/articles/scala_collections.htmlEven if you do know Scala, you should read this article first. The present article is for experts who want to understand how collections hang together and how one can integrate new collections in the framework so that all operations are automatically specialized to the proper types. That's a non-issue for traditional CRUD collections, but very hard to achieve for the kind of functional / parallel collections Scala has. In fact I know of no other language or framework that even tries this. -Elem and +T mean contravariant and covariant btw. In C# 4.0 that would be "out Elem" and "in T". It's roughly comparable to Java's wildcards "? super Elem" and "? extends T" but is declaration site rather than use site. Variance is hard, that's why none of the three notations is particularly easy (-Elem, +T is the most standard, if you go by the literature).
|
|
|
Thanks. I'll look at the earlier article first.
I was guessing that + and - were roughly "extends" / "super", and, in practice, I'd have a 50:50 chance of picking the right one. :-)
|
|
|
Really nice article!
Just wondering what the memory footprint of these classes is like?
The Java Collections library is pretty bad wrt HashMap, HashSet and the like.
michael
|
|
|
> > Just wondering what the memory footprint of these classes > is like? >
The standard framework mixin traits have no per-object memory overhead, all they add is extra methods. The net memory footprint hence depends on the chosen collection implementation. Here, for small collections, immutable generally wins. For instance, an immutable Set of 4 elements is represented as just a single object with 4 fields, so typically 20 bytes. By contrast, a mutable HashSet would consume typically ~ 100 bytes.
|
|
|
> Just wondering what the memory footprint of these classes is like? > The Java Collections library is pretty bad wrt HashMap, HashSet and the like. HashSet uses a HashMap, so let's focus just on HashMap. For a general-purpose key/value dictionary that uses key-based hashing, is it bad? Or is it bad compared to other algorithms? Peace, Cameron Purdy | Oracle Coherence http://coherence.oracle.com
|
|