If you have already looked at some larger Scala 2.8 programs you might have been puzzled to see that source files now sometimes start with multiple chained package clauses, like this:
package org.myproject
package tests
...
What does this mean? In fact it means exactly the same as writing two nested package clauses, like this:
package org.myproject { package tests { ... } }
Nested package clauses, while perfectly legal, are not used that often in Scala because their syntax is more heavyweight than single or chained package clauses. Their advantage is that they make the nesting structure of packages explicit. But before Scala 2.8, most programmers would have opted for a single package clause like this:
package org.myproject.tests ...
Previously the single package clause:
package org.myproject.tests ...
was equivalent to the two nested clauses:
package org.myproject { package tests { ... } }
and both were equivalent to the three nested clauses below.
package org { package myproject { package tests { ... } } }
They all meant the same. In Scala 2.8 this has changed. The difference is that the package clause:
package org.myproject.tests
now brings only the members of org.myproject.tests into scope, but not the members of the two outer packages org.myproject and org. If you want to have the members of both tests and org.myproject in scope, you need to use the two nested package clauses above, or else the chained equivalent:
package org.myproject // myproject members are visible, but not org members
package tests // test members are visible
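The difference in visibility can be sketched in a single source file. This is only an illustration under the 2.8 rules described above: the objects ProjectInfo, NestedCheck, and QualifiedCheck are hypothetical names, and the brace form is used in place of the chained form (which means the same) so that everything fits in one file.

```scala
package org.myproject {
  // Hypothetical member of org.myproject itself.
  object ProjectInfo {
    val name = "myproject"
  }
}

// Nested clauses (equivalent to the chained form):
// members of org.myproject and of org.myproject.tests are both in scope.
package org.myproject {
  package tests {
    object NestedCheck {
      // The simple name ProjectInfo is found in the enclosing package.
      val projectName: String = ProjectInfo.name
    }
  }
}

// Single qualified clause: only members of org.myproject.tests are in scope.
package org.myproject.tests {
  object QualifiedCheck {
    // A bare ProjectInfo would not resolve here, so the full path is spelled out.
    val projectName: String = org.myproject.ProjectInfo.name
  }
}
```

Both objects end up in the same package, org.myproject.tests; only the set of names visible inside each clause differs.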
The meaning of packages in Scala and Java differs in some respects. Java has four different name resolution mechanisms: one each for methods, fields, classes, and packages. For packages, resolution always starts at the root. So package names are always absolute, never relative to the current scope. Scala is simpler and more regular. There's only one way to resolve a simple name, no matter what kind of name the compiler is looking at.
When it encounters a simple name, x, the compiler will look from the current scope outwards until it finds a declaration of x, and that is the declaration that's chosen. This is the standard idea of block-structured scoping. A consequence is that declarations in a nested scope can shadow declarations in some outer scope.
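As a minimal sketch of this rule for ordinary values (the object ScopingDemo and its members are hypothetical names chosen for illustration), the same simple name x resolves to different declarations depending on where it is used:

```scala
object ScopingDemo {
  val x = "outer x"

  // No nearer declaration of x exists, so the search reaches the object member.
  def fromOuterScope: String = x

  def fromNestedScope: String = {
    val x = "inner x" // shadows the outer x within this block
    x
  }
}
```

Packages follow the same outward search, which is what the rest of this discussion builds on.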
Like all other entities, packages can be nested in Scala. This often leads to more concise code. For instance, assume you have another package org.myproject.web which you want to access inside the package org.myproject.tests. You can do so easily by just writing
import web._
There's no need to import the whole path:
import org.myproject.web._
As package names grow longer, this style can reduce quite a bit of clutter. So this nesting convention is useful and it is also quite elegant because it needs no specialized rules for packages of the kind Java has.
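A small sketch of such a relative import follows. The objects Routes and RoutesCheck are hypothetical, and both packages are written with nested clauses in one file purely to keep the example self-contained.

```scala
package org.myproject {
  package web {
    // Hypothetical member of org.myproject.web.
    object Routes {
      val home = "/index"
    }
  }

  package tests {
    // Relative import: web is found as a member of the enclosing org.myproject.
    import web._

    object RoutesCheck {
      val homeRoute: String = Routes.home
    }
  }
}
```

The import works because the enclosing org.myproject scope is open at that point, so the simple name web is resolved like any other member of that package.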
Unfortunately, it turns out that the nesting convention had an unwelcome side effect in practice. Sometimes people would encounter a nested package with the same name as a root package. For instance, they might have a package org.java somewhere on the classpath. Now, Scala's original nesting convention up to version 2.7 prescribed that inside package org.myproject the name java should resolve to org.java instead of the root package java. This caused confusion and consternation in users who might not even have known that the org domain contained a java package which hides the root java they wanted to access. Worse, the availability of the nested java would depend on how the classpath was defined, what jars were on it, etc. Who knows at all times what's on their classpath?
Of course, there was a way around it. Scala has a name for the "root" package that contains all other top-level packages. It's named _root_. So people could refer to the standard java package as _root_.java. However, to be defensive, you'd have to qualify all references to top level packages with _root_, just to make sure that they would not be accidentally hidden by a like-named nested package.
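The shadowing and the _root_ workaround can be reproduced in one file. In this sketch the nested package org.java, its object Marker, and the object RootLookup are all made up for illustration; nested clauses are used so that the members of org are in scope, as they were for any qualified clause up to 2.7.

```scala
package org {
  // A nested package that happens to share its name with the root package java.
  package java {
    object Marker {
      val description = "nested org.java"
    }
  }

  package myproject {
    object RootLookup {
      // Inside org, the simple name java resolves to org.java first.
      val shadowing: String = java.Marker.description

      // _root_ forces resolution to start at the top-level package.
      val standardClassName: String = classOf[_root_.java.lang.String].getName
    }
  }
}
```

Without the _root_ prefix, a reference such as java.lang.String inside this scope would be looked up in org.java, which has no member named lang.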
In summary we were faced with the situation that the shortest and most straightforward way to organize package clauses was fragile because nested packages that ended up accidentally on the classpath could shadow top-level packages. The more defensive approach that avoided the problem was comparatively verbose and ugly.
So the solution to the problem was to change the meaning of a qualified package clause like:
package org.myproject.tests
It now brings only members of tests into scope, not members of the two outer packages containing it. This changes nothing in the way Scala packages nest. Package tests is still a member of package org.myproject. The only thing that changes is what scope is opened by a qualified package clause like the one above.
The second part of the change is to allow the new chained package clause syntax as a more compact alternative to nested packages. So if you want to see members of both package org.myproject and org.myproject.tests in your code, you can use two package clauses:
package org.myproject
package tests
Generally, if your project consists of multiple subpackages, it's a good idea to use a first package clause that names the project as a whole and is followed by a second package clause naming the current subpackage. That way, you can refer to other subpackages of your project without the often long prefix that indicates the project.
Because of the change, previous Scala code that is spread out over multiple subpackages of a common base package will most likely no longer compile correctly. References to different subpackages in the same project will not be picked up correctly (unless you always used absolute paths, in which case you should not have a problem).
The fix is fortunately straightforward. Simply replace each leading package clause by two or more package clauses that name the current project and then the current subpackage(s). This can be done by a simple multi-file regular expression search and replace operation.
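One possible shape for such a search and replace, written here as a small Scala helper rather than an editor command, is sketched below. The object PackageClauseMigration and its method are hypothetical; the sketch assumes brace-less package clauses that sit alone on their line, and it leaves brace-style clauses or clauses followed by comments untouched.

```scala
import java.util.regex.Pattern
import scala.util.matching.Regex

object PackageClauseMigration {
  // Rewrites "package <base>.<rest>" lines into a chained pair of clauses:
  //   package <base>
  //   package <rest>
  def splitQualifiedClauses(source: String, basePackage: String): String = {
    // (?m) lets ^ and $ match at line boundaries; the base package is quoted
    // so that its dots are matched literally.
    val clause: Regex =
      ("(?m)^package " + Pattern.quote(basePackage) + "\\.([\\w.]+)[ \\t]*$").r

    clause.replaceAllIn(
      source,
      m => Regex.quoteReplacement("package " + basePackage + "\npackage " + m.group(1))
    )
  }
}
```

For example, calling splitQualifiedClauses on a file that starts with "package org.myproject.tests" and passing "org.myproject" as the base yields a file that starts with the two clauses "package org.myproject" and "package tests". Applying it to every source file of a project is left to whatever multi-file tooling is at hand.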
In passing, it turns out that the old Scala 2.7 rules are the same as the rules in C# and other .NET languages. So why did something that obviously works on .NET cause such problems on the JVM? It's a matter of expectations and conventions. In .NET, which has nested namespaces similar to Scala's packages, nobody in their right mind would have defined a namespace org.System because it would shadow the well-known top level System namespace. On the JVM, people do this sort of thing, and it works, because of Java's absolute package name convention. So this experience shows that sometimes a design cannot be judged to be right or wrong only along technical criteria, but that it matters how it fits with the pre-existing conventions and expectations of its users. Scala 2.7's nested packages are a simple design that works well on .NET. It did not work so well on the JVM because conventions there were different. In 2.8 we fixed the problem by giving a new twist to the interpretation of package clauses.
In fact, strictly speaking, the package clause change would have forced us to bump up the major version number of the language. Scala's unofficial versioning scheme is as follows:
-deprecation warnings or, where this is not possible, should be accompanied by -Xmigration warnings.
This versioning scheme is simply what we have tried to follow so far; it does not have any official status, nor should it be assumed that it will always be like this.
The change in the meaning of package clauses is not backwards compatible and therefore would have demanded a major version jump to 3.0. However, the change came late in the run-up to 2.8, and was prompted by urgent requests from our users. At the time we had to choose among three possibilities:
The first option was to go directly from 2.7 to 3.0. However, there were already books in print that talked about Scala 2.8. Readers of those books would have been unnecessarily confused. Besides, it's not nice to end up with a phantom version of a language that people talked a lot about but that never saw the light of day.
The second option was to wait with the change to package clauses until after 2.8. This was unattractive precisely because the change can break existing code. If you need to do the change, the sooner you do it the better. The longer you wait, the more code there will be that might break.
The third option was to make an exception to our numbering scheme, and that's the one we picked in the end.
The change in the meaning of qualified package clauses was repeatedly proposed by Jorge Ortiz. I thank him for his persistence. The new chained package clause syntax was originally proposed by David MacIver.
Martin Odersky is coauthor of Programming in Scala: http://www.artima.com/shop/programming_in_scala
The Scala programming language website is at:
http://www.scala-lang.org
The Scala 2.8 release notes are at:
http://www.scala-lang.org/node/7009
The Scaladoc collections API is at:
http://lampwww.epfl.ch/~odersky/whatsnew/collections-api/collections.html
Martin Odersky is the creator of the Scala language. As a professor at EPFL in Lausanne, Switzerland, he works on programming languages, more specifically languages for object-oriented and functional programming. His research thesis is that the two paradigms are two sides of the same coin, to be unified as much as possible. To prove this, he has experimented with a number of language designs, from Pizza to GJ to Functional Nets. He has also influenced the development of Java as a co-designer of Java generics and as the original author of the current javac reference compiler. Since 2001 he has concentrated on designing, implementing, and refining the Scala programming language.