What's New in Scala 2.8: Chained Package Clauses

by Martin Odersky

September 7, 2010

Summary

In this second installment of a series of articles on the latest Scala release, Scala 2.8, Martin Odersky explains how and why packages and imports have changed in 2.8.

If you have already looked at some larger Scala 2.8 programs you might have been puzzled to see that source files now sometimes start with multiple chained package clauses, like this:

  package org.myproject
  package tests
  ...

What does this mean? In fact it means exactly the same as writing two nested package clauses, like this:

  package org.myproject {
    package tests {
      ...
    }
  }

Nested package clauses, while perfectly legal, are not used that often in Scala because their syntax is more heavyweight than single or chained package clauses. Their advantage is that they make the nesting structure of packages explicit. But before Scala 2.8, most programmers would have opted for a single package clause like this:

  package org.myproject.tests
  ...

What Has Changed

Previously the single package clause:

  package org.myproject.tests
  ...

was equivalent to the two nested clauses:

  package org.myproject {
    package tests {
      ...
    }
  }

and both were equivalent to the three nested clauses below.

  package org {
    package myproject {
      package tests {
        ...
      }
    }
  }

They all meant the same. In Scala 2.8 this has changed. The difference is that the package clause:

  package org.myproject.tests

now only brings the members of org.myproject.tests into scope, but not the members of the two outer packages org.myproject and org. If you want to have the members of both tests and org.myproject in scope, you need to use the two nested package clauses above, or else the chained equivalent:

  package org.myproject // myproject members are visible, but not org members
  package tests         // tests members are visible
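
The difference can be sketched in a single source file. This is only an illustration: the names Defaults, Suite, and QualifiedSuite are hypothetical, and the nested brace form is used because it allows several packages in one file. It has the same meaning as the chained clauses.

```scala
// Hypothetical member of the project's base package.
package org.myproject {
  object Defaults { val timeoutMs = 500 }
}

// Nested form, equivalent to the chained clauses
//   package org.myproject
//   package tests
package org.myproject {
  package tests {
    object Suite {
      // Defaults is a member of org.myproject, which is opened here,
      // so the simple name resolves.
      def timeout: Int = Defaults.timeoutMs
    }
  }
}

// Single qualified clause: only members of tests are in scope.
package org.myproject.tests {
  object QualifiedSuite {
    // Writing just Defaults.timeoutMs would not compile here,
    // so the member is reached through its full path.
    def timeout: Int = org.myproject.Defaults.timeoutMs
  }
}
```

Both objects end up in the same package org.myproject.tests. Only the set of names visible inside each clause differs.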

Why the Change?

The meaning of packages in Scala and Java differs in some respects. Java has four different name resolution mechanisms: one each for methods, fields, classes, and packages. For packages, resolution always starts at the root. So package names are always absolute, never relative to the current scope. Scala is simpler and more regular. There's only one way to resolve a simple name, no matter what kind of name the compiler is looking at.

When it encounters a simple name, x, the compiler will look from the current scope outwards until it finds a declaration of x, and that is the declaration that's chosen. This is the standard idea of block-structured scoping. A consequence is that declarations in a nested scope can shadow declarations in some outer scope.
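
The same rule can be seen with ordinary nested objects, which follow the same outward lookup. The following is a small sketch with hypothetical names (Outer, Inner, seenX, seenY):

```scala
object Outer {
  val x = "outer x"
  val y = "outer y"

  object Inner {
    // This declaration shadows Outer.x for all code inside Inner.
    val x = "inner x"

    // Lookup starts in Inner and stops at the first declaration of x.
    def seenX: String = x

    // Inner declares no y, so the search continues outwards to Outer.
    def seenY: String = y
  }
}
```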

Like all other entities, packages can be nested in Scala. This often leads to more concise code. For instance, assume you have another package org.myproject.web which you want to access inside the package org.myproject.tests. You can do so easily by just writing

  import web._

There's no need to import the whole path

  import org.myproject.web._

As package names grow longer, this style can reduce quite a bit of clutter. So this nesting convention is useful and it is also quite elegant because it needs no specialized rules for packages of the kind Java has.
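
As a rough single-file sketch of such a relative import, assume a hypothetical object Routes in org.myproject.web and a hypothetical object RoutesCheck in org.myproject.tests. The nested brace form is used so that both packages fit in one file.

```scala
package org.myproject {
  package web {
    // Hypothetical member of the web subpackage.
    object Routes { val home = "/index" }
  }
}

package org.myproject {
  package tests {
    // web is found as a member of the enclosing package org.myproject,
    // so the org.myproject prefix can be left out.
    import web._

    object RoutesCheck {
      def homeRoute: String = Routes.home
    }
  }
}
```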

Unfortunately, it turns out that the nesting convention had an unwelcome side effect in practice. Sometimes people would encounter a nested package with the same name as a root package. For instance, they might have a package org.java somewhere on the classpath. Now, Scala's original nesting convention up to version 2.7 prescribed that inside package org.myproject the name java should resolve to org.java instead of the root package java. This caused confusion and consternation in users who might not even have known that the org domain contained a java package which hides the root java they wanted to access. Worse, the availability of the nested java would depend on how the classpath was defined, what jars were on it, etc. Who knows at all times what's on their classpath?

Of course, there was a way around it. Scala has a name for the "root" package that contains all other top-level packages. It's named _root_. So people could refer to the standard java package as _root_.java. However, to be defensive, you'd have to qualify all references to top level packages with _root_, just to make sure that they would not be accidentally hidden by a like-named nested package.
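
The shadowing is easy to reproduce whenever package org itself is opened, which is what every package clause under org did up to 2.7. The sketch below does so explicitly with nested clauses. The package org.java and the objects Marker and Lookup are invented for the illustration.

```scala
package org {
  // Hypothetical nested package that happens to be named java.
  package java {
    object Marker { val origin = "org.java" }
  }

  package myproject {
    object Lookup {
      // Searching outwards from here, the first java found is org.java.
      def nested: String = java.Marker.origin

      // _root_ starts the lookup at the top level, so this is the
      // standard java package regardless of what org contains.
      def standard: Int = _root_.java.lang.Integer.parseInt("42")
    }
  }
}
```

Inside Lookup, an unqualified reference such as java.lang.Integer would fail to compile, because org.java has no member named lang.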

In summary we were faced with the situation that the shortest and most straightforward way to organize package clauses was fragile because nested packages that ended up accidentally on the classpath could shadow top-level packages. The more defensive approach that avoided the problem was comparatively verbose and ugly.

Change Summary

So the solution to the problem was to change the meaning of a qualified package clause like:

  package org.myproject.tests

It now brings only members of tests into scope, not members of the two outer packages containing it. This changes nothing in the way Scala packages nest. Package tests is still a member of package org.myproject. The only thing that changes is what scope is opened by a qualified package clause like the one above.

The second part of the change is to allow the new chained package clause syntax as a more compact alternative to nested packages. So if you want to see members of both package org.myproject and org.myproject.tests in your code, you can use two package clauses:

  package org.myproject
  package tests

Generally, if your project consists of multiple subpackages, it's a good idea to use a first package clause that names the project as a whole and is followed by a second package clause naming the current subpackage. That way, you can refer to other subpackages of your project without the often long prefix that indicates the project.

Migrating to the New Scheme

Because of the change, previous Scala code that is spread out over multiple subpackages of a common base package will most likely no longer compile correctly. References to different subpackages in the same project will not be picked up correctly (unless you always used absolute paths, in which case you should not have a problem).

The fix is fortunately straightforward. Simply replace each leading package clause by two or more package clauses that name the current project and then the current subpackage(s). This can be done by a simple multi-file regular expression search and replace operation.
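
One possible shape for such a rewrite, applied to the text of a single source file, is sketched below. It is not a complete migration tool. The object name PackageClauseMigration and the base package org.myproject are assumptions, and reading and writing the files is left out.

```scala
import scala.util.matching.Regex

object PackageClauseMigration {
  // Assumed base package of the project; adjust as needed.
  val basePackage = "org.myproject"

  // Matches a clause like "package org.myproject.tests" on a line of its own
  // and captures the part after the base package ("tests").
  private val qualifiedClause: Regex =
    ("""(?m)^package[ \t]+""" + Regex.quote(basePackage) + """\.([\w.]+)[ \t]*$""").r

  // Splits the first qualified clause into two chained clauses.
  // Sources without such a clause are returned unchanged.
  def migrate(source: String): String =
    qualifiedClause.replaceFirstIn(
      source,
      "package " + Regex.quoteReplacement(basePackage) + "\npackage $1"
    )
}
```

Under these assumptions, a file starting with "package org.myproject.tests" comes out starting with "package org.myproject" followed by "package tests". A deeper clause such as org.myproject.tests.unit becomes "package org.myproject" followed by "package tests.unit", which opens the project package and the unit package but not tests itself.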

The Force of Convention

In passing, it turns out that the old Scala 2.7 rules are the same as the rules in C# and other .Net languages. So why did something that obviously works on .Net cause such problems on the JVM? It's a matter of expectations and conventions. In .Net, which has nested namespaces similar to Scala's packages, nobody in their right mind would have defined a namespace org.System because it would shadow the well-known top level System namespace. On the JVM, people do this sort of thing, and it works, because of Java's absolute package name convention. So this experience shows that sometimes a design cannot be judged to be right or wrong only along technical criteria, but that it matters how it fits with the pre-existing conventions and expectations of its users. Scala 2.7's nested packages are a simple design that works well on .Net. It did not work so well on the JVM because conventions there were different. In 2.8 we fixed the problem by giving a new twist to the interpretation of package clauses.

Version Numbers

In fact, strictly speaking, the package clause change would have forced us to bump up the major version number of the language. Scala's unofficial versioning scheme is as follows:

Backwards incompatible changes in the language demand a new major version number, like the change from Scala 1 to Scala 2 in March 2005. The major version number would also have to be changed if there are sufficient additions to turn Scala into what's effectively a different language. So Scala 3 should mean something substantially different from Scala 2.
New language features and backwards incompatible changes in the libraries demand a point number increment, e.g., from Scala 2.7 to 2.8. Backwards incompatible changes in libraries should be phased in slowly with -deprecation warnings or, where this is not possible, should be accompanied by -Xmigration warnings.
Backwards compatible library additions and bug fixes only demand a change in the third version number, e.g., Scala 2.8.0 to 2.8.1.

This versioning scheme is simply what we tried to follow so far; it does not have any official status, nor should it be assumed that it will always be like this.

The change in the meaning of package clauses is not backwards compatible and therefore would have demanded a major version jump to 3.0. However, the change came late in the run-up to 2.8, and was prompted by urgent requests from our users. At the time we had to choose among three possibilities:

The first option was to go directly from 2.7 to 3.0. However, there were already books in print that talked about Scala 2.8. Readers of those books would have been unnecessarily confused. Besides, it's not nice to end up with a phantom version of a language that people talked a lot about but that never saw the light of day.

The second option was to wait with the change to package clauses until after 2.8. This was unattractive precisely because the change can break existing code. If you need to do the change, the sooner you do it the better. The longer you wait, the more code there will be that might break.

The third option was to make an exception to our numbering scheme, and that's the one we picked in the end.

Acknowledgements

The change in the meaning of qualified package clauses was repeatedly proposed by Jorge Ortiz. I thank him for his persistence. The new chained package clause syntax was originally proposed by David MacIver.

Resources

Martin Odersky is coauthor of Programming in Scala:
http://www.artima.com/shop/programming_in_scala

The Scala programming language website is at:
http://www.scala-lang.org

The Scala 2.8 release notes are at:
http://www.scala-lang.org/node/7009

The Scaladoc collections API is at:
http://lampwww.epfl.ch/~odersky/whatsnew/collections-api/collections.html

About the author

Martin Odersky is the creator of the Scala language. As a professor at EPFL in Lausanne, Switzerland, he works on programming languages, more specifically languages for object-oriented and functional programming. His research thesis is that the two paradigms are two sides of the same coin, to be unified as much as possible. To prove this, he has experimented with a number of language designs, from Pizza to GJ to Functional Nets. He has also influenced the development of Java as a co-designer of Java generics and as the original author of the current javac reference compiler. Since 2001 he has concentrated on designing, implementing, and refining the Scala programming language.