Development on the Java Filesystem API has been making steady progress in the context of JSR 203, More New I/O APIs for the Java™ Platform (NIO.2). While many of the new APIs are already available in JDK 6, the entire NIO.2 feature set will be a key update of the upcoming JDK 7 release. In a recent Sun Developer Network article, The Java NIO.2 File System in JDK 7, Janice J. Heiss and Sharon Zakhour take a look at the most important new JSR 203 features from the perspective of interacting with the filesystem:
The Java I/O File API, as it was originally created, presented challenges for developers. It was not initially written to be extended. Many of the methods were created without exceptions, so they failed to throw I/O exceptions, which resulted in considerable frustration for developers. Applications often failed during file deletion, leaving developers confused as to why no useful error message had been generated. The rename method behaved inconsistently across volumes and file systems: Some were easily renamed, but others were not. Methods for gaining simultaneous metadata about files were inefficient. And developers wanted greater access to metadata such as file permissions, as well as more efficient file copy support and file change notification... Developers also requested the ability to develop their own file system implementations by, for example, keeping a pseudofile system in memory, or by formatting files as zip files.
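The deletion complaint can be illustrated with a minimal sketch, assuming the java.nio.file.Files class as it shipped in JDK 7 (the draft API discussed in the article may have used different names). The class name DeleteErrorSketch and the file name are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Paths;

class DeleteErrorSketch {
    // Legacy API: failure is reported only as a boolean, with no reason attached.
    static boolean legacyDelete(String name) {
        return new File(name).delete();
    }

    // NIO.2 API: failure surfaces as an exception that identifies the cause.
    static String nioDelete(String name) {
        try {
            Files.delete(Paths.get(name));
            return "deleted";
        } catch (NoSuchFileException e) {
            return "NoSuchFileException: " + e.getFile();
        } catch (IOException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String missing = "no-such-file.tmp"; // hypothetical name, assumed not to exist
        System.out.println(legacyDelete(missing));
        System.out.println(nioDelete(missing));
    }
}
```

With the older call, the caller only learns that the delete did not happen; with the newer one, the exception type says why.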
Most of these requests were addressed in JSR 203, including providing a more manageable way to access files. Instead of File, a developer would now access files through the Path API:
A path is a file reference that locates a file using a system-dependent path. In other words, it is a path to a file in the file system. The file itself is not required to exist.
The new file access API also provides more user-friendly operations on files:
Syntactic types of operations allow developers to, among other things, manipulate paths, get a parent directory, extract path components, and iterate over components of the path. A second type of file operation uses the path to locate a file in order to perform an operation, like create files, open a file for I/O, delete it, create a directory, and so on...
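The two kinds of operations might look as follows. This is a sketch that assumes the method names from the final JDK 7 release (Paths.get, getParent, getFileName, Files.createFile, Files.delete); the class name PathOperationsSketch and the example paths are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class PathOperationsSketch {
    // Syntactic: iterate over the name elements without touching the disk.
    static List<String> components(Path path) {
        List<String> names = new ArrayList<>();
        for (Path element : path) {
            names.add(element.toString());
        }
        return names;
    }

    // File access: create a file, check that it exists, then delete it.
    static boolean createThenDelete(Path dir, String name) throws IOException {
        Path file = Files.createFile(dir.resolve(name));
        boolean existed = Files.exists(file);
        Files.delete(file);
        return existed && Files.notExists(file);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical path; the file itself is not required to exist.
        Path path = Paths.get("projects", "demo", "notes.txt");
        System.out.println(path.getParent());
        System.out.println(path.getFileName());
        System.out.println(components(path));

        Path tmp = Files.createTempDirectory("nio2-sketch");
        System.out.println(createThenDelete(tmp, "scratch.txt"));
        Files.delete(tmp);
    }
}
```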
Another problem with the old filesystem API had to do with listing directory contents:
In the java.io.File API created for Java version 1.0, the list and listFiles methods returned an array of the names of files and directories. These methods did not scale to large directories, so in listing a large directory over a network, the list method might hang for long periods. If an application was serving multiple clients or getting directory lists, the virtual machine (VM) might run out of memory...
In Java NIO.2, opening a directory returns an iterator, which allows for greater scaling. A directory stream is an object used to iterate over the entries in a directory, returning one entry for each file it contains. When the iteration is complete, the developer must close the stream by invoking its close method.
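A short sketch of the iterator-style listing, assuming the DirectoryStream interface and Files.newDirectoryStream as released in JDK 7. The try-with-resources statement (also a JDK 7 feature) takes care of the required close call. The class name DirectoryListingSketch is hypothetical.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class DirectoryListingSketch {
    // Collects entry names one at a time instead of receiving a full array up front.
    static List<String> entryNames(Path dir) throws IOException {
        List<String> names = new ArrayList<>();
        // The stream is closed automatically when the try block exits.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                names.add(entry.getFileName().toString());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Lists the current working directory; iteration order is not guaranteed.
        for (String name : entryNames(Paths.get("."))) {
            System.out.println(name);
        }
    }
}
```

The list is built here only to keep the example small; code that needs to scale would process each entry inside the loop rather than collect them all.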
Performing operations on files is made easier by the file visitor API:
If you provide a starting point and a file visitor, it will invoke various methods on the file visitor as it walks through the files in the file tree. We expect people to use this if they are developing a recursive copy, a recursive move, a recursive delete, or a recursive operation that sets permissions or performs another operation on each of the files.
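As an illustration of the recursive delete mentioned in the quote, here is a minimal sketch built on Files.walkFileTree and SimpleFileVisitor from the final JDK 7 API. The class and method names (RecursiveDeleteSketch, deleteTree) are hypothetical, and the sketch does not follow symbolic links, which is the default for walkFileTree.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class RecursiveDeleteSketch {
    // Deletes every file first, then each directory once its contents are gone.
    static void deleteTree(Path start) throws IOException {
        Files.walkFileTree(start, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc; // iteration of this directory failed
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Builds a small throwaway tree in the temp directory, then removes it.
        Path root = Files.createTempDirectory("walk-sketch");
        Files.createFile(Files.createDirectories(root.resolve("a/b")).resolve("leaf.txt"));
        deleteTree(root);
        System.out.println(Files.exists(root));
    }
}
```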
Symbolic links are also finally supported on operating systems that offer them:
The java.nio.file API has full support for symbolic links based on the long-standing semantics of UNIX symbolic links -- something that Java developers have long requested. This works on Windows Vista and newer Windows operating systems as well. By default, symbolic links are followed with a couple of exceptions, such as move and delete. In a few cases, the application can specify an option to follow or not follow links. This is important when reading file attributes or walking file trees, for example.
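The follow/no-follow choice when reading attributes can be sketched like this, assuming LinkOption.NOFOLLOW_LINKS and Files.readAttributes from the released JDK 7 API. The class name SymlinkSketch is hypothetical, and creating a link may fail on platforms or accounts without symbolic link support, so the demo guards that step.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

class SymlinkSketch {
    // Reads attributes of the path itself; a symbolic link is NOT followed.
    static String describe(Path path) throws IOException {
        BasicFileAttributes attrs = Files.readAttributes(
                path, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
        if (attrs.isSymbolicLink()) return "symbolic link";
        if (attrs.isDirectory()) return "directory";
        if (attrs.isRegularFile()) return "regular file";
        return "other";
    }

    // Default behaviour: the link is followed and the target is examined.
    static boolean targetIsRegularFile(Path path) {
        return Files.isRegularFile(path);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("link-sketch");
        Path target = Files.createFile(dir.resolve("target.txt"));
        Path link = dir.resolve("link");
        try {
            Files.createSymbolicLink(link, target);
            System.out.println(describe(link));
            System.out.println(targetIsRegularFile(link));
        } catch (UnsupportedOperationException | IOException e) {
            System.out.println("symbolic links not available here: " + e);
        }
    }
}
```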
Another new feature gives developers the ability to listen for changes occurring on the file system. That alleviates the need for file system polling:
The main goal here is to help with performance issues in applications that are currently forced to poll the file system. This midlevel API is relatively easy to customize and build on. Developers can use it as is or create a high-level API on top of it to suit their needs... Most file system implementations have native support for file change notification -- the WatchService API takes advantage of this where available. But when a file system does not support this mechanism, the watch service will poll the file system, waiting for events.
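A rough sketch of the watch service in use, assuming the WatchService, WatchKey and StandardWatchEventKinds types from the final JDK 7 release. The class name WatchSketch is hypothetical, and the method creates the file itself purely so that the example produces an event. On file systems without native notification the event may take several seconds to arrive.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

class WatchSketch {
    // Registers dir for creation events, creates a file, and reports what was seen.
    static List<String> watchCreations(Path dir, String newFileName, long timeoutSeconds)
            throws IOException, InterruptedException {
        List<String> names = new ArrayList<>();
        try (WatchService watcher = dir.getFileSystem().newWatchService()) {
            dir.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);

            // Demo only: trigger an event ourselves after registration.
            Files.createFile(dir.resolve(newFileName));

            // Block until a key is signalled or the timeout elapses; no busy polling.
            WatchKey key = watcher.poll(timeoutSeconds, TimeUnit.SECONDS);
            if (key != null) {
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.ENTRY_CREATE) {
                        // For ENTRY_CREATE the context is the new entry's relative path.
                        names.add(event.context().toString());
                    }
                }
                key.reset(); // re-arm the key so later events are delivered
            }
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("watch-sketch");
        System.out.println(watchCreations(dir, "hello.txt", 30));
    }
}
```

A long-running application would normally call take() in a loop on a dedicated thread rather than poll once with a timeout.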
What do you think of the NIO.2 changes for filesystem access?
Sun should, at a minimum, create an interface based on the java.nio.file.Path class that can be used to mock out or extend the file system easily, and encourage people to write code against that interface, rather than the Path object.
I emailed the JSR lead a couple of years ago on this, and he assured me that the entire rewrite would be done with an eye on interfaces. Looks like that hasn't really happened, unfortunately.
Well, I spoke a little too soon. Turns out Path is abstract, and all its members are abstract too. I still wish it was an interface, so you didn't have to commit to an inheritance hierarchy when implementing it, but it's still an infinite improvement over File.
A Path is either absolute or relative. An absolute path is complete in that it does not need to be combined with another path in order to locate a file. A relative path must be combined with other information in order to locate a file.
I'm not entirely sure I'm happy with this definition. "Relative path" means relative-to-something, and in this case that something is a file or directory F. One can find paths relative to F using the operations parent() and join(...). If F is a file and not a directory, then join(x) is always empty for each path component x. A root node is characterized by parent() being empty. With parent() and join() one can reach all paths in a connected component of a file system. "Absolute paths" are defined intrinsically as paths relative to root nodes and are just special cases.
Syntactically, parent operations are often represented by leading dots. So .F is the parent of F and (.F).G would be a child of F's parent. One could also write this as:
F.parent().join(G)
If one has an abstract and conceptually correct notion of a path one can map it onto package/class import paths, file-system paths, paths in jar-files etc. They are all different representations of the same underlying path concept.
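For comparison, the parent()/join() notation used in this comment corresponds roughly to getParent() and resolve() in the API as it eventually shipped in JDK 7. The following sketch assumes those released method names; PathAlgebraSketch and siblingOf are hypothetical names chosen for the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class PathAlgebraSketch {
    // The comment's F.parent().join(G), written with the released method names.
    static Path siblingOf(Path f, String g) {
        Path parent = f.getParent();
        // A single-element relative path has no parent, so G stands on its own.
        return parent == null ? Paths.get(g) : parent.resolve(g);
    }

    public static void main(String[] args) {
        Path relative = Paths.get("docs", "readme.txt"); // hypothetical path
        Path absolute = relative.toAbsolutePath();       // combined with the working directory

        System.out.println(relative.isAbsolute());
        System.out.println(absolute.isAbsolute());

        System.out.println(siblingOf(relative, "license.txt"));
        // resolveSibling is the built-in shorthand for the same operation.
        System.out.println(relative.resolveSibling("license.txt"));

        // A root has no parent: getParent() returns null rather than an empty path.
        System.out.println(absolute.getRoot().getParent());
    }
}
```

One difference from the model described above is that the released API signals "no parent" with null instead of an empty path.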