Summary
JavaSpaces, a Java-centric implementation of the tuplespace paradigm, has been available as part of Sun's Jini Starter Kit for many years. Yet few developers are familiar with JavaSpaces' simplicity and elegance in building distributed systems, a gap Joseph Ottinger aims to close in an introductory JavaSpaces tutorial on TheServerSide.com.
JavaSpaces, a Java-centric implementation of David Gelernter's tuplespaces concept, has been part of the Jini infrastructure since 1999. Yet, according to Joseph Ottinger, JavaSpaces is,
One of those technologies that programmers know is out there, but haven't actually used enough to say they understand what it's for or what it can do for them.
To rectify this lack of familiarity, Ottinger penned an introductory JavaSpaces tutorial for TheServerSide.com, Using JavaSpaces. The first part of his tutorial explains the JavaSpace interface, the focal point of all JavaSpaces operations:
There are five basic operations associated with JavaSpaces. They are:
Read an entry matching a template, leaving it in the JavaSpace
Take an entry matching a template from the space, removing it from the JavaSpace
Write an entry into the JavaSpace
Register a callback for events in the JavaSpace
Issue an event in the JavaSpace
Of these, each can be associated with a Transaction, and the read/take operations can also block, wait for matching entries, or readIfExists(), which will block if there's an entry that might become available to satisfy the operation; otherwise it returns... takeIfExists() is also in the API...
if JavaSpace05 is used, the take() operation can populate a list of entries matching the template, for bulk processing of entries in the JavaSpace, as can the write operation ... for writing sets of data. JavaSpace05 also includes a way to get references to sets of information (the contents() method.)
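To make the read, take, and write operations concrete, here is a minimal in-memory sketch in plain Java. It is not the net.jini.space.JavaSpace API: the TinySpace class, its Task entry type, and the non-blocking read() and take() methods are hypothetical names invented for illustration, and transactions, leases, and event notification are left out. The only idea it borrows from the description above is template matching, where a null field in the template acts as a wildcard.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy in-memory stand-in for a space. NOT the real JavaSpace interface. */
class TinySpace {
    /** Hypothetical entry type; a null field in a template matches anything. */
    static final class Task {
        public String kind;
        public Integer payload;

        Task(String kind, Integer payload) {
            this.kind = kind;
            this.payload = payload;
        }
    }

    private final List<Task> entries = new ArrayList<>();

    /** Write: store a copy of the entry in the space. */
    synchronized void write(Task t) {
        entries.add(new Task(t.kind, t.payload));
    }

    /** Read: return a copy of the first matching entry, leaving it in the space (null if none). */
    synchronized Task read(Task template) {
        for (Task t : entries) {
            if (matches(template, t)) {
                return new Task(t.kind, t.payload);
            }
        }
        return null;
    }

    /** Take: remove and return the first matching entry (null if none). */
    synchronized Task take(Task template) {
        Iterator<Task> it = entries.iterator();
        while (it.hasNext()) {
            Task t = it.next();
            if (matches(template, t)) {
                it.remove();
                return t;
            }
        }
        return null;
    }

    /** A template matches when every non-null template field equals the entry's field. */
    private static boolean matches(Task template, Task candidate) {
        return (template.kind == null || template.kind.equals(candidate.kind))
            && (template.payload == null || template.payload.equals(candidate.payload));
    }

    public static void main(String[] args) {
        TinySpace space = new TinySpace();
        space.write(new Task("square", 7));
        Task template = new Task("square", null);          // any "square" task
        System.out.println(space.read(template) != null);  // entry is still in the space
        System.out.println(space.take(template) != null);  // entry is removed here
        System.out.println(space.read(template) == null);  // nothing left to match
    }
}
```

The difference between read and take is the whole point of the sketch: a read leaves the entry available to other clients, while a take gives the caller exclusive ownership of it.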
Ottinger notes in his article that the default JavaSpaces implementation bundled in Sun's Jini Starter Kit, which Sun recently open-sourced as the Apache River project, does not provide the easiest out-of-the-box experience. Third-party open- and closed-source alternatives are available:
In addition to the Jini starter kit not being documented very well, Outrigger isn't especially good either, so an easy and convenient addition to the Jini download is Blitz. Blitz replaces Outrigger as well as provides some easy startup scripts and some diagnostic tools. A commercial implementation with the ability to distribute the Space is GigaSpaces.
The article's concluding segment is a tutorial on developing a generic compute server in JavaSpaces.
I've been working with JavaSpaces on and off for the last 12 months and I simply love it. We have been designing an OLAP solution with JavaSpaces. We had to deal with large volume of data and high performance expectations and come out with a powerful solution where parallel computations are the core of the architecture... These days I could not think of solving the same business problem in any other way.
Why do you think JavaSpaces is not more popular as a tool for building high-performance enterprise apps?
Is the concept of tuple spaces of any use to real-world applications? The idea of mutual exclusion creates the same problems as row locking in databases... in a distributed environment, programs might or might not run, so grabbing and releasing objects is a dangerous idea.
That concept is just like any other: you can put it to good use. At least, that's what Lorenzo Puccetti and I think. Both of us seem to have successfully developed a distributed system where a single application could not have timely coped with the amount of data and calculations. I am afraid, I can't make much of your last sentences. Locking problems exist in any application that accesses a shared pool of resources - and there are ways to solve these problems. Also, I don't quite follow the causality of "programs might or might not run, so grabbing and releasing objects is a dangerous idea".
I believe that the lack of advertising for JINI, and the promotion of J2EE instead, has had much to do with it taking a niche role.
This gave rise to a "vicious circle": soon every business needed to have a (full-blown, heavyweight) J2EE server, for that was the buzz of the moment. To maintain projects that were built on that technology they needed ... J2EE developers who would build J2EE applications.
A side effect was the scarcity of online documentation. There have really only been two in-depth resources on how to develop applications in JINI: Jan Newmarch and Artima.
> That concept is just like any other: you can put it to good use. At least, that's what Lorenzo Puccetti and I think. Both of us seem to have successfully developed a distributed system where a single application could not have timely coped with the amount of data and calculations.
> I am afraid, I can't make much of your last sentences. Locking problems exist in any application that accesses a shared pool of resources - and there are ways to solve these problems.
> Also, I don't quite follow the causality of "programs might or might not run, so grabbing and releasing objects is a dangerous idea".
I meant that applications may crash while having locked a resource.
Personally I am against locking. I prefer optimistic updates using the versioning trick. It allows users to check out the same data, and whoever gets to check the data in first is the happy one; the other one is forced to reload the form. And this scheme works quite well in distributed apps, without needs for timeouts and wakeup threads.
Of course it all depends on the requirements. Sometimes the above solution might not be the good one.
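The "versioning trick" mentioned in this comment is commonly called optimistic locking. A minimal sketch of the idea, assuming an in-memory map in place of a database, follows. The VersionedStore class and its checkOut() and checkIn() methods are hypothetical names chosen to mirror the commenter's wording; they do not come from JavaSpaces or from any particular persistence library.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy optimistic-locking store: an update succeeds only against the version that was read. */
class VersionedStore {
    /** Immutable snapshot of a record together with its version number. */
    static final class Versioned {
        final String value;
        final int version;

        Versioned(String value, int version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> rows = new HashMap<>();

    /** Create a record at version 0. */
    synchronized void insert(String key, String value) {
        rows.put(key, new Versioned(value, 0));
    }

    /** Check out: hand back the current snapshot without locking anything. */
    synchronized Versioned checkOut(String key) {
        return rows.get(key);
    }

    /**
     * Check in: apply the update only if the record is still at expectedVersion.
     * Returns false when someone else got there first; the caller must reload.
     */
    synchronized boolean checkIn(String key, int expectedVersion, String newValue) {
        Versioned current = rows.get(key);
        if (current == null || current.version != expectedVersion) {
            return false;
        }
        rows.put(key, new Versioned(newValue, expectedVersion + 1));
        return true;
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        store.insert("room-10", "free");
        Versioned alice = store.checkOut("room-10");
        Versioned bob = store.checkOut("room-10");
        System.out.println(store.checkIn("room-10", alice.version, "booked by alice"));
        System.out.println(store.checkIn("room-10", bob.version, "booked by bob"));
    }
}
```

No record is held between check-out and check-in, so a client that crashes in between leaves nothing to clean up. The cost is that a losing client's work is discarded and must be redone.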
Well, this is exactly what transactions can be used for. If a result does not come back in time (transaction times out), the associated task becomes available again and a different agent can handle the task.
The approach you describe is - forgive my being forthright - totally unacceptable. ;-)
1. Of course, you also have housekeeping. The consumer of the result must somehow assert that the result has not already been provided. So you haven't gained anything on that front.
2. But worse, you tie up your most valuable resources: your agents! If the task is visible for every agent, they will grab it. Only to produce the same result over and over again, where instead they could have completed other tasks.
As such I am tempted to argue that this approach is never "the good one".
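The transaction-timeout behaviour described at the start of this comment, where a task taken by an agent becomes available again if the agent does not finish in time, can be sketched roughly as follows. This is a toy model, not Jini's transaction or lease API: the LeasedTaskPool class and its methods are hypothetical, tasks are plain strings assumed to be unique, and the clock is passed in explicitly so the behaviour is deterministic. A real implementation would also tie the commit to the identity of the transaction that took the task, which this sketch does not.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Toy task pool where a taken task reappears if it is not committed before its lease expires. */
class LeasedTaskPool {
    private final Deque<String> available = new ArrayDeque<>();
    private final Map<String, Long> inFlight = new HashMap<>(); // task -> lease expiry time

    /** Make a task visible to agents. */
    synchronized void write(String task) {
        available.addLast(task);
    }

    /** Take a task under a lease; returns null if no task is currently available. */
    synchronized String take(long now, long leaseMillis) {
        reclaimExpired(now);
        String task = available.pollFirst();
        if (task != null) {
            inFlight.put(task, now + leaseMillis);
        }
        return task;
    }

    /** Commit: succeeds only while the lease is still valid; the task is then gone for good. */
    synchronized boolean commit(String task, long now) {
        Long expiry = inFlight.get(task);
        if (expiry == null || now >= expiry) {
            return false; // unknown task, or the lease has run out
        }
        inFlight.remove(task);
        return true;
    }

    /** Return every task whose lease has run out to the available queue. */
    private void reclaimExpired(long now) {
        Iterator<Map.Entry<String, Long>> it = inFlight.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now >= e.getValue()) {
                available.addLast(e.getKey());
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        LeasedTaskPool pool = new LeasedTaskPool();
        pool.write("t1");
        System.out.println(pool.take(0, 100));   // first agent takes the task
        System.out.println(pool.take(50, 100));  // hidden from other agents while leased
        System.out.println(pool.take(100, 100)); // lease expired, task is handed out again
    }
}
```

While the lease is valid the task is invisible to other agents, which addresses the objection about agents duplicating work; only a crashed or stalled agent causes the task to be handed out a second time.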
> Ah, I see.
> Well, this is exactly what transactions can be used for. If a result does not come back in time (transaction times out), the associated task becomes available again and a different agent can handle the task.
I suppose that the server is responsible for managing the transaction timeout. But I am talking about the client crashing, not the server. For every check out the client does, the server must keep a timeout event. This work is unnecessary with the mechanism I described.
> The approach you describe is - forgive my being forthright - totally unacceptable. ;-)
Are we actually on the same track here? I am talking about a persistence layer using JavaSpaces, i.e. DB access.
> 1. Of course, you also have housekeeping. The consumer of the result must somehow assert that the result has not already been provided. So you haven't gained anything on that front.
No, no bookkeeping is required. The whole mechanism works at the transaction level: a record update can take place only when the right version of the data is in the database. The library that gathers all updated data into a transaction automatically inserts the appropriate SQL for version checking. If one version check fails, the transaction is rolled back.
On the other hand, if the DB records were locked, there might be big problems.
> 2. But worse, you tie up your most valuable resources: your agents! If the task is visible for every agent, they will grab it. Only to produce the same result over and over again, where instead they could have completed other tasks.
Not really. In web applications, you are not sure when a job will be completed. For example, if I have a hotel with 10 rooms, and 9 rooms are already booked, I do not want a client to lock the free room record, because I do not know when and if the client will submit the booking. The first person who submits the booking will be the one that will get the room.
> As such I am tempted to argue that this approach is never "the good one".
It has worked amazingly well so far.
But perhaps we are talking about different things here...
> Not really. In web applications, you are not sure when a job will be completed. For example, if I have a hotel with 10 rooms, and 9 rooms are already booked, I do not want a client to lock the free room record, because I do not know when and if the client will submit the booking. The first person who submits the booking will be the one that will get the room.
That is not what JavaSpaces transactions are for. They are conceptually the equivalent of Java's "synchronized" capability. You would never hold a "synchronized" lock on a hotel room object in a Java web application between two web pages, right? Well, you'd never hold a JavaSpaces transaction open either.