Summary
Java is in need of a sort of "Grand Unified Theory" for distributed applications. EJB, Jini, JXTA, and JMS have more in common than just the letter J, though you wouldn't know it from reading the specs.
So what do you mean Grand Unified Theory?
Last week, for the first time since 1998, I made it out to San Francisco for JavaONE. The intervening years have seen the crest of the tech wave that had only been building back in 1997 when I attended my first J1. The changed atmosphere of this edition produced a vague feeling that there was some thread running through the whole thing that I just wasn't picking out. Then, while listening intently to a conference session concerning a JSR with which I was totally unfamiliar, it dawned on me that my unfamiliarity with the topic at hand was itself the key factor.
Back in '98, I felt that I understood Java as a technology. There was no session that year that I could attend where I didn't have at least a passing familiarity with the topic and some exposure to the API involved - and I didn't consider myself at all special in that regard. JavaDoc and O'Reilly books on Java were just the stuff you read in your spare time if you were a geek. But things have changed since then.
There are now so many pieces of the Java specification that it is entirely possible to work heavily with one API and still be completely incapable of explaining how to relate that API to another one which has some overlapping features. Don't believe me? Try this - in one nice, neat list describe the decision tree you'd use to determine whether to house an object in a distributed application in a) an EJB container, b) a JXTA peer group, c) a Jini service or d) broadcast it in a JMS topic. I've certainly tried it and the result somehow left me feeling stupid.
My thesis for the next few posts to this blog will be that Java is in need of a Grand Unified Theory for distributed applications that would enable me to write just such a list.
Let me assure you that I don't have the elements of the list predetermined and that I'd greatly appreciate feedback from the community on what I'm missing. (I should note that I derived a great deal of guilty satisfaction recently from hearing some of the people who wrote the APIs mentioned above say that they really weren't able to make such a list either.)
Sounds pretty vague to me
This particular musing is not a hazy abstraction for me. It is directly related to problems I face in a current project and that I've simply got to solve in the coming months.
I'm building out a compute farm for a coarse-grained, numerically intensive problem that has got to scale up dramatically. (I need at least a 50X increase in throughput over the current multi-threaded app.) A distributed technology that allows me to quickly add new hardware at near-linear cost per compute cycle is the only way I see that we can make that happen. So I hope to be mixing blades, 8-way symmetric servers and maybe some desktops into the computational mix, scattering those resources across several different locations and trying to do it all on the cheap.
I've already made certain technology choices, so I can at least begin to fill out my decision tree. Configuration, control and logging of the entire beast is going to be vested in objects living in a J2EE container, so that I can take advantage of JMX. Moving the computational tasks around is going to be a Jini/JavaSpaces responsibility. (Marrying those two technologies in a reasonable way is the subject of much hacking at present.) JXTA offers some interesting notions that I intend to explore for sharing spaces across sites. And finally, JMS is of interest because there could be real-time data which every computation will need to be aware of. How all of this eventually will shake out is still somewhat mysterious to me, though.
Okay, so it's one interesting facet of a tiny grain of sand on the computing beach
I've read Jim Waldo's posts to this site with interest, because I agree with him that a) this sort of scalable, distributed application is going to be a Very Important Trend in computing and b) this will require some form of mobile code. I've certainly mistaken the problem I happen to be working on for the Gulf Stream of computing currents before, so I could be way off here - but if I am, this is still the most fun I've had writing code in a while.
Hopefully I can get some more details and thoughts posted in the next week or two. In the meantime, I look forward to hearing the thoughts of others on unifying Java distributed technologies into a broader framework.
> In one nice, neat list describe the decision tree you'd use to determine whether to house an object in a distributed application in
Try this decision procedure: "Does it lend itself to a lightweight, agile development process, or does it force me to deal with a bloated API or framework?"
Vapourware? I don't think so. It looks VERY exciting. Enterprise development is going to be a different ballgame in a few years, and one I'm looking forward to playing.
From http://research.sun.com/projects/ace/: By removing the distribution details from the application specification language, and then adding those details back automatically when the application is deployed, we allow the application writer to concentrate on domain details specific to the application's purpose, instead of worrying about middleware APIs, remote object invocations, and other details of the implementation "stack".
Oh good. More "don't worry, we'll shield you from the actual details so you can script our framework." No thank you.
My intent actually was not to elaborate a vision for the future but to figure out how to use what's out there now. For example, J2EE, Jini, and JXTA all have different incompatible discovery mechanisms, each designed to deal with a kind of distributed computing problem. When should I use one over the other?
I have some THEORIES on this that I will discuss over time. That statement lent itself to a cute title. Probably too cute.
Perhaps taking a look at the language and seeing whether it stands on its own as a language would be more appropriate than thinking of it as scripting a framework.
As far as I'm concerned, dealing with EJB and 20 other APIs that are way out of the domain logic falls into the category of incidental complexity and is unproductive. I don't want to create all the overhead code for EJBs or solve vendor differences when programming in SQL or solve relational mapping crap or figure out how to distribute code. (Though I do all of that and it keeps me employed.)
All that incidental work is a waste of money, and a waste of my life. These problems should be solved once. Period. End of story. Someone is going to fix this popularly -- perhaps the Ace project, perhaps another project -- and the savings in labor will be so dramatic that the world will have to take notice.
I think ACE is quite unrelated to the question at hand of needing a decision tree to choose among distributed computing technologies. It's more along the lines of MDA and generative programming, it seems.
If you want to use a dynamic set of networked computers as a unified computational resource, I'd concentrate on one of the Grid initiatives. Sun's got something called gridware (http://gridengine.sunsource.net/), and IBM seems to be getting into grid computing quite heavily: http://www.ibm.com/developerworks/grid/
You've gotten my interest sufficiently piqued that I'm looking forward to your future articles. I know they say you can't rush genius, but ... bring them on!
> Perhaps taking a look at the language and seeing whether it stands on its own as a language would be more appropriate than thinking of it as scripting a framework.
Haven't heard back yet on the beta program, so I haven't been able to download anything yet. I'm willing to give it a shot, but given that this is from the same company that gave us J2EE (not to mention Java, though the language at least has some worthwhile qualities) I'm not wildly optimistic. Plus, Ace is built as an add-on module for Sun ONE Studio 4 update 1, Enterprise Edition for Java, so is it really a new language?
Sounds like you are making similar design choices to my gameGO project.
The difference is that my system is set up for multiplayer mobile games on J2ME devices, and thus my computational loads come from running controller clones of games on the server side and delivering multiplayer services via Jini.
Have you come up with a mobile math set of rules in code that allows the load to be moved automatically no matter what stubs you call? According to another article on Artima, that is what Macromedia uses in JRun, via Jini, to cluster the load.
And what performance improvements are you finding from using Jini/JavaSpaces as the cluster, compared to the JavaGroups clustering API used in JBoss?
This subject is probably the most important one since the creation of Java and it deserves a long-term forum, active for many years to come.
My first comment: the poor state of affairs for distributed apps is not due to the nature of the Java language, the APIs, or the specs; rather, it reflects the current state of the art across computer technology as a whole. We are just at the beginning of the age of *the system is a network* and we have a long way to go. I have been working on tools for improving this for at least 13 years, since I created a generic graphical event-driven modelling and development system, and now I am writing an open-source experimental API that uses TCP sockets and XML.
We need new ways of representing events and interacting parallel (distributed) processes. The current linear text-based languages are just not enough.
I see 3D modelling and dynamic simulation of systems in development as candidates for future solutions.
> Sounds like you are making similar design choices to my gameGO project.
I looked at your site. That statement seems conceivable. I'm going out of my way not to claim any dramatic original insights here. :^)
> The difference is that my system is set up for multiplayer mobile games on J2ME devices, and thus my computational loads come from running controller clones of games on the server side and delivering multiplayer services via Jini.
I have a similar problem. One of the issues I keep running into is: "Where does one insert HTML/XML generation in a distributed application?" There aren't good answers to this one in Jini, JXTA or JMS. EJB deals with this nicely, but doesn't offer the massive parallelism I'm looking for.
> Have you come up with a mobile math set of rules in code that allows the load to be moved automatically no matter what stubs you call? According to another article on Artima, that is what Macromedia uses in JRun, via Jini, to cluster the load.
Somewhat. At present we're going with a space-based (as in JavaSpaces) mechanism and having the space workers only request work if they have resources available to complete it.
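The pull model described here can be sketched in plain Java. This is only an illustration under stated assumptions, not the project's code and not the JavaSpaces API: a java.util.concurrent BlockingQueue stands in for the space, and the names PullWorkerSketch, Task, Worker, compute and runFarm are all hypothetical. Each worker asks the shared space for a task only after it has claimed a free local slot, so work is never pushed onto a box that cannot complete it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class PullWorkerSketch {

    /** Hypothetical unit of work; a real entry would carry the computation's inputs. */
    static final class Task {
        final int id;
        Task(int id) { this.id = id; }
    }

    /** A worker that requests work only when it has a free local slot. */
    static final class Worker implements Runnable {
        private final BlockingQueue<Task> space;  // stand-in for the shared space
        private final Semaphore slots;            // this worker's available capacity
        private final AtomicInteger completed;    // shared counter of finished tasks

        Worker(BlockingQueue<Task> space, int capacity, AtomicInteger completed) {
            this.space = space;
            this.slots = new Semaphore(capacity);
            this.completed = completed;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    slots.acquire();  // claim local resources before asking for work
                    try {
                        Task task = space.poll(100, TimeUnit.MILLISECONDS);
                        if (task == null) {
                            return;   // nothing left in the space, so stop
                        }
                        compute(task);
                        completed.incrementAndGet();
                    } finally {
                        slots.release();  // free the slot for the next request
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    /** Placeholder for the numerically intensive job. */
    static double compute(Task task) {
        double sum = 0.0;
        for (int i = 1; i <= 10_000; i++) {
            sum += Math.sqrt(i + task.id);
        }
        return sum;
    }

    /** Fills the space with tasks, runs the workers, and returns how many tasks completed. */
    static int runFarm(int taskCount, int workerCount) {
        BlockingQueue<Task> space = new LinkedBlockingQueue<>();
        for (int i = 0; i < taskCount; i++) {
            space.add(new Task(i));
        }
        AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[workerCount];
        for (int i = 0; i < workerCount; i++) {
            threads[i] = new Thread(new Worker(space, 1, completed));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed " + runFarm(20, 4) + " tasks");
    }
}
```

Because the workers pull rather than being assigned work, adding another box just means starting another worker against the same space; nothing in the sketch has to know how many workers exist.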
> And what performance improvements are you finding from using Jini/JavaSpaces as the cluster, compared to the JavaGroups clustering API used in JBoss?
Haven't tried clustering 50+ boxes on JBoss. With Jini, that's easy (to configure, anyway).
I plan to write more detailed thoughts tomorrow night sitting on a plane and post those sometime Thursday.
As the evolution of software engineering continues, it seems natural that we converge towards more abstract, high-level mechanisms to develop systems.
Java has advanced a long way from assembly. J2EE is yet another evolutionary step, which attempts to provide a higher-level framework in which to build distributed systems (more specifically, client-server and web-based ones).
What I find interesting is that generic distributed system architectures have been attempted for years, long before Java and J2EE, and the problem remains unsolved. J2EE is the culmination of a punctuated evolution to satisfy the needs of the web server model.
JXTA and Jini seem more appropriate for generic distributed systems.
I think that mobile agent systems are, theoretically, the most appropriate architecture for the "system = network" vision. Such an architecture is yet to be realised in a global sense.
I feel like a lot of confusion arises from the fact that the problem statement for distributed computing varies from one set of specifications to another.
Some specification sets cover distributed computing but it's not their core concern.
J2EE, for example, claims to be targeted towards distributed computing and yet, for me, seems more about integration of systems - witness the toolbox of specifications. Sure, remoteness is part of it, but it's not the be-all and end-all.
RMI is a distributed computing platform in some ways.
Jini is another, which can use RMI (but doesn't have to) and goes much further towards helping a programmer handle all those nasty failure cases which seem to be brushed under the carpet by so many other "distributed computing solutions".