Re: How to think like a Jinition
Posted: Jan 14, 2005 10:14 AM
> The problems I have run into with Jini (really JavaSpace) > are: > > 1. Jini 2.0 is too hard to setup and get the sample > projects running. Yet, I dont want to stay on Jini 1.2.1. > THat has really taken alot of the wind out of my sails. It > just is too hard for me to work with. Jini 1.2.1 works > great.
Humm. I do see a lot of questions on the Jini-users list about getting 2.0 running. But, I am not sure what the total issue is. I use the ServiceStarter framework in 2.0 and it works just fine for me. I can trivially (for me) add or remove services from the configuration, switch out endpoints etc.
Tell me more about the specific issues that you've encountered. The community, overall, wants to help make Jini easier to get started with. But, there are numerous concepts that one has to get their thinking in line with.
> 2. JavaSpaces is dog slow. Have you ever ran real > benchmarks inserting and taking empty enities? It really > sucks. It tried v2.0, its still orders of magnitute slower > than using sockets or a database. And much slower than > EJB.
I think that this is an issue that a lot of people run up against when they try to use a JavaSpace like an RDBMS. A JavaSpace is really meant to be used as a short term asynchronous buffer. It is not designed to deal with an extraordinary number of Entry objects. Also, the bandwidth through the space needs to be metered. But, this is true of any distributed system.
In J2EE and other single server approaches, the load on the server will limit the speed at which additional work can enter. So, there is a somewhat natural balancing of load that can occur. Though, you can still have poor balancing due to cheap inbound work compared to expensive outbound work.
Using acknowledgements and other out of band information to control the rate that work is presented to the system can provide a much better opportunity for success in distributed system design.
Any distributed system really must be capable of balancing its activities against available resources in a deterministic way. There are some very direct ways to deny access to an overloaded part of the system and to stage data in a way that throttles the system.
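As a minimal sketch of acknowledgement-driven throttling, assume a sender that is only allowed a fixed number of unacknowledged items in flight. The class name `AckWindow` and its methods are hypothetical, made up for illustration, and are not part of Jini or of my broker.

```java
import java.util.concurrent.Semaphore;

/** Hypothetical credit-based throttle: one credit is used per item sent and returned per acknowledgement. */
final class AckWindow {
    private final Semaphore credits;

    AckWindow(int maxUnacknowledged) {
        this.credits = new Semaphore(maxUnacknowledged);
    }

    /** Returns true if the caller may send now; false means the receiver is saturated and access is denied. */
    boolean tryBeginSend() {
        return credits.tryAcquire();
    }

    /** Call once per acknowledgement received; this frees one credit for the next send. */
    void onAck() {
        credits.release();
    }

    /** Number of sends that would currently be admitted. */
    int availableCredits() {
        return credits.availablePermits();
    }
}
```

A sender that gets `false` from `tryBeginSend()` can stage the item somewhere (for example on disk) instead of pushing more work at a part of the system that is already overloaded.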
In my broker, I use a disk based queue balanced by a per queue thread limit. What this allows is for the normal latency in the system to throttle my access to the destination servers. But, increasing the thread count can also increase system throughput by letting additional threads do processing on either end of the network while data is in transit.
So, on any system where the broker is installed, we only adjust the number of threads processing a queue to balance the system. Bursty inbound data is archived to disk. Stuck data that can't be delivered is left on disk and placed into an alternate part of the queue, since it is now old data.
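To make the idea concrete, here is a rough sketch of a disk based queue with a per queue thread limit. It is not the code from my broker; `SpoolQueue`, `Deliverer`, and the `pending`/`stuck` directory layout are all assumptions made for this example, and a real implementation would need retries, logging, and recovery of files left over from a previous run.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

/** Hypothetical delivery hook; returns false when the destination cannot accept the item. */
interface Deliverer {
    boolean deliver(byte[] payload);
}

/** Sketch of a disk backed queue whose only tuning knob is its worker thread count. */
final class SpoolQueue {
    private final Path pending;
    private final Path stuck;
    private final ExecutorService workers;
    private final Deliverer deliverer;
    private final AtomicLong sequence = new AtomicLong();

    SpoolQueue(Path root, int threadLimit, Deliverer deliverer) {
        try {
            this.pending = Files.createDirectories(root.resolve("pending"));
            this.stuck = Files.createDirectories(root.resolve("stuck"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        this.workers = Executors.newFixedThreadPool(threadLimit);
        this.deliverer = deliverer;
    }

    /** Archive the item to disk first, then let a worker pick it up when one is free. */
    void enqueue(byte[] payload) {
        Path file = pending.resolve(sequence.incrementAndGet() + ".item");
        try {
            Files.write(file, payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        workers.execute(() -> process(file));
    }

    private void process(Path file) {
        try {
            if (deliverer.deliver(Files.readAllBytes(file))) {
                Files.delete(file);
            } else {
                // Undeliverable data stays on disk, set aside from the live queue.
                Files.move(file, stuck.resolve(file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            }
        } catch (IOException e) {
            // The file is left where it is; a real broker would log and retry.
        }
    }

    /** Stop accepting work and wait for queued deliveries to finish. */
    boolean drain(long timeoutMillis) {
        workers.shutdown();
        try {
            return workers.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    long pendingCount() { return count(pending); }

    long stuckCount() { return count(stuck); }

    private static long count(Path dir) {
        try (Stream<Path> files = Files.list(dir)) {
            return files.count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point of the design is that `threadLimit` is the only thing you tune: a burst of `enqueue` calls lands on disk immediately, while the fixed pool bounds how hard the destination is hit.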
> 3. To effectively use JavaSpaces in a "real" production > application you need a lot more real pieces. Operations > (Help Desk) needs to be able to support Jini. So, YOU the > e developer end up writing all these little utilities to > monitor the server or to start restart services. In Jini > 2.0 the StartService utility is gone and replaced with > scripts. The problem I ran into was I had to write so much > utility tools to monitor, I couldnt make progress on the > application.
Are you aware of the configuration based startup and the ServiceStarter framework? This makes it possible to start all of the JTSK services from a single batch file/shell script that doesn't have to change much. The hello world examples in Jini 2.0 are pretty detailed about how to use this stuff.
> 4. Jini is so unreliable. Try running the sample ray > trace app and kill the Space (service) and restart. THe > clients (workers) freak out and the dreaded stack trace > exceptions appear. The workers need to be restarted most > of the time.
It's not unreliable. You have to understand what it is trying to accomplish. It is not trying to provide a 10ms recovery of errors. It is trying to guarantee that long term failures are noticed and reported by the service not being visible in the LUS any longer. What is at issue is that the Sun team's experience with long term network behavior flies in the face of what people want to experience based on their own local network experiences where CPUs are never overloaded and bandwidth is always available.
In large scale server environments, you have to be willing to wait for real errors. The TCP spec suggests times as long as 5 minutes for timeouts in connection shutdown. These timeouts are part of the history of TCP and its development in the time of 0.75 MIPS machines running over 2400-9600 baud dialup SLIP links, or at best over DS0 or DS1 trunks.
So, while they are very conservative, they will eventually provide the desired actions. The timers are configurable for most of the places where this occurs. There was an announcement on the porter.jini.org list today saying that they were going to issue a lease cancel() call when a service entry is removed from the LUS. Currently, the service entry is removed from the LUS, but the lease is not cancelled. Thus, this lease time is the upper limit on the time to failure detection.
> 5. Jini is unreliable (pt 2). Run the ray tracing app, > sometimes not all the blocks make it. (COuld be a bug in > the demo app). Makes you look like crap in front of your > bosses to see those missing pieces when you showcase the > app. I work in insurance and if these are claims.. oops > theres a check that didnt get sent out.
In any data routing application, you have to provide appropriate behavior for any guarantees the system must make. Requiring intervening nodes of a routing system to guarantee anything more than "Best Effort" raises the bar tremendously. If the data is that important, and you can't deal with out of order, out of date, sometime later delivery, then you need transactions from end to end.
On top of that, you'll need data archiving and backup on every node along the path. I've found that it is much easier to only have the originator and the final receiver aware of these needs, and they should have a mechanism to let the receiver know that data is missing and be able to identify to the sender which piece needs to be resent.
The result of such duplication is a form of offsite backup as well as a much easier way to reprocess or recover data after the fact.
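A minimal sketch of the receiver side of that arrangement, assuming the originator numbers its items 1, 2, 3, and so on. `GapTracker` is a hypothetical name for this example; how the list of missing numbers gets back to the sender is left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical receiver-side tracker: the sender numbers items from 1, the receiver reports the gaps. */
final class GapTracker {
    private final Set<Long> received = new HashSet<>();

    /** Record an arrival; duplicates and out of order arrivals are both harmless. */
    void onReceive(long sequenceNumber) {
        received.add(sequenceNumber);
    }

    /** Sequence numbers in 1..highestSent that never arrived and should be requested again. */
    List<Long> missing(long highestSent) {
        List<Long> gaps = new ArrayList<>();
        for (long n = 1; n <= highestSent; n++) {
            if (!received.contains(n)) {
                gaps.add(n);
            }
        }
        return gaps;
    }
}
```

With this, the intermediate nodes can stay "Best Effort": only the two ends keep the data and the bookkeeping needed to notice a hole and ask for a resend.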
> 6. Bottlenecks. The server the Space is running on only > goes so fast serving objects, you need mulitple servers > for producers or the space is starved and multiple > consumers to pull results from Space, bottleneck. > So we are up to: 2 CPUS for workers, 2 LUS/Space, 1 CPU > producer, 1 CPU consumer = 6 computers .. you need 2 > spaces fr performance and failover for a real app.
If your problem is simple enough to solve on a single CPU, then you probably don't need to build a distributed system. However if your problem is open ended in complexity and load, then it is probably a good idea to start with a distributed system.
> 7. Errors. They are really hard to debug. You must really > understand RemoteObject and RMI errors and know what is > going on. Not fun when you are in downtime. All apps have > downtime and severe crash.
The configuration of your system, your network, and your administrative practices all play a part in how you recover from a crash. RemoteObject and RMI semantics can be difficult to understand if you don't know what you should be learning about. Here's my list of things that I think are important to know about RMI as implemented in Jini 2.0 and JERI in particular.
1. Remote objects have to have code somewhere to be deserialized and used. The codebase setting is what tells a receiving JVM where it might find any needed class file definitions. You have to understand ClassLoader and you have to understand why looking up the classloader chain first is bad for downloaded code. The PreferredClassLoader lets you say when you want the classloader to look up the chain first so that you can eliminate the ClassCastExceptions that happen when a client defines a class that is also resolved to the jar file that the service comes from.
2. Proxy classes are automatically generated for you, BUT proxy classes are not automatically used to replace Remote implementing classes. You have to explicitly export everything. This might seem like a real pain, but it actually provides a great opportunity for centralizing exporting. I use the factory pattern in my applications and create an export managing class that has static methods for exporting. This allows me to handle specific classes differently than I may have originally thought I would, because all the exports happen in the same place instead of being strewn throughout the code. Also, consider passing another parameter to the exporter that identifies the specific source of the export, in case you want to 'count' exports to know how many of which type of remote reference you are generating. That can help you manage the load on a server by restricting clients.
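Here is a rough sketch of the export managing class described in point 2. To keep it self-contained, `ProxyExporter` is a local stand-in interface invented for this example; in a Jini 2.0 application the exporter would come from the Jini libraries (for example via configuration) and its export call can throw a checked exception, which this sketch ignores. `ExportManager`, `setExporterFactory`, and the `source` tag are likewise hypothetical names.

```java
import java.rmi.Remote;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Local stand-in for a real exporter; NOT the Jini interface, just enough for the sketch. */
interface ProxyExporter {
    Remote export(Remote impl);
}

/** Hypothetical single place where every export happens, with per-source counts. */
final class ExportManager {
    private static final Map<String, AtomicInteger> COUNTS = new ConcurrentHashMap<>();
    private static volatile Supplier<ProxyExporter> exporterFactory;

    private ExportManager() {}

    /** Decide in one place how exporters are created (endpoint, constraints, and so on). */
    static void setExporterFactory(Supplier<ProxyExporter> factory) {
        exporterFactory = factory;
    }

    /** Export impl with a fresh exporter and count the export against the given source tag. */
    static Remote export(Remote impl, String source) {
        Supplier<ProxyExporter> factory = exporterFactory;
        if (factory == null) {
            throw new IllegalStateException("no exporter factory configured");
        }
        Remote proxy = factory.get().export(impl);
        COUNTS.computeIfAbsent(source, k -> new AtomicInteger()).incrementAndGet();
        return proxy;
    }

    /** How many exports have been made on behalf of the given source so far. */
    static int exportCount(String source) {
        AtomicInteger n = COUNTS.get(source);
        return n == null ? 0 : n.get();
    }
}
```

Because every export funnels through `ExportManager.export`, a per-source limit or a different exporter for one particular class is a change in one method rather than a hunt through the whole code base.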
There are some other things about RMI such as distributed garbage collection which you need to understand too. For instance in the above point #2, clients that come and go probably need to use DGC to reclaim local resources, or you'll need to use another leasing mechanism to trigger the unexport of the object and reclamation of other resources. This is not a simple subject.
> 8. Politics. I cant even talk my boss into letting me code > Java anymore. Management purposely bought 12 .Net licenses > even though we have 2 good Java devs and 0 .Net > experience. The reason: Operations cannot support Java > servers and "they will learn how to operate Microsoft > software easier"
The limitations of the .NET platform are the simplifying factors. Of course there are many issues about .NET which make it seem exciting. The WS-* standards make you feel vendor independent because you can use SOAP everywhere. But, that's not where your application is adding value. Your time investment and monetary commitment is to the programming platform. So, if you find you don't like that platform and can't continue down that path, then you are no better off than if you had chosen Jini or CORBA or something else. You still get to recreate your value added software to run on a different platform.
> I am even Java certified (SCJP) and boss could care less. > At my Java developer job at Lexis Nexis, I tried to push > Jini and its too much to explain when everyone knows the > J2EE model. I tried to push Jini for HPC while in in grad > school.. "Too slow.. must use MPI" and have no apps to > show because MPI is so hard to program. Java can be hard > to sell is my point.
The primary issue for me is that people think the only value added for a piece of software is the speed at which it solves a problem. More efficient software keeps them from having to buy bigger hardware. But, J2EE and .NET both sway you towards single large processors because distributing is harder from the perspective of having to know IP addresses and having to implement leasing and having to do a lot of things that Jini already makes available in the APIs.
What Jini allows is for smaller, existing hardware to be put to use to solve problems in ways that might not be as obvious, nor as tractable as a single processor solution. But, in the end, because you distributed, you get automatic fail over potential as part of your architecture. You can just have a disk and a machine and be back up and running without anyone having to know that you had to switch machines out. They'll see the failure if you're running on a single machine, but if not...
> 9. In the enterprise environment, the data is already in > a DBMS. Its better just to use stored procs. Good SPs are > a second or so (sub second at Lexis), and database > operations are basically constant time O(1). Properly > indexed tables act like a big hashtable. To run a INSERT > INTO (SELECT ...) to dump rows in a table takes about the > same amount of time versus Jini O(n) + all space overhead. > Space gets much slower as fills. DBs cache the data. You > have to write your own cache mechanism in JavaSpace.
DO NOT USE A JAVASPACE IF YOU NEED AN RDBMS. This is an important thing to recognize. RDBMS like functionality cannot be recreated in JavaSpaces without some compromise in performance due to the way that JavaSpaces accesses data. There are, of course, situations where JavaSpaces can provide the exact behaviour that an RDBMS can provide, but that is a limited set of behaviours.
> 10. "Just buy a faster box" Idiots I work with are not > distributed computing fans and believe in buying a faster > dell server.
See my discussion on point 8 above.
> 11. Jini strengths are not to be for database access. It > was designed for autonomous networking. As Keith Edwards > said in Core Jini, "plug it in and it just works..like a > phone"
The concepts of the Jini system are very much usable in a database access application. J2EE was initially targeted at database access mixed with web server applications. Over time, new functionality in the WS-* standards will drive it to contain more and more support for other things. So, the appserver vendors will be selling these great big, huge complicated systems that will be expensive to develop and support. Their prices will go up. The value added features will further anchor customers with a particular vendor.
Open source solutions such as JBoss and Apache tools can mitigate some of the expense. But, I contend that the adoption of these technologies, blindly, will limit the types of problems that you can solve, and will delay introduction of or movement to architectures such as Jini that will trivially solve integration and distributed computing.
> 12. Lack of data. To get to the point where you need that > much CPU to process something, its hard to find enough raw > data that is CPU bound, not DB bound, easily parallizible. > We process some large jobs at umr.com (insurance company) > so it runs for awhile. I dont have any processes that are > 100% CPU all the time. If so, I really would use something > faster such as C based MPI or Java + sockets or J2EE > clustering.
I'm not sure what issue you are talking about here.
> I could ramble on.. thanks for reading my verbage. YMMV, I > really tried to get Jini in-house and failed.
Please feel free to follow up here and/or on the Jini-users list. I think this is a great set of issues, so I am going to post this back on Jini-users to see if there are others who will share their experiences and comments.