This post originated from an RSS feed registered with Agile Buzz
by James Robertson.
Original Post: Opentalk Load Balancing
Feed Title: Cincom Smalltalk Blog - Smalltalk with Rants
Feed URL: http://www.cincomsmalltalk.com/rssBlog/rssBlogView.xml
Feed Description: James Robertson comments on Cincom Smalltalk, the Smalltalk development community, and IT trends and issues in general.
Len Lutomski talks about load balancing and Opentalk. Starting out with a demo - the demo is running:
On a single host, multiple images (simplicity of showing it, not a limitation)
Test Coordinator running distributed SUnits
A Server Monitor that shows the message backlog
There are a number of load balancing algorithms implemented. One thing to notice at the start is that things begin a little slowly as the various images negotiate their roles and set up. For the same reason, shutdown can take a few moments as well. The first demo is a sequential round robin example. The monitor is showing an even load across all the systems as they run - exactly the way you would expect a round robin to run. Question from Len - which will distribute load better, a random round robin or a sequential one? Something for us to think about as he gets ready to run. The sequential run distributed pretty evenly. There are some useful statistics available from the clients and servers - how long (in aggregate) requests were queued up, etc. Random round robin does worse, as the random assignment sometimes tosses "too much" load at a single server. We are actually seeing that in the monitor.
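To make the sequential versus random comparison concrete, here is a small sketch of my own (not Opentalk code, and not what Len ran). It assumes every request costs the same and simply counts how many requests each server receives; the function names are invented for illustration.

```python
import random
from collections import Counter

def sequential_round_robin(num_requests, num_servers):
    """Assign request i to server i mod num_servers."""
    return Counter(i % num_servers for i in range(num_requests))

def random_assignment(num_requests, num_servers, rng):
    """Assign each request to a server chosen uniformly at random."""
    return Counter(rng.randrange(num_servers) for _ in range(num_requests))

def spread(counts, num_servers):
    """Difference between the busiest and the least busy server."""
    loads = [counts.get(s, 0) for s in range(num_servers)]
    return max(loads) - min(loads)

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only so runs are repeatable
    seq = sequential_round_robin(1000, 4)
    rnd = random_assignment(1000, 4, rng)
    print("sequential spread:", spread(seq, 4))
    print("random spread:    ", spread(rnd, 4))
```

With equal-cost requests the sequential spread can never exceed one request, while the random spread is usually larger, which matches what the monitor showed.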
Next demonstration - the "least loaded" scheme. As opposed to the round robin scheme, we need stats from each server about its load so that we can make a determination. "Least Loaded" tends to keep sending requests to one server until the data about server load is updated. So a given server stays in the "give me load" state past the point where it is no longer the least loaded system - we are making determinations based on old data. I've seen this personally - I implemented (with Sean Glazier) a simple load balancer for VisualWave years ago - the customer wanted least loaded, and we saw exactly what was just demonstrated. They ended up using our simpler round robin scheme :)
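The stale data problem is easy to reproduce in a toy model. This is a sketch under my own assumptions, not the Opentalk implementation: requests never complete (load only accumulates), and the balancer's view of the servers is refreshed every `refresh_every` requests, a parameter I made up for the illustration.

```python
def run_least_loaded(num_requests, num_servers, refresh_every):
    """Return per-server request counts when the balancer's view of
    server load is refreshed only every `refresh_every` requests."""
    actual = [0] * num_servers      # true load on each server
    snapshot = list(actual)         # balancer's possibly stale view
    for i in range(num_requests):
        if i % refresh_every == 0:
            snapshot = list(actual)  # a fresh load report arrives
        # Pick the "least loaded" server according to the old data.
        target = snapshot.index(min(snapshot))
        actual[target] += 1
    return actual

if __name__ == "__main__":
    print(run_least_loaded(12, 3, refresh_every=1))  # fresh data
    print(run_least_loaded(12, 3, refresh_every=6))  # stale data
```

With three servers and twelve requests, a view refreshed on every request gives `[4, 4, 4]`, while a view refreshed every six requests gives `[6, 6, 0]`: one server keeps getting picked until the next report comes in.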
Now Len's showing us a different request stream. The difference here is that the time to run the requests varies - instead of each being the same, each varies between short and long. Using sequential round robin, this stream tends to load down individual servers. Now he's going to use the same stream of data, but with a random round robin. This is tending to work out better in terms of overall load - individual servers are not getting bogged down. Now on to "Least Loaded" - this works out better than the round robin; under this test, least loaded does ok. (In reality, you should run lots of tests.) Now a round robin where the interval to switch servers has been set longer. This does tend to load a given server every so often, but in general distributes the load pretty well. By playing with the time interval to switch servers, you can get better (or worse) performance. There are various schemes for Least Loaded that back off - for instance:
take in load data
on request, pick lower half of group (on load)
assign request randomly in that group
Works better than least loaded by itself, but does add complexity. Not only does it work, but it's fun to play with! Now back to the presentation
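The three-step back-off scheme listed above is small enough to sketch. This is my own illustration, not Opentalk code; the `loads` dictionary (server name to last reported load) and the function name are assumptions.

```python
import random

def pick_from_lower_half(loads, rng):
    """Pick a server at random from the less loaded half of the group.

    loads: dict mapping server name -> last reported load.
    """
    ranked = sorted(loads, key=loads.get)   # least loaded first
    half = max(1, len(ranked) // 2)         # always keep at least one
    return rng.choice(ranked[:half])

if __name__ == "__main__":
    rng = random.Random(7)
    reported = {"a": 5, "b": 1, "c": 9, "d": 2}
    print([pick_from_lower_half(reported, rng) for _ in range(8)])
```

Because the pick is spread over several servers, a stale load report no longer funnels every request to a single machine.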
Load balancing is for synchronous, connection-oriented protocols. For multicast, you do something else. Some routers do load balancing - they are faster, but usually less flexible than software balancers. What about things (like web requests) that need to "pin" a server for session-oriented requests? You have the balancer work only on initial requests, not on the rest. Why do we do this at all? We do this in order to minimize the time that clients wait on a request. There is overhead to this.
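Here is one way the "balance only the initial request" idea might look, as a rough sketch rather than anything from the talk. The class name, the session ids, and the choice of round robin for first requests are all my assumptions.

```python
import itertools

class PinningBalancer:
    """Balance only the first request of each session; later requests
    from the same session go back to the pinned server."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)  # round robin for new sessions
        self._pinned = {}                       # session id -> server

    def route(self, session_id):
        if session_id not in self._pinned:
            # Initial request: this is the only time the policy runs.
            self._pinned[session_id] = next(self._cycle)
        return self._pinned[session_id]

if __name__ == "__main__":
    balancer = PinningBalancer(["s1", "s2"])
    for sid in ["x", "y", "x", "z"]:
        print(sid, "->", balancer.route(sid))
```

The trade-off is that a long-lived session stays on its server even if that server becomes the busiest one.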
You really need to look at how long your requests tend to take in order to set up an optimal number of servers to use in a balancer - bearing in mind that the balancing itself has overhead (more network traffic, processing time for the decisions, etc):
Redirection overhead
Messaging overhead
Hindered server and balancer responsiveness
Opentalk balancer - released as of 7.3. Four parcels, a framework built on Opentalk. And it's documented! There's an API defined for configuring and creating instances of components. Basic support for multiple balancer architecture, and generic clients and servers. Working on the failover and state replication issues.
Supported architectures - Again, not dealing with multiple balancers. The basic assumption is that all servers will deal with the same kinds of requests, and that all servers are the same kinds of systems. You should not use an architecture with sessions or transactions - i.e., any situation where you expect any maintenance of client specific, server side state between consecutive requests. So:
balancer always, no loads
The client reference wrapper only holds a balancer reference and goes to the balancer for each request. The balancer has a static distribution policy. Servers do not have co-located load monitors (not needed). A simple strategy: expensive in redirection overhead, cheap in messaging. Works well with no sessions or transactions.
balancer rarely, no loads
balancer always, with loads
balancer rarely, with loads
no balancer
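For the first architecture in the list ("balancer always, no loads"), the shape is roughly the following. This is a sketch in Python with invented class names, not the Opentalk API: the client wrapper holds only a balancer reference, and the balancer applies a static policy (sequential round robin here) without any load reports from the servers.

```python
class Server:
    def __init__(self, name):
        self.name = name

    def handle(self, request):
        return f"{self.name} handled {request}"

class Balancer:
    """Static distribution policy; no load data is collected."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._next = 0
        self.redirects = 0      # counts the redirection overhead

    def pick_server(self):
        server = self._servers[self._next]
        self._next = (self._next + 1) % len(self._servers)
        self.redirects += 1
        return server

class ClientReference:
    """Wrapper that holds only a balancer reference."""

    def __init__(self, balancer):
        self._balancer = balancer

    def send(self, request):
        # Every request goes through the balancer first.
        return self._balancer.pick_server().handle(request)

if __name__ == "__main__":
    balancer = Balancer([Server("s1"), Server("s2")])
    client = ClientReference(balancer)
    for n in range(3):
        print(client.send(f"req{n}"))
    print("redirects:", balancer.redirects)
```

One redirect per request is the cost of this design; nothing flows from the servers back to the balancer, which is why it is cheap in messaging.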
And we are out of time! Lots of good info on balancer strategies.