Summary
In a recent tutorial, Lorenzo Puccetti proposes a technique for event dispatching in Java, and provides an overview of concurrency-based event processing.
Event-driven programming and concurrency are not always thought of as relating to the same problem, namely, scaling up system throughput. In a recent tutorial, Towards a Timely, Well-Balanced, Event-Driven Architecture, Lorenzo Puccetti illustrates how event-based programming and concurrency can be used to effectively scale a system. Noting that:
If we could assume that any type of event is consumed as fast as it can be produced, there would be no need to separate "event submission" from "event handling." In fact, it is the failure to recognize this need that leads to bottlenecks that become visible only when the system is tested under a heavy load.
In order to provide such a scalable solution, Puccetti proposes:
An implementation [that] translates calls from dispatchers into timely, well-regulated calls to the receiver, so that if the dispatcher's push method is called several times in quick succession, it translates into just one call to the receiver. We call this implementation the TimelyDispatcher...
An embedded thread is responsible for monitoring the activity of the push method. When the method is invoked, we register the time of the call and update a thread-safe queue. We use a LinkedBlockingQueue so that as items are added to it, the embedded thread can take them off and store them into a private (thread-confined) list. This list's lifecycle is add-add-add-... , snapshot (toArray), and empty...
The embedded thread encapsulates the logic that determines when to invoke the receiver's received method by checking on the current nano time against the values stored in the atomic timeLastAction and thread-private timeLastNotification variables.
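The mechanism described above can be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the Receiver interface, the quiet-window flush policy, and all class details here are assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical receiver of batched events (the article's "receiver").
interface Receiver<T> {
    void received(List<T> events);
}

// Minimal sketch of a coalescing dispatcher: many push() calls in quick
// succession are translated into one call to the receiver. The timing
// logic is simplified compared to the article's description.
class TimelyDispatcher<T> {
    private final LinkedBlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final Receiver<T> receiver;
    private final long quietNanos;   // flush when no new event arrives for this long
    private final Thread worker;     // the article's "embedded thread"

    TimelyDispatcher(Receiver<T> receiver, long quietNanos) {
        this.receiver = receiver;
        this.quietNanos = quietNanos;
        this.worker = new Thread(this::drainLoop, "timely-dispatcher");
        worker.setDaemon(true);
        worker.start();
    }

    public void push(T event) {
        queue.add(event);            // non-blocking for the caller
    }

    private void drainLoop() {
        List<T> pending = new ArrayList<>();   // thread-confined list
        while (!Thread.currentThread().isInterrupted()) {
            try {
                T first = queue.take();        // block until at least one event
                pending.add(first);
                T next;
                // Keep draining while events keep arriving within the quiet window.
                while ((next = queue.poll(quietNanos, TimeUnit.NANOSECONDS)) != null) {
                    pending.add(next);
                }
                receiver.received(new ArrayList<>(pending)); // one call per burst
                pending.clear();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```

The key design point is that push() never blocks the dispatcher-side caller; only the embedded thread ever touches the thread-confined list, so no locking is needed beyond what LinkedBlockingQueue provides.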
Puccetti notes this event-based approach differs from a more traditional consumer-producer pattern:
In the Producer/Consumer pattern, events represent self-contained units of work (tasks) that can be tackled by any consumer. In its simplest forms, scalability is achieved by adding more consumers. The developer's task is therefore to identify highly homogeneous, parallelizable activities. On the other hand, in the dispatcher/receiver framework, events represent the trigger for generally expensive activities to be performed. The time to perform those activities cannot be reduced and the separation between event production and consumption is there to prevent requests from executing more activities than the JVM would be able to sustain. In a way, the two approaches are orthogonal to each other, and in some circumstances it is also possible to combine them together--for example, the same class could act as receiver and producer at the same time.
What do you think of Puccetti's approach to concurrent event handling?
I think that the approach is good and can improve scaling.
In a wider context, I think that an object-oriented programming language that uses solely the Actor model could be a win for everyone: parallelization would be automatic (whatever could be parallelized, it would be parallelized), and scaling could be achieved by simply adding more CPUs.
> there is no need to go any further. systems dealing with one event at a time are entirely broken, no matter how you decorate this fact.
Can you explain how you get that this only handles one event at a time by looking at an interface? And what would the interface look like if it did handle more than one event at a time?
> Can you explain how you get that this only handles one event at a time by looking at an interface? And what would the interface look like if it did handle more than one event at a time?
How many times will this method be called to push 10 events? What about 100, 1000, or 1000000 events per second? How many objects would be created? This approach doesn't scale: either interface method lookup time or object creation (new T()) becomes a bottleneck.

I would take this a bit more seriously had the example used the following signature:

public void push(EventIterator it)

Such an approach puts fewer restrictions on the implementation and allows batched event processing.
> How many times will this method be called to push 10 events? What about 100, 1000, or 1000000 events per second? How many objects would be created? This approach doesn't scale: either interface method lookup time or object creation (new T()) becomes a bottleneck.
>
> I would take this a bit more seriously had the example used the following signature:
>
> public void push(EventIterator it)
Wouldn't that push the 'interface method lookup time' (which is measured in fractions of nanoseconds) to the calls to the event iterator? Wouldn't it also make the iterator the bottleneck?
An iterator isn't a bad idea but I don't see how it resolves the issues you claim it does.
I guess you should have read the article a bit more carefully before writing such a bold statement.
The author clearly states:
<I> Attentive readers may argue that these interfaces do not capture the problem in its entirety. In particular, the Dispatcher interface could provide a few other signatures of the push method to allow atomic "push multiple" functionality as well as lifecycle functionality. For instance, termination could be managed via a few shutdown methods as in the java.util.concurrent.ExecutorService interface. In the interest of simplicity, we omitted this aspect. </I>
Why are so many people in I.T. so quick to jump to conclusions?
Using ArrayList is surely much better than using Collection. But passing events in an ArrayList or Collection creates a hidden restriction: you have to actually create those events. Passing an iterator has the following advantages:

1. You don't have to create objects (suppose the array of events can be represented as an array of ints internally, and the iterator just provides access to event fields and advances to the next event via a next() call). This is a considerable benefit since, in most cases I have dealt with, a particular consumer doesn't need all the events pushed to it, but rather uses only the subset of events it is interested in.

2. Passing an ArrayList also means that the consumer may do anything to the passed collection: save references to particular events, modify the collection, and so on.

So passing an iterator has much more impact than it looks at first sight.
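The flyweight idea in point 1 could be sketched like this. The packed-int layout and the IntEventIterator type are hypothetical, invented for illustration; they are not from the article or this thread.

```java
// Hypothetical flyweight iterator: events live in a packed int array
// (two ints per event here: type and payload), and one reusable cursor
// exposes them one at a time without allocating an object per event.
class IntEventIterator {
    private final int[] data;   // packed events: [type0, payload0, type1, payload1, ...]
    private int pos = -2;       // cursor; next() advances by one event (2 ints)

    IntEventIterator(int[] data) {
        this.data = data;
    }

    // Advance to the next event; returns false when the data is exhausted.
    boolean next() {
        pos += 2;
        return pos < data.length;
    }

    int type()    { return data[pos]; }      // field access, no Event object created
    int payload() { return data[pos + 1]; }
}
```

A consumer that only cares about one event type reads just the fields it needs and skips the rest, so no per-event objects are ever allocated on either side.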
Actually, using concrete implementations such as ArrayList in an interface isn't such a good idea. Using List or Collection is much better. This lets the caller decide what kind of list or collection suits the need best.
> Actually, using concrete implementations such as ArrayList in an interface isn't such a good idea. Using List or Collection is much better. This lets the caller decide what kind of list or collection suits the need best.
Unfortunately it's not always true... While I agree with the program-against-interfaces concept, you have to be aware that interface calls are significantly slower in Java than class calls. It has a huge impact if you have to deal with millions of such calls per second.

Passing an ArrayList does two things: 1. it's a concrete class, so calls to its methods are faster; 2. you don't need to create an Iterator to iterate over it, but can use a plain old for loop instead (avoiding creation of an unnecessary object).

As I wrote, it has some drawbacks: 1. you have to create all the events placed in the list, while passing an iterator allows you to use the flyweight pattern or its variants; 2. it does allow the consumer (potentially) to modify your list or keep references to particular events.

I have to say that it's only an issue if you have to deal with a really huge volume of events, like market data or the like.

If you think of an API as the restrictions put on the implementor side, you quickly arrive at the iterator approach as the only solution :). The only functionality (restriction) exposed by an iterator is, well, the ability to iterate over a bunch of data :))
> The author clearly states:
>
> <I>Attentive readers may argue that these interfaces do not capture the problem in its entirety. In particular, the Dispatcher interface could provide a few other signatures of the push method to allow atomic "push multiple" functionality as well as lifecycle functionality. For instance, termination could be managed via a few shutdown methods as in the java.util.concurrent.ExecutorService interface. In the interest of simplicity, we omitted this aspect.</I>
>
> Why are so many people in I.T. so quick to jump to conclusions?
Well, probably because the code fragment was highlighted in bold, unlike the rest of the article, which is lengthy enough. :)
> Unfortunately it's not always true... While I agree with the program-against-interfaces concept, you have to be aware that interface calls are significantly slower in Java than class calls. It has a huge impact if you have to deal with millions of such calls per second.
Can you back this up with anything? It might have been true 10 years ago but I don't think it is now.
Significant? I ran the following test with a parameter of 100 executions, and I got a per-method-call difference of less than one nanosecond. That means that for a million calls, you are talking about a difference of less than one millisecond.
I don't see that as very significant.
java version "1.6.0"
Java(TM) SE Runtime Environment (build 1.6.0-b105)
Java HotSpot(TM) Client VM (build 1.6.0-b105, mixed mode)
AMD Turion 64 Mobile 1.99 GHz Windows XP SP2
import java.io.*;
import java.util.*;

public class Test
{
    public static final int INNER = 1000000;
    static final Time time = new ConcreteTime();
    static final ConcreteTime concrete = new ConcreteTime();

    public static void main(String[] args) throws IOException
    {
        long timeA = 0;
        long timeB = 0;
        int tests = Integer.valueOf(args[0]);
        for (int i = 0; i < tests; i++) {
            if (i % 2 == 0) {
                timeA += test('A');
                timeB += test('B');
            } else {
                timeB += test('B');
                timeA += test('A');
            }
        }
        System.err.println("A: " + timeA);
        System.err.println("B: " + timeB);
        long diff = timeA - timeB;
        double diffPerTest = ((double) diff) / (INNER * tests);
        System.err.println("difference: " + diff + " nanoseconds");
        System.err.println("difference per test: " + diffPerTest + " nanoseconds");
        System.err.println("percent: " + (((double) diff / (double) timeB)) * 100);
    }

    static long test(char test) throws IOException {
        long time = System.nanoTime();
        long l = 0;
        for (int j = 0; j < INNER; j++) {
            switch (test) {
            case 'A':
                l += testA();
                break;
            case 'B':
                l += testB();
                break;
            }
        }
        System.out.println(l);
        time = System.nanoTime() - time;
        if (time < 10) throw new RuntimeException("INNER too small");
        return time;
    }

    public static long testA() throws IOException
    {
        return time.getTime();      // call through the Time interface
    }

    public static long testB() throws IOException
    {
        return concrete.getTime();  // call through the concrete class
    }
}

interface Time
{
    long getTime();
}

class ConcreteTime implements Time
{
    public long getTime()
    {
        return System.currentTimeMillis();
    }
}
> > Unfortunately it's not always true... While I agree with the program-against-interfaces concept, you have to be aware that interface calls are significantly slower in Java than class calls. It has a huge impact if you have to deal with millions of such calls per second.
>
> Can you back this up with anything? It might have been true 10 years ago but I don't think it is now.
Your example is nice :) but were you aware that if you have just *one* implementation of an interface loaded by the VM, HotSpot simply inlines it? It does so because there is a *significant* difference between class method lookup and interface method lookup.

And let's not start yet another microbenchmark flame war, ok? ;)
Adding another implementation did not really change the results. I'm pretty familiar with the effect of interfaces in Java, and there is no noticeable performance impact.
And let's not post unsubstantiated claims, ok? ;)
Feel free to post an example of the performance problems that interfaces cause or some reference that backs up your claim.
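For what it's worth, "adding another implementation" might look like the sketch below; the NanoTime class and the callThrough helper are made up for illustration. With two receiver types flowing through the same call site, HotSpot can no longer devirtualize the interface call to a single target, which is the scenario the inlining objection is about.

```java
// Hypothetical setup: a second Time implementation makes the interface
// call site polymorphic, so the JIT cannot assume a single receiver type.
interface Time {
    long getTime();
}

class ConcreteTime implements Time {
    public long getTime() { return System.currentTimeMillis(); }
}

// Second implementation: once instances of both classes pass through the
// same call site, single-target inlining no longer applies there.
class NanoTime implements Time {
    public long getTime() { return System.nanoTime(); }
}

class PolymorphicCall {
    // A single shared call site through the interface.
    static long callThrough(Time t) {
        return t.getTime();
    }

    public static void main(String[] args) {
        Time[] times = { new ConcreteTime(), new NanoTime() };
        long acc = 0;
        for (int i = 0; i < 1_000_000; i++) {
            acc += callThrough(times[i & 1]);  // alternate the receiver type
        }
        System.out.println(acc > 0);
    }
}
```

Whether the resulting difference is measurable is exactly what the benchmark above is arguing about; this sketch only shows how to remove the single-implementation shortcut from the picture.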
Flat View: This topic has 24 replies on 2 pages