Summary
In a recent blog post, Terracotta's Ari Zilka explains how visualization can help developers better understand concurrency problems.
Most developers are familiar with Java concurrency concepts, such as threads and synchronization. At the same time, many developers are not familiar enough with the daily use of these concepts, and the APIs that implement these concepts, to avoid concurrency errors, writes Ari Zilka in a recent blog post, Lock profiling across JVMs in a cluster:
I think what makes multi-threading difficult is not the concepts. Everyone I speak with who codes Java knows what a thread is and what synchronized{} is for sure. They do not know, however how to avoid deadlocks in their design nor how to track them down reliably. It is a familiarity thing more than a conceptual challenge.
Zilka writes that because people think in different ways, developers should choose tools that suit their thinking and conceptualization styles. Tools that aid in programming concurrent applications are no exception:
Some of us think in visual terms and others in textual terms. 3-D vs. 2-D. Essentially, people think in different ways yet integrals [in mathematics] are orthogonal to all this. We can remember an integral as the area under a curve as defined by some equation simply because it is a concept. We cannot remember the integral of tan (x) dx because it is a rote-memorized formula. Only those of us who use such equations day to day can remember such things...
This is why we built visualization tools: to take the notion of locking, lock tuning (striping and the like), and deadlocks from an experienced technician's hands to a conceptual thing we can all work with.
In the remainder of his post, Zilka illustrates the process of visualizing the execution of a concurrent application running inside Terracotta, an open-source Java clustering tool.
What are your preferred ways of visualizing the execution of a concurrent application?
> In Java there is a ton of coverage in the press about multi-threaded programming and how multi-core is taking us in new uncharted directions. I think what makes multi-threading difficult is not the concepts. Everyone I speak with who codes Java knows what a thread is and what synchronized{} is for sure. They do not know, however how to avoid deadlocks in their design nor how to track them down reliably. It is a familiarity thing more than a conceptual challenge.
I'm not so sure about this. My rough guess is that maybe 10% of Java developers really understand what synchronized(x){} means.
I have a canned interview question:
If I have a string referenced by variable s and a block synchronized on s, can I call s.toString() from thread 2 while thread 1 is executing code inside the synchronized block?
I have yet to receive the correct answer. And when reviewing code that uses synchronization, I routinely see errors. Often the attempts at synchronization have no effect.
Personally, I think the synchronized paradigm, aside from being too simple, is flawed in that most developers' intuitions suggest it does something other than what it really does. It would have been better to provide a separate primitive solely for synchronization, IMO.
I second the lack of universal understanding of the contracts of the concurrent primitives.
My answer to the question is:
If thread 2 does not contain synchronization on s then the JVM will allow toString (or anything else) to proceed while thread 1 is inside a block synchronized on s.
And this is likely to be safe since strings are immutable.
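To make the interview question above concrete, here is a small sketch of it. The class and method names (SynchronizedStringDemo, finishesWhileLocked) are made up for illustration, and the 500 ms timeout is just an arbitrary value chosen so the demo does not hang. Thread 1 holds the monitor of s; thread 2 calls s.toString() either with or without its own synchronized block.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class SynchronizedStringDemo {

    // Returns true if thread 2 finished its call to s.toString() while
    // thread 1 was still inside synchronized (s).
    static boolean finishesWhileLocked(boolean thread2Synchronizes) {
        final String s = new String("shared"); // fresh instance, not an interned literal
        final CountDownLatch lockHeld = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);
        final CountDownLatch done = new CountDownLatch(1);

        Thread t1 = new Thread(() -> {
            synchronized (s) {
                lockHeld.countDown();
                awaitQuietly(release); // keep holding the monitor of s
            }
        });
        Thread t2 = new Thread(() -> {
            if (thread2Synchronizes) {
                synchronized (s) {     // must wait until thread 1 leaves its block
                    s.toString();
                }
            } else {
                s.toString();          // no monitor requested, so nothing to wait for
            }
            done.countDown();
        });

        t1.start();
        awaitQuietly(lockHeld);        // thread 1 is now inside synchronized (s)
        t2.start();
        boolean finished = false;
        try {
            finished = done.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        release.countDown();           // let thread 1 leave the block
        joinQuietly(t1);
        joinQuietly(t2);
        return finished;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinQuietly(Thread t) {
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("unsynchronized caller finished while s was locked: "
                + finishesWhileLocked(false));
        System.out.println("synchronized caller finished while s was locked: "
                + finishesWhileLocked(true));
    }
}
```

The unsynchronized caller should get through and the synchronized one should not, which is the whole point: synchronized (s) only coordinates threads that also synchronize on the same object.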
However I also find it challenging to visualize some concurrent behaviors, even in the context of a single internal understanding about the meaning of the primitives.
I certainly find many places in many good Java books where there are attempts at synchronization which I believe are unsafe. I don't really think it matters who is wrong, me or the author; the existence of the question is a serious problem.
I am trying to understand the focus on concurrency lately. Developer productivity and software reliability are difficult and critical issues. Software performance is a less critical and much more tractable issue. And we are sacrificing in areas that are very important and challenging, for a tool to attack a simpler and less important problem.
I think the software community is making a big misstep.
> I second the lack of universal understanding of the contracts of the concurrent primitives.
>
> My answer to the question is:
>
> If thread 2 does not contain synchronization on s then the JVM will allow toString (or anything else) to proceed while thread 1 is inside a block synchronized on s.
>
> And this is likely to be safe since strings are immutable.
That's correct but I always get the answer that the call to toString() will wait "because the object is locked." I ask this specific question because this misconception is so common.
But I think the construct synchronized(s) is a little flawed because it suggests that s "is being synchronized" when in fact there is no such feature in Java.
At this point, I just try to avoid designs that will require synchronization and when it can't be avoided, I keep it as simple as possible.
I'm a big believer in using queues. You pull an object off a queue, which by convention means you own it, process it, then put it on a queue for the next guy to process. Things get done in order (sometimes essential) and, if you follow the convention that when you put it on a queue you are done with it (you won't twiddle it anymore), there are few synchronization issues. With the relatively new Executors stuff this becomes an even better approach.
And this, for me at least, is easy to visualize. I too find the low level locks hard to understand and get right. And in many cases, IMO, you don't need them if you can work at a higher level.
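A rough sketch of the queue convention described above, using java.util.concurrent.BlockingQueue. The names here (QueueHandoffDemo, runPipeline, the STOP marker) are invented for the example. Each StringBuilder is touched by only one thread at a time: whoever took it off a queue owns it until it is put on the next queue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class QueueHandoffDemo {

    // Hypothetical end-of-stream marker, compared by identity.
    private static final StringBuilder STOP = new StringBuilder("stop");

    static List<String> runPipeline(List<String> inputs) {
        BlockingQueue<StringBuilder> inbox = new LinkedBlockingQueue<>();
        BlockingQueue<StringBuilder> outbox = new LinkedBlockingQueue<>();

        Thread worker = new Thread(() -> {
            try {
                while (true) {
                    StringBuilder item = inbox.take(); // taking it means we own it
                    if (item == STOP) {
                        outbox.put(STOP);
                        return;
                    }
                    item.append("-processed");         // safe: nobody else touches it now
                    outbox.put(item);                  // hand it on; we are done with it
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        List<String> results = new ArrayList<>();
        try {
            for (String in : inputs) {
                inbox.put(new StringBuilder(in));      // after put, the producer lets go
            }
            inbox.put(STOP);
            for (StringBuilder r = outbox.take(); r != STOP; r = outbox.take()) {
                results.add(r.toString());
            }
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> inputs = new ArrayList<>();
        inputs.add("a");
        inputs.add("b");
        System.out.println(runPipeline(inputs));
    }
}
```

There is no synchronized keyword in this code. The queue does the locking internally, and a single worker keeps the items in order. An ExecutorService could replace the hand-rolled worker thread; that is left out to keep the sketch short.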
I always tended to use more immutables than coworkers, and after reading JCiP I'm using even more. Almost every field starts out as protected final until I know otherwise. Sometimes I even make them all public final, though few seem to approve of this style, so eventually they'll change back to protected or private and I'll add conventional getters.
Setters, always rare for me, are now extremely rare. Allen Holub would be proud. :-)
> I always tended to use more immutables than coworkers, after reading JCiP I'm using even more. Almost every field starts out as protected final until I know otherwise. Sometimes I even make them all public final, though few seem to approve of this style, so eventually they'll change back to protected or private and I'll add conventional getters.
I pretty much always default to private final. The only real argument for always providing a getter instead of using the public final is that exposing the field makes it harder to introduce subclasses later. I still want to do it in some cases but it causes more friction with other developers and style-checkers than it's worth.
I like final and immutable objects too, but I notice that Java Persistence can't persist such objects directly. Apart from a handful of well known immutable classes (e.g. Integer), persistable entities have to have setters, getters and a no argument constructor.
For Hibernate, the field can't be officially final but I'll add a comment that it is "conceptually final" and comment the private setters "for Hibernate only".
I've been using XStream a lot lately for "persistence" (of fairly simple objects), and it works fine with final fields. You don't even need a default constructor. There's some weird mojo going on; it scares me a bit but seems to work.
> For Hibernate, the field can't be officially final but I'll add a comment that it is "conceptually final" and comment the private setters "for Hibernate only".
>
> I've been using XStream a lot lately for "persistence" (of fairly simple objects), and it works fine with final fields. You don't even need a default constructor. There's some weird mojo going on, it scares me a bit but seems to work.
I was going to point this out. If XStream can do it, it should also be possible for other libraries.
> That's correct but I always get the answer that the call to toString() will wait "because the object is locked." I ask this specific question because this misconception is so common.
That would be true if toString() itself was synchronized, no?
Regarding visualizing concurrency, I wanted to say that the java.util.concurrent package has a lot of useful methods to control lock/wait behavior using Locks and Conditions. ReentrantLock, an implementation of the Lock interface, has methods to report whether the lock is held, who owns it, who is waiting, etc.
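As a rough illustration of those ReentrantLock query methods, the sketch below takes a snapshot of a lock while one other thread is queued on it. LockInspectionDemo and its method names are invented for the example; the calls on ReentrantLock itself (isLocked, isHeldByCurrentThread, getHoldCount, getQueueLength, hasQueuedThread) are the standard public ones.

```java
import java.util.concurrent.locks.ReentrantLock;

class LockInspectionDemo {

    // Builds a one-line description of the lock as seen by the calling thread.
    static String snapshot(ReentrantLock lock) {
        return "locked=" + lock.isLocked()
                + " heldByMe=" + lock.isHeldByCurrentThread()
                + " holdCount=" + lock.getHoldCount()
                + " queued=" + lock.getQueueLength(); // documented as an estimate
    }

    // Holds the lock, waits until a second thread is queued on it,
    // then reports what the lock says about itself.
    static String inspectWithOneWaiter() {
        final ReentrantLock lock = new ReentrantLock();
        lock.lock();
        Thread waiter = new Thread(() -> {
            lock.lock();   // blocks until the caller unlocks
            lock.unlock();
        });
        try {
            waiter.start();
            while (!lock.hasQueuedThread(waiter)) {
                Thread.yield(); // simple spin, good enough for a demo
            }
            return snapshot(lock);
        } finally {
            lock.unlock();
            try {
                waiter.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(inspectWithOneWaiter());
    }
}
```

Note that getOwner() and getQueuedThreads(), which return the actual Thread objects, are protected in ReentrantLock, so getting at the identities of the owner and the waiters takes a small subclass that exposes them.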
> > That's correct but I always get the answer that the call to toString() will wait "because the object is locked." I ask this specific question because this misconception is so common.
> That would be true if toString() itself was synchronized, no?
If toString() synchronized on this, yes. However it does not. I think it's reasonable to expect a seasoned Java developer to know that toString() is not synchronized and if there is any doubt, I wouldn't count it against the candidate if they asked.
> Regarding visualizing concurrency, I wanted to say that java.util.concurrent package has a lot of useful methods to control lock/wait behavior using Locks and Conditions. ReentrantLock, an implementation of Lock interface has methods to display who owns the lock and who is waiting etc.
I think the concurrent library is great but if a developer can't get a handle on basic Java concurrency, it will probably just cause more confusion.
I think the estimate of 10% is on target. Java has great tool support, but the tools need to be used. My trio of underused tools are:
1) Repeated thread dumps: the quickest way to answer the question, "what is really happening at run-time?" Yet so often I encounter complete ignorance or optimistic, untested misunderstandings about an application's run-time behavior.
2) Samurai, a GUI tool to visualize thread dumps.
3) Borland's OptimizeIt, which has a thread debugger that will detect locking problems and display them visually. It is a little old but still a valuable tool.
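For the first item in the list above, a thread dump can also be taken from inside the application. This is only a sketch using the standard java.lang.management API; the class name ThreadDumpDemo and the output format are my own choices, and ThreadInfo.toString() shows only the top frames of each stack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpDemo {

    // Returns a textual dump of all live threads, followed by a count of
    // threads the JVM currently considers deadlocked.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();

        // true, true: also report locked monitors and ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(true, true)) {
            sb.append(info.toString());
        }

        long[] deadlocked = mx.findDeadlockedThreads(); // null when there are none
        sb.append("deadlocked threads: ")
          .append(deadlocked == null ? 0 : deadlocked.length);
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dump());
    }
}
```

Calling dump() a few times, a few seconds apart, and comparing the results gives the "repeated" part: threads that sit in the same BLOCKED or WAITING state on the same lock across dumps are the ones to look at.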
> I think the estimate of 10% is on target. Java has great tool support, but the tools need to be used. My trio of underused tools are:
>
> 1) repeated thread dumps- the quickest way to answer the question, "what is really happening at run-time?" yet so often I encounter complete ignorance or optimistic, untested misunderstandings about an application's run-time behavior
> 2.) Samurai is a gui tool to visualize thread dumps
> 3.) Borland's OptimizeIt has a thread debugger that will detect locking problems and display them visually. It is a little old but still a valuable tool.
I've found that using log4j with the thread and timestamp turned on is pretty useful for debugging threading issues without affecting the timing too much, especially when I am getting close to the issue. I'll check out Samurai next time I run into a problem.
I tend to agree with everything you are saying. Interview responses to questions about concurrency can be remarkable in their inaccuracy.
I would say that people tend to know the keyword synchronized{} and they sometimes look at their code and say, "I should protect this." That's the only point I am making. (I am definitely not arguing or disputing that concurrency is usually done incorrectly.)
The point is tools can help:
1. Terracotta will not allow shared objects to be written to without a synchronizer. You have to hold _some_ lock _somewhere_.
2. Visualization helps find deadlocks and hotspots in concurrent code, as the blog demonstrated.
3. Eventually, Terracotta will be able to provide a mechanism for visualizing different ordering on lock acquisition and / or editing the same object while holding different locks (classic race conditions, etc.)
I never heard of Samurai. I have to go check out Samurai ASAP. Always interested in learning.
i think i understand where you-all are coming from. but i'd like to throw in some words of caution. communication between people is hard, so beware assuming that (a) the other person understands what you are asking and that (b) you understand what they are saying. to wit:
1) i hope you are writing your question on a whiteboard as code because i parsed the English a different way from what i now think i see you meant. so it ends up being more about "can we communicate" than "do you know concurrency in java?"
2) just because Strings are immutable doesn't mean s can't be reassigned leading to a race condition. that may or may not be relevant, it depends on context.
3) similarly, just because something is "final" in java doesn't make it truly immutable in the C++ "const" sense. again, it depends on context.
4) finally, my $0.02 is that shared-mutable-state-concurrency is just plain freaking wrong [unless you have a very very very constrained set of locks e.g. if you have 99% purely functional things with a teeny bit of locking on one or two places, rather than if you have some crappy-by-definition imperative code with lots of locks and threads] and we should all be using other things like the Actor model or dataflow etc.
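Point 2 above can be shown in a few lines. This is a sketch with invented names (ReassignedLockDemo and its method), not anything from the original question: the lock belongs to the object, not the variable, so once s is reassigned a second thread synchronizing on s gets a different monitor and walks straight in.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class ReassignedLockDemo {

    private volatile String s = new String("first");

    // Returns true if thread 2 entered synchronized (s) while thread 1
    // was still inside its own synchronized (s) block.
    boolean secondThreadEntersWhileFirstHoldsLock() {
        final CountDownLatch firstInside = new CountDownLatch(1);
        final CountDownLatch secondInside = new CountDownLatch(1);
        final CountDownLatch release = new CountDownLatch(1);

        Thread t1 = new Thread(() -> {
            synchronized (s) {          // locks the object s refers to right now
                firstInside.countDown();
                awaitQuietly(release);
            }
        });
        t1.start();
        awaitQuietly(firstInside);

        s = new String("second");       // same variable, different object, different monitor

        Thread t2 = new Thread(() -> {
            synchronized (s) {          // locks the new object, so thread 1 is no obstacle
                secondInside.countDown();
            }
        });
        t2.start();

        boolean entered = false;
        try {
            entered = secondInside.await(1, TimeUnit.SECONDS);
            release.countDown();
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            release.countDown();
        }
        return entered;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("second thread got in: "
                + new ReassignedLockDemo().secondThreadEntersWhileFirstHoldsLock());
    }
}
```

A common way to avoid this is to synchronize on a dedicated private final lock object rather than on a field that can be reassigned.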