This post originated from an RSS feed registered with .NET Buzz
by Marcus Mac Innes.
Original Post: Synchronized .NET Collection Classes are NOT Thread Safe.
Feed Title: Marcus Mac Innes' Blog
Feed URL: http://www.styledesign.biz/weblogs/macinnesm/Rss.aspx
Feed Description: Issues relating to .NET, Service Oriented Architecture, SQL Server and other technologies.
Call me a purist, but when someone asks me whether something is thread safe, I like to rephrase the question in my head to something like "Is this air traffic control system thread safe?" before answering... Yes or No?
So you can imagine my surprise, after finally tracking down a threading issue, to discover that the bug was caused by the .NET synchronized Hashtable. It turns out the synchronized Hashtable is not thread safe, but thread safe "ish", and to me that's NOT thread safe at all.
The documentation for Hashtable and other collection classes that implement System.Collections.ICollection indicates that the static Synchronized method "Returns a synchronized (thread-safe) wrapper for the Hashtable." Creating one couldn't be simpler; a small change to the Hashtable's initialisation provides you with thread safety...
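As a minimal sketch of that initialisation change (the variable names here are just illustrative):

```csharp
using System;
using System.Collections;

class SynchronizedExample
{
    static void Main()
    {
        // Instead of:  Hashtable table = new Hashtable();
        // ...wrap the table with the static Synchronized method to get
        // the (supposedly) thread-safe wrapper.
        Hashtable table = Hashtable.Synchronized(new Hashtable());

        table.Add("key", "value");
        Console.WriteLine(table.IsSynchronized); // prints True
    }
}
```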
The Hashtable class contains a private class called SyncHashtable, which simply subclasses Hashtable and overrides its properties and methods. In reality, these overrides synchronize data access by simply wrapping each call in a lock statement. Here is the Hashtable's synchronized Add method:
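(The original code listing was lost in syndication; as seen through a disassembler, the wrapper's Add looks roughly like this, using the same private _table field as the indexer shown further down.)

```csharp
// Inside the private SyncHashtable wrapper class (approximate decompilation):
public override void Add(object key, object value)
{
    // Every writer takes the same lock, so only one thread
    // can add to the underlying table at a time.
    lock (this._table.SyncRoot)
    {
        this._table.Add(key, value);
    }
}
```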
Well, not exactly... You see, the Add method above is thread safe: only one thread can add to the Hashtable at a time. But if we look a little closer, we find that not all properties and methods provide this lock safety. Here is the implementation of the indexer, which you will notice is missing a lock on the "get":
public override object this[object key]
{
    get
    {
        return this._table[key];
    }
    set
    {
        lock (this._table.SyncRoot)
        {
            this._table[key] = value;
        }
    }
}
This is a typical example of naive thread safety. The assumption goes that multiple threads can safely read from an object simultaneously, but only one thread at a time can safely write to it. And this is the implementation we see in System.Collections.Hashtable (above).
This design works perfectly well and provides optimal performance, since we are not holding up threads that only wish to read an object's data; they simply work away oblivious to each other, and only when a thread wishes to write does it need to synchronize its access to the data. There is, however, one HUGE assumption in this design: the writes must be "atomic".
In some cases writes to memory locations are actually atomic. For instance, setting the value of a bool or an int on a 32-bit single-processor machine in .NET 1.1 just happens to be atomic. We run into problems on multi-processor machines, however, due to processor-level memory caching, and often need to qualify writes with System.Threading.Thread.MemoryBarrier() (discussed here), which flushes the processor's memory cache. But now we have two separate statements which together are definitely not atomic.
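To make that two-statement pattern concrete, here is a hedged sketch (the fields and method names are illustrative, not from any real API):

```csharp
using System;
using System.Threading;

class BarrierExample
{
    static int _value;
    static bool _ready;

    static void Publish()
    {
        _value = 42;             // statement 1: the write itself
        Thread.MemoryBarrier();  // statement 2: full fence, so other
                                 // processors observe the write
        _ready = true;
    }

    static void Main()
    {
        Publish();
        Console.WriteLine(_ready && _value == 42); // prints True
    }
}
```

Another thread could be scheduled between the two statements, which is exactly why the pair is not atomic.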
So why is this important when looking at the synchronization design of the System.Collections.Hashtable.SyncHashtable class?
Well, if a thread always has the green light to perform a read, then it makes the assumption that the data to be read exists in a fully written state. If it were allowed to read while another thread was in the middle of a write (albeit with only one thread writing at a time), then the data could be half written when the read is performed, and hence we have a subtle and often difficult-to-locate threading bug.
Other methods on SyncHashtable that fall foul of the design flaw include: Contains, ContainsKey, ContainsValue, CopyTo, GetEnumerator, GetHash and KeyEquals.
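Until those methods are fixed, one defensive option (a sketch, not the only approach) is to take the wrapper's SyncRoot lock yourself around reads, so that a read can never overlap a write:

```csharp
using System;
using System.Collections;

class SafeReadExample
{
    static object Lookup(Hashtable syncTable, object key)
    {
        // The wrapper's writes already lock SyncRoot, so taking the
        // same lock around the read excludes half-written state.
        lock (syncTable.SyncRoot)
        {
            return syncTable[key];
        }
    }

    static void Main()
    {
        Hashtable table = Hashtable.Synchronized(new Hashtable());
        table.Add("answer", 42);
        Console.WriteLine(Lookup(table, "answer")); // prints 42
    }
}
```

The same pattern covers enumeration: hold SyncRoot for the whole foreach, since an enumerator is never safe against concurrent modification.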
Don't get caught out!
[Update:] As of .NET v2.0.40607 SyncHashtable.CopyTo has been upgraded to include a lock.