Sunday, June 27, 2010

Resolving Hard-to-Catch Multi-threading and Concurrency Issues

Locking, Memory Barriers and Atomicity
When talking about multithreading, developers usually focus on one part of the equation: locking, which guards against pre-emption between threads. What they overlook most of the time is memory fencing and atomicity.

When should you think about and apply thread safety?
Basically, any time the same variable or resource can be read or written from different threads. In that case, review the code for three areas: pre-emption, memory fencing and atomicity. Then apply the correct mechanism in a minimalistic way instead of wrapping a lock around everything and increasing the chances of deadlocks.

For example, if a 32-bit int is read in one place and written in another place by different threads, then atomicity is not an issue on today's processors, because a properly aligned 32-bit int is read or written by the processor in one pass. No lock statement is necessary; the access is already atomic.

However, the issue that is most often overlooked is that the other thread might not see the latest value of that variable. This is because the compiler and the processor may optimize by keeping the value in a register or a per-processor cache, or by reordering reads and writes. The new value of the int must be "flushed" before the other threads can see it.

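To fix this, declare the variable as volatile (in C#). Strictly speaking this is not a full memory fence: it gives reads of the variable acquire semantics and writes release semantics, which is enough here to stop the compiler and the processor from keeping the value in a register or reordering other accesses around it in the unsafe direction.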
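As a minimal sketch of the case described above, assume one thread writes a 32-bit int and another thread only reads it. The class and field names (VolatileFlagDemo, _latest) are made up for illustration. No lock is used; volatile is the only mechanism applied.

```csharp
using System.Threading;

public static class VolatileFlagDemo
{
    // Hypothetical shared field: written by one thread, read by another.
    private static volatile int _latest;

    public static int Run()
    {
        _latest = 0;
        int seen = 0;

        var reader = new Thread(() =>
        {
            // Spin until the writer's value becomes visible.
            // Without volatile, the JIT would be allowed to read _latest
            // once and keep spinning on a stale copy.
            while (_latest == 0) { }
            seen = _latest;
        });
        reader.Start();

        _latest = 5;     // volatile write, performed by the main thread
        reader.Join();   // Join also makes 'seen' safe to read here
        return seen;
    }
}
```

Calling VolatileFlagDemo.Run() returns the value the reader thread observed once the write became visible to it.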

As far as pre-emption is concerned, don't worry about it here because it doesn't apply: no thread performs a multi-step operation on the variable (such as read, modify, then write back) that another thread could interrupt halfway.

In summary, whenever you have a variable that can be read and written by multiple threads, analyze it for all three things: pre-emption, memory fencing and atomicity.

Memory Fence
For multithreading to be robust, a developer must handle pre-emption and memory fencing. Pre-emption deals with race conditions and deadlocks. Memory fencing deals with making writes visible by flushing them from processor registers and caches to main memory. This is necessary because many processors optimize by caching variables in CPU registers, and they guarantee that this optimization is transparent to the current thread, but not necessarily to other threads.

For example, if you store int x = 5, the processor might not write it to main memory right away, making it invisible to other threads. What is needed is flushing, that is, forcing the write of x to main memory. On the .NET platform, you can accomplish this using Thread.MemoryBarrier() (in System.Threading) or the volatile keyword.

The volatile keyword basically tells the compiler and the processor not to cache the variable in a register, and it restricts how accesses to it can be reordered. As a result, OTHER threads see its current value. Remember that for the current thread this doesn't matter, because the processor guarantees that no matter what caching or optimization happens, the current thread is not affected.

Some Types Are Treated Atomically
It is also important to know that some variables are read and written atomically. That depends on the CPU word size, or how many bytes the CPU can read or write in one pass. For example, on 32-bit machines, any properly aligned variable of 32 bits or less is read in one pass. No need to lock it. But a 64-bit variable is read in two passes, and a lock is needed to avoid the possibility that another thread writes the other 32-bit segment of that 64-bit variable between the two passes, which produces weird values. That is a very hard bug to catch. On the other hand, on a 64-bit machine running a 64-bit process, an aligned 64-bit variable is atomic because the processor reads and writes it in one pass.
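Here is a small sketch of protecting a 64-bit value so that it cannot be read half-updated in a 32-bit process. Note that it uses the Interlocked class rather than the lock statement mentioned above; both prevent the torn read. The names TornReadGuard and _ticks are assumptions made up for the example.

```csharp
using System.Threading;

public static class TornReadGuard
{
    // Hypothetical 64-bit shared value.
    private static long _ticks;

    public static void Publish(long value)
    {
        // Atomic 64-bit write, even when the process is 32-bit.
        Interlocked.Exchange(ref _ticks, value);
    }

    public static long Snapshot()
    {
        // Atomic 64-bit read, even when the process is 32-bit.
        return Interlocked.Read(ref _ticks);
    }
}
```

A plain read of _ticks would be fine in a 64-bit process, but going through Interlocked keeps the code correct no matter how it is compiled or hosted.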

Locking Ways
On the .NET platform, different locking mechanisms exist for every level. The highest-level ones cover all three areas I mentioned above (pre-emption, memory fencing and atomicity). Others handle only one or two of them.

1. The lock statement, Monitor, AutoResetEvent, ManualResetEvent, Mutex and Semaphore cover all three areas of multithreading.

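2. The Interlocked class handles atomicity and memory fencing for quick operations like increment, add and exchange (assignment). Each operation completes as one uninterruptible step, so pre-emption cannot split it.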

3. The volatile keyword and Thread.MemoryBarrier() handle fencing only, making sure that memory is flushed from the processor to main memory so the other threads see the updates. No locking or waiting.
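To make the first two levels in the list above concrete, here is a rough sketch that increments two counters from several threads, one guarded by lock and one by Interlocked.Increment. The class, field and parameter names are hypothetical. A counter needs more than volatile, because an increment is a read, an add and a write, and a thread can be pre-empted between those steps.

```csharp
using System.Threading;

public static class CounterLevels
{
    private static readonly object _gate = new object();
    private static int _lockedCount;
    private static int _interlockedCount;

    // Returns { lockedCount, interlockedCount } after all threads finish.
    public static int[] Run(int threads, int perThread)
    {
        _lockedCount = 0;
        _interlockedCount = 0;

        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            workers[t] = new Thread(() =>
            {
                for (int i = 0; i < perThread; i++)
                {
                    // Level 1: lock covers pre-emption, fencing and atomicity.
                    lock (_gate) { _lockedCount++; }

                    // Level 2: one atomic, fenced read-modify-write, no lock.
                    Interlocked.Increment(ref _interlockedCount);
                }
            });
            workers[t].Start();
        }

        foreach (var w in workers) w.Join();
        return new[] { _lockedCount, _interlockedCount };
    }
}
```

Both counters end up at threads * perThread. Replacing either guarded increment with a plain ++ on a shared field would allow lost updates.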


How to use MemoryBarrier() in Microsoft .NET?
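Basically, Thread.MemoryBarrier() is more granular than volatile: volatile applies to every access of the variable, while a barrier takes effect only where you call it. To create a read memory fence, call it BEFORE the read. To flush memory to main memory when writing to a variable, call MemoryBarrier() AFTER the assignment.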
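The placement rule above can be sketched as follows, assuming a hypothetical pair of fields where _ready signals that _answer has been written. The extra barrier between the two fields is my addition; it keeps their order from being swapped, which matters whenever one variable guards another.

```csharp
using System.Threading;

public static class BarrierDemo
{
    // Hypothetical shared state: _ready says that _answer is valid.
    private static int _answer;
    private static bool _ready;

    public static void Write(int value)
    {
        _answer = value;
        Thread.MemoryBarrier();   // keep the _answer write ahead of _ready
        _ready = true;
        Thread.MemoryBarrier();   // AFTER the assignment: publish _ready
    }

    public static bool TryRead(out int value)
    {
        Thread.MemoryBarrier();   // BEFORE the read: fetch a fresh _ready
        if (_ready)
        {
            Thread.MemoryBarrier();   // read _answer only after _ready
            value = _answer;
            return true;
        }
        value = 0;
        return false;
    }
}
```

A writer thread calls BarrierDemo.Write(42), and any reader thread that later gets true from TryRead also gets 42 as the value.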