Uninformed: Informative Information for the Uninformed

Vol 7» 2007.May


Thread Data Consistency

Programmers familiar with the pains of thread deadlocks and thread-related memory corruption should be well aware of how tedious these problems can be to debug. By analyzing memory access behavior in conjunction with some additional variables, it may be possible to determine whether or not a memory operation is being made in a thread-safe manner. At this point, the author has not defined a formal approach for achieving this, but a few rough ideas have been identified.

The basic idea behind this approach would be to combine memory access behavior with information about the thread in which the access occurred and the set of locks that were held when the access occurred. Determining which locks are held can be as simple as inserting instrumentation code into the routines that are used to acquire and release locks at runtime. When a lock is acquired, it can be pushed onto a thread-specific stack. When the lock is released, it can be removed. The nice thing about representing locks as a stack is that in almost every situation, locks should be released in the reverse of the order in which they were acquired. Acquiring and releasing locks asymmetrically can quickly lead to deadlocks and can therefore be flagged as problematic.
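As a rough illustration of the lock-tracking idea, the following Python sketch wraps a lock so that each acquire pushes onto a per-thread stack and each release checks whether the lock was on top. The names TrackedLock, held_locks, and asymmetric_releases are hypothetical and chosen only for this example; a real tool would more likely hook the platform's native locking routines rather than wrap them.

```python
import threading

_state = threading.local()   # each thread sees its own attributes
asymmetric_releases = []     # (thread name, lock name) pairs flagged for review


def held_locks():
    """Return the calling thread's stack of currently held locks."""
    if not hasattr(_state, "stack"):
        _state.stack = []
    return _state.stack


class TrackedLock:
    """Hypothetical instrumented wrapper around threading.Lock."""

    def __init__(self, name):
        self.name = name
        self._lock = threading.Lock()

    def acquire(self):
        self._lock.acquire()
        held_locks().append(self)        # push once the lock is actually held

    def release(self):
        """Release the lock; return True if it was on top of the stack."""
        stack = held_locks()
        symmetric = bool(stack) and stack[-1] is self
        if not symmetric:
            # Released out of order: record it rather than fail outright.
            asymmetric_releases.append(
                (threading.current_thread().name, self.name))
        if self in stack:
            stack.remove(self)
        self._lock.release()
        return symmetric
```

Acquiring a and then b, and releasing a first, would record one asymmetric release for a while still leaving the stack empty once b is released. Recording rather than raising is a design choice for this sketch, since an out-of-order release is only a hint of a problem.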

Determining data consistency is quite a bit trickier, however. An analysis library would need some means of historically tracking read and write access to different locations in memory. Even then, determining what might be a data consistency issue from this historical data is challenging. One example of a potential data consistency issue might be two writes to the same location in memory from separate threads that do not hold a common lock at the time of the writes. This isn't guaranteed to be problematic, but it is at the very least indicative of a potential problem. It's likely that many other kinds of data consistency issues exist that could be captured in terms of memory access, thread context, and lock ownership.
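The two-writes example could be checked along the following lines. This is only a minimal sketch under stated assumptions: the WriteHistory class and its record_write method are hypothetical, addresses and lock identifiers are assumed to be plain hashable values supplied by some instrumentation layer, and the history is kept forever, which a practical tool could not afford.

```python
from collections import defaultdict


class WriteHistory:
    """Hypothetical per-address log of writes and the locks held for each."""

    def __init__(self):
        # address -> list of (thread_id, frozenset of lock identifiers)
        self._writes = defaultdict(list)

    def record_write(self, address, thread_id, locks_held):
        """Log a write and return earlier writes that may conflict with it.

        An earlier write is reported when it came from a different thread
        and shares no lock with the current write.
        """
        current = frozenset(locks_held)
        conflicts = [
            (tid, locks)
            for tid, locks in self._writes[address]
            if tid != thread_id and not (locks & current)
        ]
        self._writes[address].append((thread_id, current))
        return conflicts
```

For instance, if thread 1 writes to an address while holding lock A and thread 3 later writes to the same address while holding only lock B, the second call returns the first write as a potential conflict. A write from thread 2 holding both A and B would not be reported against either.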

Even if this concept can be made to work, the fact that it would be a runtime solution is a limitation, since only the code paths that actually execute can be analyzed. It may be the case that code paths that lead to thread deadlocks or thread-related corruption are executed only rarely and are hard to coax out. Regardless, the author feels that this represents an interesting area of future research.