Software Testing Blog

Reordering optimizations

In my previous article on ATBG I said that without a lock, the read of a field can be moved arbitrarily far backwards in time on a thread, and that this can cause unexpected behaviour in a program. It is perhaps not clear why this can be a problem; it seems like a read shouldn’t have any side effects that can be noticed if it happens earlier. A number of readers asked whether making the read into a volatile read prevented the reordering problem.

The answer to the general question of whether a volatile read can be re-ordered is yes, yes it can. Though a volatile read in a loop cannot be cached once and elided, a volatile read can be moved backwards in time with respect to a volatile write.
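To see what I mean by "cached once and elided", here is a sketch of the one job that volatile reliably does; the field and method names here are mine, invented for illustration:

static bool pleaseStop = false; // deliberately NOT volatile

static void Worker()
{
  // Without volatile -- or a lock, or Volatile.Read -- the jitter is allowed to
  // read pleaseStop once, keep the value in a register, and spin here forever,
  // even after another thread has set the field to true.
  while (!pleaseStop)
  {
    // do a unit of work
  }
}

static void StopWorker() { pleaseStop = true; }

Marking pleaseStop as volatile forbids that particular optimization. What it does not forbid, as we are about to see, is a read of one variable moving backwards in time past a write of another.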

The answer to the more specific question of whether making the field volatile makes the program I was given correct cannot really be answered because that was an oversimplified toy program. Deducing the correctness of a real program by analyzing a toy program is perhaps a bad idea.

Instead, I’ll give you an example that (1) can actually happen even on the more restrictive memory model imposed by x86 hardware, and (2) heck let’s make everything volatile while we’re at it; it won’t help! We’ll elide a bunch of locks and see what goes wrong. Check out this program fragment:

static volatile bool q = false;
static volatile bool r = false;
static volatile bool s = false;
static volatile bool t = false;
static object locker = new object();

static bool GetR() { return r; }  // No lock!
static void SetR() { lock(locker) { r = true; } }

static void MethodOne()
{
  q = true;
  if (!GetR())
    s = true;
}

static void MethodTwo()
{
  SetR();
  if (!q)
    t = true;
}

The rest of the program, which I have not shown, behaves as follows. First, the static initializers run normally, so the four Booleans and the monitor are all created and assigned to their given values. Then the program creates two threads. Thread one runs MethodOne and thread two runs MethodTwo. Both threads run to completion and these are the only invocations of the two methods. The question is: can the original thread now observe that s and t are both true?
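If you want something concrete to picture, the rest of the program might look more or less like the following sketch. The details here are mine, not the original program, and they assume using directives for System and System.Threading; none of that matters to the puzzle.

static void Main()
{
  var one = new Thread(MethodOne);
  var two = new Thread(MethodTwo);
  one.Start();
  two.Start();
  one.Join();
  two.Join();
  // The puzzle: can this line ever print "True True"?
  Console.WriteLine(s + " " + t);
}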

Give that some thought before you read on.

.
.
.
.
.
.

It would appear that the answer is no, by the following logic.

  • If one thread runs q=true; before the other runs if(!q) then t remains false.
  • Otherwise, one thread runs q=true; after the other runs if(!q). Clearly that implies that if(!GetR()) must have run after SetR();. That implies that !GetR() is false. Therefore s remains false.
  • There are only these two possible cases and in each at least one of s and t remains false, therefore it is impossible that both become true.

This logic is wrong; specifically the second statement is completely bogus and therefore the conclusion does not follow. As is often the case, the bit marked “clearly” is the flag that indicates the error. The lack of a lock on the read of r means that the CPU may re-order the read by moving it arbitrarily far backwards in time even with respect to volatile writes, and an x86 will occasionally do so if the stars align correctly! An x86 will not reorder two reads with respect to each other and will not reorder two writes with respect to each other, but it has no problem reordering a read of one variable with respect to a write of another. The CPU is permitted to move the read backwards in time by pretending that you actually wrote:

static void MethodOne()
{
  bool temp = r;
  q = true;
  if (!temp)
    s = true;
}

Since this optimization is invisible to the code on the current thread, it is legal. But now there is an obvious interleaving in which both s and t become true. First we assign false to temp on one thread, then all of MethodTwo runs on the other thread, so t is true, and then the remainder of MethodOne sets s to true.

Now imagine that your program depends on s and t never both being true, and this situation happens, let's say, one time out of every billion executions of this code. If it runs a thousand times a day on a thousand machines, that's a million executions a day, so one failure roughly every thousand days: about once every three years. How on earth would you debug it? You'd be likely to suspect hardware failure rather than a bug in the program, but there assuredly is a bug in the program. These are the nightmare scenarios you run into when you think you're clever enough to elide locks. The moment you depart even slightly from one of the "blessed" low-lock patterns you are off in the weeds.

The right thing to do here is: first, if you can avoid the shared-memory situation entirely, do so. If you cannot, lock everything all the time. In this particular case, locking the read of r or the write of q would likely be sufficient to ensure that s and t are never both true, but I discourage the attitude that walking as close as possible to the edge of a cliff is a great way to be safe. Walk as far away from that cliff as you possibly can! If you've got shared memory, put a lock around every usage of it. Simply making a shared variable volatile is not sufficient to ensure thread safety in all cases.
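For what it is worth, here is a sketch of what "lock everything all the time" looks like for this fragment. Treat it as an illustration of the principle rather than the one true fix:

static bool q = false;  // no longer volatile; every access goes through the lock
static bool r = false;
static bool s = false;
static bool t = false;
static object locker = new object();

static void MethodOne()
{
  lock (locker) { q = true; }
  bool localR;
  lock (locker) { localR = r; }
  if (!localR)
    lock (locker) { s = true; }
}

static void MethodTwo()
{
  lock (locker) { r = true; }
  bool localQ;
  lock (locker) { localQ = q; }
  if (!localQ)
    lock (locker) { t = true; }
}

Because every read and write of the shared flags happens inside a critical section on the same monitor, the critical sections are totally ordered, and the simple interleaving argument given earlier actually holds: at most one of s and t can become true.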


Thanks to threading expert Joe Duffy for bringing this scenario to my attention many years ago.


As always, if you have questions about a bug you’ve found in a C, C++, C# or Java program that you think would make a good episode of ATBG, please send your question along with a small reproducer of the problem to TheBugGuys@Coverity.com. We cannot promise to answer every question or solve every problem, but we’ll take a selection of the best questions that we can answer and address them on the dev testing blog every couple of weeks.

  1. I’ve yet to see anything resembling a real-world scenario where use of “volatile” is appropriate. Might be a good topic for another article…

    1. If you happen to be writing a bootloader (or some such) and need to read from or write to a memory-mapped register.

  2. MSDN says: “The volatile modifier is usually used for a field that is accessed by multiple threads without using the lock statement to serialize access.”

    Given that you have (a) cautioned against doing multithreaded access without locks, and (b) shown a multithreaded access problem on volatile fields, it sounds like your article could be summarized thus: “volatile is a pretty useless keyword.”

  3. As far as I can see, all the nastiness of volatile only comes from having *multiple* volatile fields and depending on the order in which reads and writes are done to these fields for the program to be correct.

    However, for the purpose of having, say, a volatile bool that is only ever written from one thread and only ever read in a loop on another thread, where all you want is to stop the read from being cached in a register, it seems safe enough to me.

    What do you think?

    1. “Safe enough for you” and “safe enough for me” might be two different things. I would want a *proof* that the read of the volatile bool moving backwards in time with respect to any write on the thread could *never* cause the program to be in error. Maybe your standards are different and you like living on the edge more than I do.

      Note also that of course this proof has to be performed *on every change to the program for the rest of time*. Who knows what change could cause the analysis of the correctness of the program to change. This seems expensive.

      But the question is of course then: why are you checking a bool field in a multithreaded program in the first place? If the reason is “to see if anyone has cancelled the work associated with this thread” then use a cancellation token, not a volatile bool. Use the highest-level tool at your disposal, not the lowest-level.
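      To make that concrete, here is a minimal sketch of the cancellation-token approach; the worker and all of the names are hypothetical:

      using System;
      using System.Threading;
      using System.Threading.Tasks;

      static class CancellationSketch
      {
        static void Main()
        {
          var cts = new CancellationTokenSource();
          var worker = Task.Run(() => DoWork(cts.Token));

          // ... later, when the work is no longer wanted:
          cts.Cancel();
          worker.Wait();
        }

        static void DoWork(CancellationToken token)
        {
          while (!token.IsCancellationRequested)
          {
            // Do a unit of work, checking the token between units.
          }
        }
      }

      The token gives you cooperative cancellation with well-defined memory semantics, and it composes with tasks, parallel loops and async methods in a way that a hand-rolled volatile bool never will.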
