[zeromq-dev] Race conditions again

Michi Henning michi at triodia.com
Tue Jan 7 07:21:32 CET 2014


Hi Pieter,


> I've never used thread sanitizer.

I highly recommend it. If you grab clang 3.3, configure as I showed in my
previous mail, and run the tests, you will see *lots* of reported race conditions.
I believe that these are real. (Clang is pretty good at not producing false positives.)
Some of the races are potentially serious, such as calling free() in one thread
on a pointer returned from malloc() in another thread without an intervening lock.
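
For illustration only, here is a minimal sketch of the synchronized form of that pattern: one thread allocates with malloc(), another frees, and a mutex plus condition variable orders the hand-off. None of this is zmq code; `Handoff` and `run_handoff` are hypothetical names invented for the example.

```cpp
#include <condition_variable>
#include <cstdlib>
#include <cstring>
#include <mutex>
#include <string>
#include <thread>

// Hypothetical hand-off slot shared by the two threads.
// Every access to buf and ready happens with m held.
struct Handoff {
    std::mutex m;
    std::condition_variable cv;
    char* buf = nullptr;
    bool ready = false;
};

// Allocates in one thread, frees in another, and returns what the
// freeing thread read from the buffer before releasing it.
std::string run_handoff() {
    Handoff h;
    std::string seen;

    std::thread producer([&h] {
        char* p = static_cast<char*>(std::malloc(6));
        if (!p) std::abort();
        std::memcpy(p, "hello", 6);  // 5 characters plus the terminating NUL
        {
            std::lock_guard<std::mutex> lock(h.m);
            h.buf = p;               // publish the pointer under the lock
            h.ready = true;
        }
        h.cv.notify_one();
    });

    std::thread consumer([&h, &seen] {
        std::unique_lock<std::mutex> lock(h.m);
        h.cv.wait(lock, [&h] { return h.ready; });
        seen = h.buf;                // read is ordered after the producer's writes
        std::free(h.buf);            // free() in a different thread than malloc()
        h.buf = nullptr;
    });

    producer.join();
    consumer.join();                 // join() makes seen safe to read here
    return seen;
}
```

Dropping the lock and publishing `buf` through a plain shared pointer would give the unsynchronized variant that a build with clang's `-fsanitize=thread` is designed to flag.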

> Two things strike me. First, if
> there are *real* race conditions, you can help by tracking these down
> and working with us to fix them.

I can try to help, but I'm totally unfamiliar with the zmq code base, so
progress will probably be quite slow.

> Secondly, if these are false
> positives, there must be some "ignore" file, as we use for valgrind.

Yes, there is an ignore file. Unfortunately, it doesn't offer the fine-grained
control that valgrind provides. You can only specify the topmost stack frame,
not a whole section of stack (as valgrind permits), so the only way to suppress
a race report for a call to free() is to suppress reports for all threads
calling free(), whether they are my own or zmq's.
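
As a rough sketch of what such an ignore file looks like, assuming the `race:` suppression type and the `suppressions` runtime option described in the ThreadSanitizer documentation (not checked here against clang 3.3 specifically); the file name and test binary name are hypothetical:

```text
# tsan.supp (hypothetical file name)
# Matches on the function name only, so this hides race reports
# involving free() no matter whose code made the call.
race:free

# Hypothetical invocation:
#   TSAN_OPTIONS="suppressions=tsan.supp" ./some_test
```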

> As for race conditions in your own code, if you do not use shared
> mutable state, and only pass data between threads via messages, there
> is no opportunity for race conditions in your own code.

There are parts of the code that do things via messages between threads,
but there are other parts that do things with threads the old-fashioned way.
In particular, my code is part of a library that loads modules at run time from
shared libraries. The code in those shared libraries is provided by third parties
and can create its own threads and do what it likes with them. So, basically,
I'm not in full control of all the threads and what they do. That's why it's so
important to get clean thread sanitizer output.
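
To make the message-only style mentioned above concrete, here is a minimal sketch in which two threads exchange data solely through queues. `Channel` and `sum_via_messages` are hypothetical names made up for this example; the channel is a standard-library stand-in for a socket pair and is not zmq API.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical blocking channel. Its internal lock is the only
// synchronization; the threads using it share nothing else.
template <typename T>
class Channel {
public:
    void send(T value) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(value));
        }
        cv_.notify_one();
    }

    // Blocks until a message is available, then removes and returns it.
    T recv() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        T value = std::move(q_.front());
        q_.pop();
        return value;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> q_;
};

// Sends 1..n to a worker thread and returns the sum the worker replies with.
int sum_via_messages(int n) {
    Channel<int> requests;
    Channel<int> replies;

    std::thread worker([&requests, &replies] {
        int total = 0;               // private to the worker thread
        for (;;) {
            int v = requests.recv();
            if (v < 0) break;        // a negative value is the stop message
            total += v;
        }
        replies.send(total);
    });

    for (int i = 1; i <= n; ++i) requests.send(i);
    requests.send(-1);               // ask the worker to finish
    int result = replies.recv();
    worker.join();
    return result;
}
```

This only helps for threads that follow the convention. Threads created by third-party modules are under no such constraint, which is the situation described above.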

But aside from what I would like, would you mind at least building with clang
and running the tests? It might turn out to be something of an eye-opener.
I'm pretty sure that most of the race conditions reported by clang in zmq
are real.

Cheers,

Michi.

