[zeromq-dev] are zmq::atomic_ptr_t<> Helgrind warnings known?

Francesco francesco.montorsi at gmail.com
Sun Feb 25 10:54:01 CET 2018


Hi Bill,
thanks for your answer.



2018-02-24 21:49 GMT+01:00 Bill Torpey <wallstprog at gmail.com>:
>
> ...
>
If T2 and T4 were application code, this would be a clear violation of
> ZMQ’s threading rules (assuming “legacy”, non-thread-safe socket types).
>
Right.



> For instance, one technique would be to perform initialization (e.g.,
> bind’s) in the main thread, and only after that is done spin up other
> threads to do processing.  In this case, TSAN wouldn't have any way to know
> that the application guarantees that accesses cannot result in a race, so
> TSAN would flag it.  I’ve gotten in the habit of using mutexes to protect
> code like this even though it should not strictly be needed, just to keep
> tools like TSAN happy, and also because I don’t know the ZMQ code well
> enough to be 100% certain that the mutexes are not necessary — better safe
> than sorry!
>
Yeah, that's a possibility but it results in a lot of "clutter" that
decrease code readability and makes it harder to maintain in the long run...

This situation is different, though, since T4 is not an application thread
> — it’s an internal ZMQ worker thread.   So, I think in this case we kind of
> have to accept that ZMQ is doing the right thing here.
> At least, that’s the approach I’ve been taking.  When I instrument my apps
> and libraries with TSAN I specifically do NOT instrument ZMQ, and I also
> use the “called_from_lib:libzmq.so” suppression (which is listed as an
> example for TSAN: https://github.com/google/sanitizers/wiki/
> ThreadSanitizerSuppressions).
>
Understood. I have a question though: if you use
“called_from_lib:libzmq.so”  suppression, are you able to spot the race
condition due to T2 and T4 being application threads (instead of being 1
application and 1 zmq internal)?
I wonder if TSAN, detecting that one of the 2 threads generating the data
race is inside ZMQ, entirely suppress the race warning or instead will
suppress only race conditions involving 2 internal zmq threads..


Instrumenting libzmq and/or omitting the suppression causes a LOT of
> warnings, esp. in the ZMQ worker threads.  So, unless I'm willing to commit
> the time and effort to go through and investigate each of these warnings, I
> feel I have little choice but to accept that at this point in its lifetime
> ZMQ should be race-free for all practical purposes.
>
> FWIW, I’ve done fairly extensive testing, and specifically stress testing,
> and have yet to find anything that looks like an honest-to-goodness bug in
> ZMQ.  (Which is not to say that the docs are always clear about what to
> expect in certain situations ;-)  I did have one problem which appears to
> have been a bug in epoll, and which was resolved by upgrading Linux, but
> that’s it.
>

I agree, I was just surprised to see so many warnings...



> BTW, there’s an excellent overview of how this all works at
> http://zeromq.org/whitepapers:architecture — although it’s somewhat old,
> it appears to still be relatively accurate.
>

thanks for the link, I read that and it's quite interesting. However it
does not mention how multi-thread safety is achieved.
Just out of curiosity I will take a look at ypipe implementation.



> I tried forcing use of C++11 internal atomics and all the warnings
> about  zmq::atomic_ptr_t<> disappear (I still get tons about ypipe and
> "pipe")..
> maybe it's best to prefer C++11 atomics when available?
>
>
> I would actually suggest using plain old pthread for synchronization
> primitives — TSAN’s support for pthread is pretty mature. Support for C++11
> is a bit less mature, and has certain constraints (
> https://clang.llvm.org/docs/ThreadSanitizer.html):
>

However using old pthread for synchronization you probably loose a lot in
performances... if C++11 atomics are supported by TSAN just fine, why not
preferring them over the GCC intrinsics?



> *ThreadSanitizer is in beta stage. It is known to work on large C++
> programs using pthreads, but we do not promise anything (yet). C++11
> threading is supported with llvm libc++. The test suite is integrated into
> CMake build and can be run with make check-tsan command.*
>
>
> In particular, to use C++11 you also need to use llvm’s libcxx, rather
> than gnu’s libstdc++.
>

I think that the quoted text is referring to "C++11 threading" that is to
http://en.cppreference.com/w/cpp/thread
As I mentioned from a simple test I do see that at least gcc 5.3.0 TSAN is
handling C++11 atomics fine (without forcing you to link LLVM libc++)




> Btw I will try to write a suppression file for ThreadSanitizer and ZMQ...
> I just start to doubt: is all ZMQ code really race-free? :)
>
>
> As mentioned above, that is likely to be a huge undertaking.  At this
> point, I think the only confirmation we have that ZMQ is race-free is based
> on “black-box” testing.  The good news is that ZMQ appears to be used
> widely enough that the code is well-tested.  (Of course, failure to find a
> bug doesn’t prove that there isn’t one, so having a rigorous analysis of
> ZMQ’s threading behavior would be A Good Thing — but a big job).
>

Yeah, I agree!


Thanks,
Francesco
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20180225/57342a88/attachment.htm>


More information about the zeromq-dev mailing list