[zeromq-dev] are zmq::atomic_ptr_t<> Helgrind warnings known?

Francesco francesco.montorsi at gmail.com
Sat Feb 24 17:34:21 CET 2018


Actually, even building ZeroMQ itself with -fsanitize=thread, I still get a
lot of data-race warnings about the ZMQ atomics:

WARNING: ThreadSanitizer: data race (pid=25770)
  Atomic write of size 4 at 0x7d4000017e48 by thread T4:
    #0 __tsan_atomic32_fetch_add ../../../../gcc-5.3.0/libsanitizer/tsan/tsan_interface_atomic.cc:611 (libtsan.so.0+0x00000005860f)
    #1 zmq::atomic_counter_t::add(unsigned int) src/atomic_counter.hpp:116 (libzmq.so.5+0x00000001a087)
    #2 zmq::poller_base_t::adjust_load(int) src/poller_base.cpp:53 (libzmq.so.5+0x00000006d20d)
    #3 zmq::epoll_t::add_fd(int, zmq::i_poll_events*) src/epoll.cpp:91 (libzmq.so.5+0x00000003c229)
    #4 zmq::io_object_t::add_fd(int) src/io_object.cpp:66 (libzmq.so.5+0x0000000400ef)
    #5 zmq::tcp_connecter_t::start_connecting() src/tcp_connecter.cpp:204 (libzmq.so.5+0x0000000abf30)
    #6 zmq::tcp_connecter_t::process_plug() src/tcp_connecter.cpp:94 (libzmq.so.5+0x0000000ab6cb)
    #7 zmq::object_t::process_command(zmq::command_t&) src/object.cpp:90 (libzmq.so.5+0x000000059e37)
    #8 zmq::io_thread_t::in_event() src/io_thread.cpp:85 (libzmq.so.5+0x000000040cb3)
    #9 zmq::epoll_t::loop() src/epoll.cpp:188 (libzmq.so.5+0x00000003ce6d)
    #10 zmq::epoll_t::worker_routine(void*) src/epoll.cpp:203 (libzmq.so.5+0x00000003cfb6)
    #11 thread_routine src/thread.cpp:109 (libzmq.so.5+0x0000000af00b)

  Previous read of size 4 at 0x7d4000017e48 by thread T2:
    #0 zmq::atomic_counter_t::get() const src/atomic_counter.hpp:210 (libzmq.so.5+0x000000061d20)
    #1 zmq::poller_base_t::get_load() src/poller_base.cpp:47 (libzmq.so.5+0x00000006d1c6)
    #2 zmq::io_thread_t::get_load() src/io_thread.cpp:72 (libzmq.so.5+0x000000040bef)
    #3 zmq::ctx_t::choose_io_thread(unsigned long) src/ctx.cpp:451 (libzmq.so.5+0x0000000184ff)
    #4 zmq::object_t::choose_io_thread(unsigned long) src/object.cpp:199 (libzmq.so.5+0x00000005a67f)
    #5 zmq::socket_base_t::bind(char const*) src/socket_base.cpp:610 (libzmq.so.5+0x00000008bca6)
    #6 zmq_bind src/zmq.cpp:326 (libzmq.so.5+0x0000000c8735)
   ....

or about ypipe:

WARNING: ThreadSanitizer: data race (pid=25770)
  Read of size 8 at 0x7d640001e0c0 by thread T4:
    #0 zmq::ypipe_t<zmq::command_t, 16>::read(zmq::command_t*) src/ypipe.hpp:170 (libzmq.so.5+0x000000047e49)
    #1 zmq::mailbox_t::recv(zmq::command_t*, int) src/mailbox.cpp:73 (libzmq.so.5+0x000000047793)
    #2 zmq::io_thread_t::in_event() src/io_thread.cpp:86 (libzmq.so.5+0x000000040cd5)
    #3 zmq::epoll_t::loop() src/epoll.cpp:188 (libzmq.so.5+0x00000003ce6d)
    #4 zmq::epoll_t::worker_routine(void*) src/epoll.cpp:203 (libzmq.so.5+0x00000003cfb6)
    #5 thread_routine src/thread.cpp:109 (libzmq.so.5+0x0000000af00b)

  Previous write of size 8 at 0x7d640001e0c0 by thread T2 (mutexes: write M500):
    #0 zmq::ypipe_t<zmq::command_t, 16>::write(zmq::command_t const&, bool) src/ypipe.hpp:85 (libzmq.so.5+0x000000047f05)
    #1 zmq::mailbox_t::send(zmq::command_t const&) src/mailbox.cpp:62 (libzmq.so.5+0x0000000476dc)
    #2 zmq::ctx_t::send_command(unsigned int, zmq::command_t const&) src/ctx.cpp:438 (libzmq.so.5+0x000000018420)
    #3 zmq::object_t::send_command(zmq::command_t&) src/object.cpp:474 (libzmq.so.5+0x00000005c07b)
    #4 zmq::object_t::send_plug(zmq::own_t*, bool) src/object.cpp:220 (libzmq.so.5+0x00000005a7ef)
    #5 zmq::own_t::launch_child(zmq::own_t*) src/own.cpp:87 (libzmq.so.5+0x00000006134f)
    #6 zmq::socket_base_t::add_endpoint(char const*, zmq::own_t*, zmq::pipe_t*) src/socket_base.cpp:1006 (libzmq.so.5+0x00000008e081)
    #7 zmq::socket_base_t::bind(char const*) src/socket_base.cpp:630 (libzmq.so.5+0x00000008be89)
    #8 zmq_bind src/zmq.cpp:326 (libzmq.so.5+0x0000000c8735)
    ...


I tried forcing the use of the C++11 internal atomics, and all the warnings
about zmq::atomic_ptr_t<> disappear (I still get tons about ypipe and
"pipe"). Maybe it's best to prefer C++11 atomics when available?
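
For comparison, here is a minimal sketch of my own (names invented; this is
not libzmq's actual src/atomic_ptr.hpp) of the kind of atomic pointer the
ZMQ_ATOMIC_PTR_CXX11 path boils down to. ThreadSanitizer understands
std::atomic natively, which would explain why those warnings vanish:

    #include <atomic>

    //  Hypothetical sketch of an atomic pointer built on C++11 std::atomic,
    //  in the spirit of the ZMQ_ATOMIC_PTR_CXX11 code path.
    template <typename T> class atomic_ptr_sketch
    {
      public:
        atomic_ptr_sketch () : _ptr (nullptr) {}

        //  Unconditionally swap in a new pointer; return the old one.
        T *xchg (T *val_)
        {
            return _ptr.exchange (val_, std::memory_order_acq_rel);
        }

        //  Compare-and-swap: store val_ only if the current value is cmp_.
        //  Either way, return the value observed before the operation.
        T *cas (T *cmp_, T *val_)
        {
            _ptr.compare_exchange_strong (cmp_, val_,
                                          std::memory_order_acq_rel);
            return cmp_;
        }

      private:
        std::atomic<T *> _ptr;
    };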

Btw, I will try to write a suppression file for ThreadSanitizer and ZMQ... I'm
just starting to wonder: is all the ZMQ code really race-free? :)
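
As a first cut, I imagine something along these lines (the file name and the
patterns are my own guesses based on the traces above, not an official list),
loaded with TSAN_OPTIONS="suppressions=tsan.supp":

    # tsan.supp: hypothetical starting point for the libzmq internals
    race:zmq::atomic_counter_t
    race:zmq::atomic_ptr_t
    race:zmq::ypipe_t
    # or, more coarsely, ignore every race reported from inside the
    # library, as the TSAN wiki example does:
    called_from_lib:libzmq.so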


Thanks,
Francesco




2018-02-24 16:44 GMT+01:00 Francesco <francesco.montorsi at gmail.com>:

> Hi Bill,
> indeed I've found ThreadSanitizer to be more effective, i.e. it produces
> far fewer false positives than Helgrind... still, I'm getting several
> race-condition warnings out of libzmq 4.2.3. Indeed, I built only my
> application code with -fsanitize=thread... do you know if I need to
> rebuild libzmq with that option as well? I will try and see if it makes
> any difference!
>
> Thanks,
> Francesco
>
>
>
>
> 2018-02-24 15:17 GMT+01:00 Bill Torpey <wallstprog at gmail.com>:
>
>> I’m using clang’s Thread Sanitizer for a similar purpose, and just
>> happened to notice that the TSAN docs use ZeroMQ as one of the example
>> suppressions:
>> https://github.com/google/sanitizers/wiki/ThreadSanitizerSuppressions
>>
>> I assume that the reason for suppressing libzmq.so has to do with
>> (legacy) sockets not being thread-safe, so the code may exhibit race
>> conditions that are irrelevant given that the code is not intended to be
>> called from multiple threads.
>>
>> FWIW, you may want to check out the clang sanitizers: they have certain
>> advantages over valgrind (faster, multi-threaded, etc.) if you are able
>> to instrument the code at build time.
>>
>>
>> On Feb 23, 2018, at 6:52 AM, Luca Boccassi <luca.boccassi at gmail.com>
>> wrote:
>>
>> On Fri, 2018-02-23 at 12:22 +0100, Francesco wrote:
>>
>> Hi all,
>> I'm trying to further debug the problem I described in my earlier mail
>> (https://lists.zeromq.org/pipermail/zeromq-dev/2018-February/032303.html),
>> so I decided to use Helgrind to find race conditions in my code.
>>
>> My problem is that apparently Helgrind 3.12.0 is reporting race
>> conditions against the zmq::atomic_ptr_t<> implementation.
>> Now, I know that Helgrind has trouble with C++11 atomics, but looking
>> at the code I see that ZMQ is not using them (note: I do have
>> ZMQ_ATOMIC_PTR_CXX11 defined, but I also have ZMQ_ATOMIC_PTR_INTRINSIC
>> defined, so the latter wins!).
>>
>> In particular, Helgrind 3.12.0 tells me that:
>>
>>
>> ==00:00:00:11.885 29399==
>> ==00:00:00:11.885 29399== Possible data race during read of size 8 at 0xB373BF0 by thread #4
>> ==00:00:00:11.885 29399== Locks held: none
>> ==00:00:00:11.885 29399==    at 0x6BD79AB: zmq::atomic_ptr_t<zmq::command_t>::cas(zmq::command_t*, zmq::command_t*) (atomic_ptr.hpp:150)
>> ==00:00:00:11.885 29399==    by 0x6BD7874: zmq::ypipe_t<zmq::command_t, 16>::check_read() (ypipe.hpp:147)
>> ==00:00:00:11.885 29399==    by 0x6BD7288: zmq::ypipe_t<zmq::command_t, 16>::read(zmq::command_t*) (ypipe.hpp:165)
>> ==00:00:00:11.885 29399==    by 0x6BD6FE7: zmq::mailbox_t::recv(zmq::command_t*, int) (mailbox.cpp:98)
>> ==00:00:00:11.885 29399==    by 0x6BD29FC: zmq::io_thread_t::in_event() (io_thread.cpp:81)
>> ==00:00:00:11.885 29399==    by 0x6BD05C1: zmq::epoll_t::loop() (epoll.cpp:188)
>> ==00:00:00:11.885 29399==    by 0x6BD06C3: zmq::epoll_t::worker_routine(void*) (epoll.cpp:203)
>> ==00:00:00:11.885 29399==    by 0x6C18BA5: thread_routine (thread.cpp:109)
>> ==00:00:00:11.885 29399==    by 0x4C2F837: mythread_wrapper (hg_intercepts.c:389)
>> ==00:00:00:11.885 29399==    by 0x6E72463: start_thread (pthread_create.c:334)
>> ==00:00:00:11.885 29399==    by 0x92F901C: clone (clone.S:109)
>> ==00:00:00:11.885 29399==
>> ==00:00:00:11.885 29399== This conflicts with a previous write of size 8 by thread #2
>> ==00:00:00:11.885 29399== Locks held: 1, at address 0xB373C08
>> ==00:00:00:11.885 29399==    at 0x6BD77F4: zmq::atomic_ptr_t<zmq::command_t>::set(zmq::command_t*) (atomic_ptr.hpp:90)
>> ==00:00:00:11.885 29399==    by 0x6BD7422: zmq::ypipe_t<zmq::command_t, 16>::flush() (ypipe.hpp:125)
>> ==00:00:00:11.885 29399==    by 0x6BD6DF5: zmq::mailbox_t::send(zmq::command_t const&) (mailbox.cpp:63)
>> ==00:00:00:11.885 29399==    by 0x6BB9128: zmq::ctx_t::send_command(unsigned int, zmq::command_t const&) (ctx.cpp:438)
>> ==00:00:00:11.885 29399==    by 0x6BE34CE: zmq::object_t::send_command(zmq::command_t&) (object.cpp:474)
>> ==00:00:00:11.885 29399==    by 0x6BE26F8: zmq::object_t::send_plug(zmq::own_t*, bool) (object.cpp:220)
>> ==00:00:00:11.885 29399==    by 0x6BE68E2: zmq::own_t::launch_child(zmq::own_t*) (own.cpp:87)
>> ==00:00:00:11.885 29399==    by 0x6C03D6C: zmq::socket_base_t::add_endpoint(char const*, zmq::own_t*, zmq::pipe_t*) (socket_base.cpp:1006)
>> ==00:00:00:11.885 29399==  Address 0xb373bf0 is 128 bytes inside a block of size 224 alloc'd
>> ==00:00:00:11.885 29399==    at 0x4C2A6FD: operator new(unsigned long, std::nothrow_t const&) (vg_replace_malloc.c:376)
>> ==00:00:00:11.885 29399==    by 0x6BB8B8D: zmq::ctx_t::create_socket(int) (ctx.cpp:351)
>> ==00:00:00:11.885 29399==    by 0x6C284D5: zmq_socket (zmq.cpp:267)
>> ==00:00:00:11.885 29399==    by 0x6143809: ZmqClientSocket::Config(PubSubSocketConfig const&) (ZmqRequestReply.cpp:303)
>> ==00:00:00:11.885 29399==    by 0x6144069: ZmqClientMultiSocket::Config(PubSubSocketConfig const&) (ZmqRequestReply.cpp:407)
>> ==00:00:00:11.885 29399==    by 0x61684EF: client_thread_main(void*) (ZmqRequestReplyUnitTests.cpp:132)
>> ==00:00:00:11.886 29399==    by 0x4C2F837: mythread_wrapper (hg_intercepts.c:389)
>> ==00:00:00:11.886 29399==    by 0x6E72463: start_thread (pthread_create.c:334)
>> ==00:00:00:11.886 29399==    by 0x92F901C: clone (clone.S:109)
>> ==00:00:00:11.886 29399==  Block was alloc'd by thread #2
>>
>>
>> Is this a known (and ignorable) issue with zmq::atomic_ptr_t<>?
>>
>> Thanks,
>> Francesco
>>
>>
>> Yeah, I started trying to put together a suppression file but never
>> finished it:
>>
>> https://github.com/bluca/libzmq/commit/fb9ee9da7631f9506cbfcd6db29a284ae6e9651e
>>
>> I hope to have time to finish working on it eventually (feel free to
>> contribute!). It's very noisy right now, since the tool can't know about
>> our lock-free queue implementation without a custom suppression file.
>>
>> --
>> Kind regards,
>> Luca Boccassi
>