[zeromq-dev] are zmq::atomic_ptr_t<> Helgrind warnings known?

Luca Boccassi luca.boccassi at gmail.com
Sat Feb 24 23:20:28 CET 2018


There are a couple in the use/setting of a handful of socket options,
but nothing major. The pipes and queues are fine.

On Sat, 2018-02-24 at 17:34 +0100, Francesco wrote:
> Actually even building zeromq with -fsanitize=thread I still get a
> lot of
> data races about ZMQ atomics:
> 
> WARNING: ThreadSanitizer: data race (pid=25770)
>   Atomic write of size 4 at 0x7d4000017e48 by thread T4:
>     #0 __tsan_atomic32_fetch_add
> ../../../../gcc-5.3.0/libsanitizer/tsan/tsan_interface_atomic.cc:611
> (libtsan.so.0+0x00000005860f)
>     #1 zmq::atomic_counter_t::add(unsigned int)
> src/atomic_counter.hpp:116
> (libzmq.so.5+0x00000001a087)
>     #2 zmq::poller_base_t::adjust_load(int) src/poller_base.cpp:53
> (libzmq.so.5+0x00000006d20d)
>     #3 zmq::epoll_t::add_fd(int, zmq::i_poll_events*)
> src/epoll.cpp:91
> (libzmq.so.5+0x00000003c229)
>     #4 zmq::io_object_t::add_fd(int) src/io_object.cpp:66
> (libzmq.so.5+0x0000000400ef)
>     #5 zmq::tcp_connecter_t::start_connecting()
> src/tcp_connecter.cpp:204
> (libzmq.so.5+0x0000000abf30)
>     #6 zmq::tcp_connecter_t::process_plug() src/tcp_connecter.cpp:94
> (libzmq.so.5+0x0000000ab6cb)
>     #7 zmq::object_t::process_command(zmq::command_t&)
> src/object.cpp:90
> (libzmq.so.5+0x000000059e37)
>     #8 zmq::io_thread_t::in_event() src/io_thread.cpp:85
> (libzmq.so.5+0x000000040cb3)
>     #9 zmq::epoll_t::loop() src/epoll.cpp:188
> (libzmq.so.5+0x00000003ce6d)
>     #10 zmq::epoll_t::worker_routine(void*) src/epoll.cpp:203
> (libzmq.so.5+0x00000003cfb6)
>     #11 thread_routine src/thread.cpp:109
> (libzmq.so.5+0x0000000af00b)
> 
>   Previous read of size 4 at 0x7d4000017e48 by thread T2:
>     #0 zmq::atomic_counter_t::get() const src/atomic_counter.hpp:210
> (libzmq.so.5+0x000000061d20)
>     #1 zmq::poller_base_t::get_load() src/poller_base.cpp:47
> (libzmq.so.5+0x00000006d1c6)
>     #2 zmq::io_thread_t::get_load() src/io_thread.cpp:72
> (libzmq.so.5+0x000000040bef)
>     #3 zmq::ctx_t::choose_io_thread(unsigned long) src/ctx.cpp:451
> (libzmq.so.5+0x0000000184ff)
>     #4 zmq::object_t::choose_io_thread(unsigned long)
> src/object.cpp:199
> (libzmq.so.5+0x00000005a67f)
>     #5 zmq::socket_base_t::bind(char const*) src/socket_base.cpp:610
> (libzmq.so.5+0x00000008bca6)
>     #6 zmq_bind src/zmq.cpp:326 (libzmq.so.5+0x0000000c8735)
>    ....
> 
> or about ypipe:
> 
> WARNING: ThreadSanitizer: data race (pid=25770)
>   Read of size 8 at 0x7d640001e0c0 by thread T4:
>     #0 zmq::ypipe_t<zmq::command_t, 16>::read(zmq::command_t*)
> src/ypipe.hpp:170 (libzmq.so.5+0x000000047e49)
>     #1 zmq::mailbox_t::recv(zmq::command_t*, int) src/mailbox.cpp:73
> (libzmq.so.5+0x000000047793)
>     #2 zmq::io_thread_t::in_event() src/io_thread.cpp:86
> (libzmq.so.5+0x000000040cd5)
>     #3 zmq::epoll_t::loop() src/epoll.cpp:188
> (libzmq.so.5+0x00000003ce6d)
>     #4 zmq::epoll_t::worker_routine(void*) src/epoll.cpp:203
> (libzmq.so.5+0x00000003cfb6)
>     #5 thread_routine src/thread.cpp:109 (libzmq.so.5+0x0000000af00b)
> 
>   Previous write of size 8 at 0x7d640001e0c0 by thread T2 (mutexes:
> write
> M500):
>     #0 zmq::ypipe_t<zmq::command_t, 16>::write(zmq::command_t const&,
> bool)
> src/ypipe.hpp:85 (libzmq.so.5+0x000000047f05)
>     #1 zmq::mailbox_t::send(zmq::command_t const&) src/mailbox.cpp:62
> (libzmq.so.5+0x0000000476dc)
>     #2 zmq::ctx_t::send_command(unsigned int, zmq::command_t const&)
> src/ctx.cpp:438 (libzmq.so.5+0x000000018420)
>     #3 zmq::object_t::send_command(zmq::command_t&)
> src/object.cpp:474
> (libzmq.so.5+0x00000005c07b)
>     #4 zmq::object_t::send_plug(zmq::own_t*, bool) src/object.cpp:220
> (libzmq.so.5+0x00000005a7ef)
>     #5 zmq::own_t::launch_child(zmq::own_t*) src/own.cpp:87
> (libzmq.so.5+0x00000006134f)
>     #6 zmq::socket_base_t::add_endpoint(char const*, zmq::own_t*,
> zmq::pipe_t*) src/socket_base.cpp:1006 (libzmq.so.5+0x00000008e081)
>     #7 zmq::socket_base_t::bind(char const*) src/socket_base.cpp:630
> (libzmq.so.5+0x00000008be89)
>     #8 zmq_bind src/zmq.cpp:326 (libzmq.so.5+0x0000000c8735)
>     ...
> 
> 
> I tried forcing use of C++11 internal atomics and all the warnings
> about  zmq::atomic_ptr_t<> disappear (I still get tons about ypipe
> and
> "pipe")..
> maybe it's best to prefer C++11 atomics when available?
> 
> Btw I will try to write a suppression file for ThreadSanitizer and
> ZMQ... I
> just start to doubt: is all ZMQ code really race-free? :)
> 
> 
> Thanks,
> Francesco
> 
> 
> 
> 
> 2018-02-24 16:44 GMT+01:00 Francesco <francesco.montorsi at gmail.com>:
> 
> > Hi Bill,
> > indeed I've found ThreadSanitizer to be more effective i.e., it
> > contains
> > much less false positives compared to Helgrind... still I'm getting
> > several
> > race condition warnings out of libzmq 4.2.3... indeed I actually
> > built only
> > my application code with -fsanitize=thread... do you know if I need
> > to
> > rebuild libzmq with that option as well? I will try to see if it
> > makes any
> > difference!
> > 
> > Thanks,
> > Francesco
> > 
> > 
> > 
> > 
> > 2018-02-24 15:17 GMT+01:00 Bill Torpey <wallstprog at gmail.com>:
> > 
> > > I’m using clang’s Thread Sanitizer for a similar purpose, and
> > > just
> > > happened to notice that the TSAN docs use ZeroMQ as one of the
> > > example
> > > suppressions:  https://github.com/google/san
> > > itizers/wiki/ThreadSanitizerSuppressions
> > > 
> > > I assume that the reason for suppressing libzmq.so has to do with
> > > (legacy) sockets not being thread-safe, so the code may exhibit
> > > race
> > > conditions that are irrelevant given that the code is not
> > > intended to be
> > > called from multiple threads.
> > > 
> > > FWIW, you may want to check out the clang sanitizers — they have
> > > certain
> > > advantages over valgrind — (faster, multi-threaded, etc.) if you
> > > are able
> > > to instrument the code at build time.
> > > 
> > > 
> > > On Feb 23, 2018, at 6:52 AM, Luca Boccassi <luca.boccassi at gmail.c
> > > om>
> > > wrote:
> > > 
> > > On Fri, 2018-02-23 at 12:22 +0100, Francesco wrote:
> > > 
> > > Hi all,
> > > I'm trying to further debug the problem I described in my earlier
> > > mail (
> > > https://lists.zeromq.org/pipermail/zeromq-dev/2018-February/03230
> > > 3.ht
> > > ml) so
> > > I decided to use Helgrind to find race conditions in my code.
> > > 
> > > My problem is that apparently Helgrind 3.12.0 is reporting race
> > > conditions
> > > against zmq::atomic_ptr_t<> implementation.
> > > Now I know that Helgrind has troubles with C++11 atomics but by
> > > looking at
> > > the code I see that ZMQ is not using them (note: I do have
> > > ZMQ_ATOMIC_PTR_CXX11 defined but I also have
> > > ZMQ_ATOMIC_PTR_INTRINSIC
> > > defined, so the latter wins!).
> > > 
> > > In particular Helgrind 3.12.0 tells me that:
> > > 
> > > 
> > > ==00:00:00:11.885 29399==
> > > ==00:00:00:11.885 29399== *Possible data race during read of size
> > > 8
> > > at
> > > 0xB373BF0 by thread #4*
> > > ==00:00:00:11.885 29399== Locks held: none
> > > ==00:00:00:11.885 29399==    at 0x6BD79AB:
> > > *zmq::atomic_ptr_t<zmq::command_t>::cas*(zmq::command_t*,
> > > zmq::command_t*)
> > > (atomic_ptr.hpp:150)
> > > ==00:00:00:11.885 29399==    by 0x6BD7874:
> > > zmq::ypipe_t<zmq::command_t,
> > > 16>::check_read() (ypipe.hpp:147)
> > > ==00:00:00:11.885 29399==    by 0x6BD7288:
> > > zmq::ypipe_t<zmq::command_t,
> > > 16>::read(zmq::command_t*) (ypipe.hpp:165)
> > > ==00:00:00:11.885 29399==    by 0x6BD6FE7:
> > > zmq::mailbox_t::recv(zmq::command_t*, int) (mailbox.cpp:98)
> > > ==00:00:00:11.885 29399==    by 0x6BD29FC:
> > > zmq::io_thread_t::in_event()
> > > (io_thread.cpp:81)
> > > ==00:00:00:11.885 29399==    by 0x6BD05C1: zmq::epoll_t::loop()
> > > (epoll.cpp:188)
> > > ==00:00:00:11.885 29399==    by 0x6BD06C3:
> > > zmq::epoll_t::worker_routine(void*) (epoll.cpp:203)
> > > ==00:00:00:11.885 29399==    by 0x6C18BA5: thread_routine
> > > (thread.cpp:109)
> > > ==00:00:00:11.885 29399==    by 0x4C2F837: mythread_wrapper
> > > (hg_intercepts.c:389)
> > > ==00:00:00:11.885 29399==    by 0x6E72463: start_thread
> > > (pthread_create.c:334)
> > > ==00:00:00:11.885 29399==    by 0x92F901C: clone (clone.S:109)
> > > ==00:00:00:11.885 29399==
> > > ==00:00:00:11.885 29399== This conflicts with a previous write of
> > > size 8 by
> > > thread #2
> > > ==00:00:00:11.885 29399== Locks held: 1, at address 0xB373C08
> > > ==00:00:00:11.885 29399==    at 0x6BD77F4:
> > > *zmq::atomic_ptr_t<zmq::command_t>::set*(zmq::command_t*)
> > > (atomic_ptr.hpp:90)
> > > ==00:00:00:11.885 29399==    by 0x6BD7422:
> > > zmq::ypipe_t<zmq::command_t,
> > > 16>::flush() (ypipe.hpp:125)
> > > ==00:00:00:11.885 29399==    by 0x6BD6DF5:
> > > zmq::mailbox_t::send(zmq::command_t const&) (mailbox.cpp:63)
> > > ==00:00:00:11.885 29399==    by 0x6BB9128:
> > > zmq::ctx_t::send_command(unsigned int, zmq::command_t const&)
> > > (ctx.cpp:438)
> > > ==00:00:00:11.885 29399==    by 0x6BE34CE:
> > > zmq::object_t::send_command(zmq::command_t&) (object.cpp:474)
> > > ==00:00:00:11.885 29399==    by 0x6BE26F8:
> > > zmq::object_t::send_plug(zmq::own_t*, bool) (object.cpp:220)
> > > ==00:00:00:11.885 29399==    by 0x6BE68E2:
> > > zmq::own_t::launch_child(zmq::own_t*) (own.cpp:87)
> > > ==00:00:00:11.885 29399==    by 0x6C03D6C:
> > > zmq::socket_base_t::add_endpoint(char const*, zmq::own_t*,
> > > zmq::pipe_t*)
> > > (socket_base.cpp:1006)
> > > ==00:00:00:11.885 29399==  Address 0xb373bf0 is 128 bytes inside
> > > a
> > > block of
> > > size 224 alloc'd
> > > ==00:00:00:11.885 29399==    at 0x4C2A6FD: operator new(unsigned
> > > long,
> > > std::nothrow_t const&) (vg_replace_malloc.c:376)
> > > ==00:00:00:11.885 29399==    by 0x6BB8B8D:
> > > zmq::ctx_t::create_socket(int)
> > > (ctx.cpp:351)
> > > ==00:00:00:11.885 29399==    by 0x6C284D5: zmq_socket
> > > (zmq.cpp:267)
> > > ==00:00:00:11.885 29399==    by 0x6143809:
> > > ZmqClientSocket::Config(PubSubSocketConfig const&)
> > > (ZmqRequestReply.cpp:303)
> > > ==00:00:00:11.885 29399==    by 0x6144069:
> > > ZmqClientMultiSocket::Config(PubSubSocketConfig const&)
> > > (ZmqRequestReply.cpp:407)
> > > ==00:00:00:11.885 29399==    by 0x61684EF:
> > > client_thread_main(void*)
> > > (ZmqRequestReplyUnitTests.cpp:132)
> > > ==00:00:00:11.886 29399==    by 0x4C2F837: mythread_wrapper
> > > (hg_intercepts.c:389)
> > > ==00:00:00:11.886 29399==    by 0x6E72463: start_thread
> > > (pthread_create.c:334)
> > > ==00:00:00:11.886 29399==    by 0x92F901C: clone (clone.S:109)
> > > ==00:00:00:11.886 29399==  Block was alloc'd by thread #2
> > > 
> > > 
> > > Is this a known (and ignorable) issue with  zmq::atomic_ptr_t<>?
> > > 
> > > Thanks,
> > > Francesco
> > > 
> > > 
> > > Yeah I started trying to put together a suppression file but
> > > never
> > > finished it:
> > > 
> > > https://github.com/bluca/libzmq/commit/fb9ee9da7631f9506cbfc
> > > d6db29a284ae6e9651e
> > > 
> > > Hope to have time to finish working on it eventually (feel free
> > > to
> > > contribute!) as it's very noisy right now, as it can't know about
> > > our
> > > lock-free queue implementation without the custom suppression
> > > file
> > > 
> > > --
> > > Kind regards,
> > > Luca Boccassi_______________________________________________
> > > zeromq-dev mailing list
> > > zeromq-dev at lists.zeromq.org
> > > https://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > 
> > > 
> > > 
> > > _______________________________________________
> > > zeromq-dev mailing list
> > > zeromq-dev at lists.zeromq.org
> > > https://lists.zeromq.org/mailman/listinfo/zeromq-dev
> > > 
> > > 
> 
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> https://lists.zeromq.org/mailman/listinfo/zeromq-dev

-- 
Kind regards,
Luca Boccassi
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: This is a digitally signed message part
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20180224/8f80b9d8/attachment.sig>


More information about the zeromq-dev mailing list