[zeromq-dev] inproc: passing an object pointer between threads causing datarace?? (3.2.2 stable release)
Kah-Chan Low
kahchanlow at yahoo.com
Tue Nov 27 22:23:28 CET 2012
Earlier this year someone floated the idea of passing an object pointer between threads over the "inproc" transport, and there was broad agreement that it could be done without problems:
http://comments.gmane.org/gmane.network.zeromq.devel/7300
Right now I am in a similar situation: I don't want to sacrifice speed by needlessly serializing and de-serializing my objects.
However, I am now very worried because Valgrind flags data race errors in my code where I pass pointers around.
I wrote the following simple program as an illustration:
#include <boost/thread.hpp>
#include <boost/bind.hpp>
#include <algorithm>
#include <zmq.hpp>
#include <zmq_utils.h>
#include <iostream>
using namespace std;

void dealer2router(zmq::context_t& context, size_t limit)
{
    zmq::socket_t skDealer(context, ZMQ_DEALER);
    int hwm=0;
    skDealer.setsockopt(ZMQ_SNDHWM, &hwm, sizeof(hwm));
    skDealer.connect("inproc://i_love_zmq");
    for (size_t i=0; i<limit; i++) {
        int32_t *n = new int32_t; // this line is pass_pointer:19
        *n = 4;
        *n += 2; // assign value of *n to 6, then
        skDealer.send(&n, sizeof(int32_t*)); // send the pointer
    }
}

void router_do_smth(zmq::context_t& context, size_t limit)
{
    zmq::socket_t skRouter(context, ZMQ_ROUTER);
    int hwm=0;
    skRouter.setsockopt(ZMQ_RCVHWM, &hwm, sizeof(hwm));
    skRouter.bind("inproc://i_love_zmq");
    for (size_t i=0; i<limit; i++) {
        int32_t *n;
        zmq::message_t msg;
        skRouter.recv(&msg); // first frame: identity of the DEALER peer
        skRouter.recv(&n, sizeof(n)); // second frame: the pointer itself
        cout << "I got " << *n << '\n'; // this line is pass_pointer:38
        delete n;
    }
}

int main (int argc, char *argv[])
{
    zmq::context_t context(1);
    boost::thread_group threads;
    const size_t iter=100;
    threads.create_thread(boost::bind(router_do_smth, boost::ref(context), iter));
    zmq_sleep(5);
    threads.create_thread(boost::bind(dealer2router, boost::ref(context), iter));
    threads.join_all();
}
Valgrind (drd) issues the following data race errors:
==29337== Conflicting load by thread 2 at 0x04c8ad50 size 4
==29337== at 0x411B0E: router_do_smth(zmq::context_t&, unsigned long) (pass_pointer.cxx:38)
==29337== by 0x418083: thread_proxy (in /home/user/klow/zmqMR/zmqmr/bin/64-rhel6/pass_pointer)
==29337== by 0x4A12470: vgDrd_thread_wrapper (drd_pthread_intercepts.c:281)
==29337== by 0x3ABFE07850: start_thread (in /lib64/libpthread-2.12.so)
==29337== by 0x3ABFAE767C: clone (in /lib64/libc-2.12.so)
==29337== Address 0x4c8ad50 is at offset 0 from 0x4c8ad50. Allocation context:
==29337== at 0x4A0893E: operator new(unsigned long) (vg_replace_malloc.c:261)
==29337== by 0x411E25: dealer2router(zmq::context_t&, unsigned long) (pass_pointer.cxx:19)
==29337== by 0x418083: thread_proxy (in /home/user/klow/zmqMR/zmqmr/bin/64-rhel6/pass_pointer)
==29337== by 0x4A12470: vgDrd_thread_wrapper (drd_pthread_intercepts.c:281)
==29337== by 0x3ABFE07850: start_thread (in /lib64/libpthread-2.12.so)
==29337== by 0x3ABFAE767C: clone (in /lib64/libc-2.12.so)
==29337== Other segment start (thread 5)
==29337== at 0x4A13613: pthread_mutex_unlock (drd_pthread_intercepts.c:640)
==29337== by 0x422999: zmq::mailbox_t::send(zmq::command_t const&) (mutex.hpp:101)
==29337== by 0x423B83: zmq::object_t::send_bind(zmq::own_t*, zmq::pipe_t*, bool) (object.cpp:211)
==29337== by 0x4290C1: zmq::socket_base_t::connect(char const*) (socket_base.cpp:476)
==29337== by 0x411DF4: dealer2router(zmq::context_t&, unsigned long) (zmq.hpp:341)
==29337== by 0x418083: thread_proxy (in /home/user/klow/zmqMR/zmqmr/bin/64-rhel6/pass_pointer)
==29337== by 0x4A12470: vgDrd_thread_wrapper (drd_pthread_intercepts.c:281)
==29337== by 0x3ABFE07850: start_thread (in /lib64/libpthread-2.12.so)
==29337== by 0x3ABFAE767C: clone (in /lib64/libc-2.12.so)
==29337== Other segment end (thread 5)
==29337== at 0x4A14650: pthread_mutex_lock (drd_pthread_intercepts.c:587)
==29337== by 0x422909: zmq::mailbox_t::send(zmq::command_t const&) (mutex.hpp:95)
==29337== by 0x423B28: zmq::object_t::send_reap(zmq::socket_base_t*) (object.cpp:290)
==29337== by 0x427435: zmq::socket_base_t::close() (socket_base.cpp:758)
==29337== by 0x41E40F: zmq_close (zmq.cpp:251)
==29337== by 0x411EC7: dealer2router(zmq::context_t&, unsigned long) (zmq.hpp:311)
==29337== by 0x418083: thread_proxy (in /home/user/klow/zmqMR/zmqmr/bin/64-rhel6/pass_pointer)
==29337== by 0x4A12470: vgDrd_thread_wrapper (drd_pthread_intercepts.c:281)
==29337== by 0x3ABFE07850: start_thread (in /lib64/libpthread-2.12.so)
==29337== by 0x3ABFAE767C: clone (in /lib64/libc-2.12.so)
This error cannot be suppressed by building zmq with CPPFLAGS=-DZMQ_MAKE_VALGRIND_HAPPY.
I am using the v3.2.2 stable release.
I have two questions:
1. I don't understand the error message above: how does accessing the value of variable n have anything to do with the internals of ZMQ?
2. Is it possible for the compiler to rearrange the machine instructions such that the assignments to *n (i.e. *n=4; *n += 2;) occur after the message has been sent and received on the other thread? In that case, is it possible for the program to output any value other than '6' for *n?
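To make question 2 concrete, here is a minimal sketch of the same hand-off with ZMQ taken out of the picture. It is only an illustration: PointerQueue and run_handoff are hypothetical names I made up, it uses C++11 std::thread rather than Boost, and a mutex-protected queue stands in for the inproc pipe. My understanding is that in this version the unlock in push() and the lock in pop() order the writes to *n before the read, so the receiver can only ever see 6. What I am asking is whether the inproc transport gives me an equivalent guarantee.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for the inproc pipe: a mutex-protected queue of pointers.
struct PointerQueue {
    std::mutex m;
    std::condition_variable cv;
    std::queue<int32_t*> q;

    void push(int32_t* p) {
        {
            std::lock_guard<std::mutex> lock(m);
            q.push(p);
        }   // unlocking here publishes the writes made before push()
        cv.notify_one();
    }

    int32_t* pop() {
        std::unique_lock<std::mutex> lock(m);
        cv.wait(lock, [this] { return !q.empty(); });
        int32_t* p = q.front();
        q.pop();
        return p;
    }
};

// Sends `limit` heap-allocated integers from one thread to another and
// returns the values the receiving thread observed.
std::vector<int32_t> run_handoff(std::size_t limit) {
    PointerQueue pq;
    std::vector<int32_t> seen;

    std::thread consumer([&] {
        for (std::size_t i = 0; i < limit; i++) {
            int32_t* n = pq.pop();
            seen.push_back(*n);   // read happens after the lock in pop()
            delete n;
        }
    });
    std::thread producer([&] {
        for (std::size_t i = 0; i < limit; i++) {
            int32_t* n = new int32_t;
            *n = 4;
            *n += 2;              // *n is 6 before the pointer is handed over
            pq.push(n);
        }
    });

    producer.join();
    consumer.join();
    assert(pq.q.empty());         // everything sent was also received
    return seen;
}
```

Calling run_handoff(100) from main() and printing the returned values corresponds to the "I got ..." lines in my ZMQ program above.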
Thanks!
kc