[zeromq-dev] Race condition in cppzmq zmq::socket_t::close() / cppzmq tag?

Eric Pederson ericacm at gmail.com
Thu Dec 15 17:32:45 CET 2016


>
> Note that as the documentation also says you must close the sockets before
> destroying the context, not after


If I closed sockets that were being used in another thread, then I would
definitely be in danger of race conditions.

The documentation says to close the sockets in your thread, then close the
context.  Then catch the error in the other thread and close the sockets in
that thread.

The section:

First, do not try to use the same socket from multiple threads. Please
> don't explain why you think this would be excellent fun, just please don't
> do it. Next, you need to shut down each socket that has ongoing requests.
> The proper way is to set a low LINGER value (1 second), and then close the
> socket. If your language binding doesn't do this for you automatically when
> you destroy a context, I'd suggest sending a patch.
>


> Finally, destroy the context. This will cause any blocking receives or
> polls or sends in attached threads (i.e., which share the same context) to
> return with an error. Catch that error, and then set linger on, and *close
> sockets in that thread*, and exit.


So I'm doing the dance exactly as the documentation suggests.  There is no
getting around the fact that the destructor will run in a different thread
from the one some sockets were used in.  But as long as you join that
thread first, so its sockets are already closed, it should be safe to run
the destructor.

All that said, it's probably a good idea to make close() atomic, to make
life safer.  I read that thread-safe sockets (ZMQ_CLIENT and ZMQ_SERVER)
are on the horizon?  An atomic close would be useful for those.


-- Eric

On Thu, Dec 15, 2016 at 2:39 AM, Eric Pederson <ericacm at gmail.com> wrote:

> Right - normally I would not access a socket from two threads.
>
> The scenario is unit testing. We have a class that encapsulates a set of
> sockets and also a thread with a loop that polls the sockets.
>
> When the test is over, I do the following to clean it up:  close the
> context so the poll blocked in the thread throws an exception with ETERM.
> I close the sockets after catching the exception and returning from the
> thread.  The destructor joins the thread so the object doesn't get
> destroyed before the thread is finished.  Then the destructor continues and
> the socket members get destroyed.  All this works fine normally because the
> sockets are usually closed before the socket destructor runs.  But
> sometimes the closing in the thread would race with the destruction.
>
> I found that it was because some tests were detaching the thread instead
> of joining it.  So it looks like we finally have a nice deterministic
> shutdown process.
>
> Thanks,
>
>
> That could only happen if the same socket was used or managed from multiple
>> threads. As the documentation clearly states, sockets are intentionally
>> not
>> thread safe and must not be used from multiple threads.
>
>
> -- Eric
>
> On Wed, Dec 14, 2016 at 2:24 AM, Eric Pederson <ericacm at gmail.com> wrote:
>
>> I ran into a race condition in zmq::socket_t::close() in cppzmq:
>>
>>
>> inline void close() ZMQ_NOTHROW
>> {
>>     if (ptr == NULL)
>>         // already closed
>>         return;
>>
>>     // Someone can sneak in here and close the socket
>>     int rc = zmq_close (ptr);
>>     ZMQ_ASSERT (rc == 0);
>>     ptr = 0;
>> }
>>
>> I worked around it using compare-and-swap to atomically check if it's
>> already closed. Does that kind of fix sound reasonable for
>> zmq::socket_t::close()?
>>
>>
>> Also I wanted to follow up on my previous question about tagging cppzmq.
>> I think that conversation got lost in the list shutdown that happened
>> recently.  Does it make sense to regularly tag the cppzmq repo with the
>> corresponding ZeroMQ version?
>>
>> Thanks,
>>
>> -- Eric
>>
>
>

