[zeromq-dev] Dropped messages when not sleeping, using 0MQ 2.1.x

James Cipar jcipar at cmu.edu
Sun Sep 18 17:52:42 CEST 2011

I'm using PUSH/PULL sockets in a system where I have many pushers connected to a single puller.  The pushers connect, send a few hundred messages, then send a special "end" message and close the socket.  The puller counts how many "end" messages it has received, and once it has received one from every pusher, it quits.
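
To make the termination rule concrete, here is a minimal sketch of the puller's counting logic with the 0MQ calls left out.  The names (kEndMarker, PullResult, run_puller) and the literal "end" payload are assumptions for illustration, not the real code; the vector stands in for whatever the PULL socket actually delivers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical payload for the special "end" message (an assumption).
const std::string kEndMarker = "end";

struct PullResult {
    std::size_t data_messages = 0;  // ordinary messages counted
    std::size_t end_messages = 0;   // "end" markers counted
    bool complete = false;          // true once every pusher's marker was seen
};

// Counts messages in arrival order and stops as soon as one "end" marker
// per pusher has been seen. `incoming` models the messages delivered to
// the puller; a real puller would block on recv() instead of running out.
PullResult run_puller(std::size_t expected_pushers,
                      const std::vector<std::string>& incoming) {
    PullResult r;
    for (const std::string& msg : incoming) {
        if (r.end_messages == expected_pushers) break;  // all pushers done
        if (msg == kEndMarker) {
            ++r.end_messages;
        } else {
            ++r.data_messages;
        }
    }
    r.complete = (r.end_messages == expected_pushers);
    return r;
}
```

With two pushers, run_puller(2, {"a", "end", "b", "end"}) finishes with complete set to true.  If one "end" marker never arrives, complete stays false, and a real puller blocking on recv() would wait forever, which is why a dropped "end" message is so visible in this design.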

I'm having a problem where the "end" messages are occasionally dropped (and perhaps other messages as well).  However, if I put a "sleep(1)" immediately before the "close()" call, it works as expected, and all messages arrive.  I am using 0MQ 2.1.9 (and also tried 2.1.7) on Debian Squeeze.  I thought that the sleep before close was no longer necessary on 2.1.x.  The problem also seems to depend on the data set that the pushers are sending: sometimes it works without the sleep, and sometimes it does not.  Strangely, it is the *smaller* data set that causes problems.  I'd like to avoid the sleep call, because I want the pushers to go on to other work as soon as they finish sending data.

Unfortunately, I'm having trouble constructing a minimal test case, as it's a large system and the occurrence of the error seems to depend on the data being sent.  Here is the relevant code:

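    // Read back the socket's linger period before closing it.
    int linger;
    size_t l_size = sizeof(linger);
    sender.getsockopt(ZMQ_LINGER, &linger, &l_size);
    assert(l_size == sizeof(linger));
    cout << "closing sender, messages will linger for " << linger << " milliseconds\n";
    // The sleep(1)/close() pair described above; not shown in the original excerpt.
    sleep(1);
    sender.close();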

If the "sleep(1)" before "close()" is commented out, messages are dropped; with the sleep, they are not.  The "cout" prints -1 (linger indefinitely), as expected.

Is there any reason to expect this system to be dropping messages?
