[zeromq-dev] PUSH/PULL lost messages (once again)

Jean-François Smigielski jf.smigielski at gmail.com
Fri Mar 8 11:42:04 CET 2013


Dear 0MQ experts and users!

I need to process flows of events in a distributed one-way workflow
fashion, i.e. exactly what PUSH/PULL seems to be made for.

To perform some tests, I first tried to bind the PULL sides and then
connect the PUSH sides. This worked fine in the nominal case.

To check error cases, in the same test program (below) I deliberately
connected a PUSH socket to two TCP URLs, one valid and one that never
existed (and never will). This is my way of reproducing what happens if I
register the PULL sockets in a registrar like ZooKeeper and some PUSH
sockets are notified only after a PULL socket has disappeared. In my case,
the list of workers will change: they can move from one host to another,
as needed.

When pulling, I observe that I receive only ~50% of the messages sent, and
it seems that some messages were dispatched to the "unconnectable" socket.
If I do not set a LINGER value or do not explicitly disconnect the PUSH
socket, the context destruction hangs forever. If I do, messages are lost.

I am wondering why 0MQ attempts to dispatch messages to a socket that is
not really connected, and why those messages are not sent to another peer
that is actually connected. I am also wondering why the number of nearly
lost messages is not limited by the SNDHWM on this socket.

I guess I misunderstood something in the guide, leading me to an
unrealistic expectation about the behavior. But I would like confirmation.
If you know of a best practice, or a good link about this, I will gladly
take it.

Thanks a lot!
--
Jean-François SMIGIELSKI

-------8<-----------

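#include <assert.h>
#include <stdio.h>
#include <zmq.h>

/* push() and pull() are helpers of the test program (not shown here). */

int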
main(int argc, char **argv)
{
    int rc;
    assert(argc == 3);

    void *ctx = zmq_ctx_new(); assert(ctx != NULL);

    void *sin = zmq_socket(ctx, ZMQ_PULL); assert(sin != NULL);
    rc = zmq_bind(sin, argv[1]); assert(rc == 0);

    void *sout = zmq_socket(ctx, ZMQ_PUSH); assert(sout != NULL);
    rc = zmq_connect(sout, argv[1]); assert(rc == 0);
    rc = zmq_connect(sout, argv[2]); assert(rc == 0);

    // tested here: RCVHWM + SNDHWM + LINGER

    for (int i=0; i<5 ;i++) { push(sout); }
    for (int i=0; i<30 ;i++) { pull(sin); }

    // tested here: zmq_disconnect(sout, ...)

    zmq_close(sout);
    zmq_close(sin);
    zmq_ctx_destroy(ctx);
    fprintf(stderr, "Never seen due to LINGER mechanism\n");
    return 0;
}