[zeromq-dev] Pipeline Reliability
Mark Botner
mbotner at gmail.com
Tue Jul 3 17:35:31 CEST 2018
I wonder if the default setting for ZMQ_LINGER is causing the zmq_close()
to block since there are unsent messages? From the ref. guide:
The default setting of *ZMQ_LINGER* does not discard unsent messages; this
behaviour may cause the application to block when calling *zmq_ctx_term()*.
Mark
On Tue, Jul 3, 2018 at 9:08 AM, Charles Bouillaguet <
charles.bouillaguet at gmail.com> wrote:
> Dear zeromq'ers,
>
> I'm facing a reliability problem that I couldn't solve by myself so far.
>
> I have two machines running two asymmetric programs. Machine A creates a
> PULL socket and binds it. Machine B creates a PUSH socket and connects it
> (to the PULL socket of machine A), using the TCP transport. Machine B then
> sends messages like crazy (about 500/s). Basically, B is a low-cost device
> equipped with sensors and A is a server that just stores the data.
>
> This works like a charm... until the inevitable happens: some network event
> occurs, and the messages cannot be transmitted from machine B to machine A.
>
> With a blocking send, the process on machine B then gets stuck in
> zmq_send() once the high water mark is reached, and the whole pipeline
> grinds to a halt.
>
> To avoid this, I tried the "Lazy Pirate Pattern". I use something like:
>
>     if (-1 == zmq_send(socket, msg, size, ZMQ_DONTWAIT)) {
>         if (errno == EAGAIN) {
>             zmq_close(socket);
>             socket = zmq_socket(context, ZMQ_PUSH);
>             zmq_connect(socket, address);
>         }
>     }
>
> I don't care if I lose some messages. What I don't want is the pipeline to
> stop forever.
>
> At first, this seems to work as intended. When the network is down, the
> program actually closes and re-creates the socket; the call to
> zmq_connect() succeeds... but the messages are still not sent, and the
> process on machine B ends up in a loop where it fills the ZMQ buffers,
> destroys the socket, re-creates it, re-connects, rinse, repeat. I observed
> the loop for several hours.
>
> Just stopping the UNIX process and re-starting it solved the problem
> (i.e. messages get transmitted normally, instantaneously).
>
> Is there something I am doing wrong? What are my options to avoid this
> problem? [I can consider moving away from ZMQ to nanomsg or nng].
>
> Thanks,
> --
> Charles BOUILLAGUET
> Université de Lille - Sciences et Technologies
> charles.bouillaguet at univ-lille1.fr | www.univ-lille1.fr
> Laboratoire CRIStAL - Bât M3 - Bureau 332 - 59655 Villeneuve d'Ascq
> Tél. +33 (0)3 28 77 85 84
> homepage: http://cristal.univ-lille.fr/~bouillag/