[zeromq-dev] Router - router communication

Pieter Hintjens ph at imatix.com
Sun Mar 27 10:32:05 CEST 2016


Sorry for the slow answer on this.

The problem is that when one end exits, the pipe that the other end
holds for that connection is not destroyed. So if you send a message
again, it goes into that pipe and is effectively lost.

You will need to build a more explicit protocol, using dealer-router
sockets. The standard pattern is one router talking to N dealers, with
state for each dealer. You can see examples of this in the Guide (e.g.
FileMQ). You can then handshake a closing connection, and be
absolutely sure you never send messages to a peer that is gone. You
can go further and acknowledge messages, either individually, or in
batches.
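
As a rough illustration of the per-dealer state, closing handshake, and
batched acknowledgements described above, here is a minimal sketch in
Python. It is transport-agnostic: it only models the bookkeeping on the
router side and returns the frames that would be sent over the socket.
The command names (HELLO, GOODBYE, ACK, DATA), the frame layout, and the
class names are assumptions made for this example; they are not taken
from FileMQ or from any ZeroMQ API.

```python
from dataclasses import dataclass, field

# Hypothetical command names; no wire format is defined in this thread.
HELLO, GOODBYE, ACK, DATA = b"HELLO", b"GOODBYE", b"ACK", b"DATA"


@dataclass
class Peer:
    """State the router side keeps for one dealer."""
    next_seq: int = 0                            # sequence number of the next DATA message
    unacked: dict = field(default_factory=dict)  # seq -> payload still awaiting an ACK


class RouterState:
    """Tracks which dealers are known and which messages they have confirmed."""

    def __init__(self):
        self.peers = {}  # dealer identity (bytes) -> Peer

    def on_frames(self, identity, command, arg=b""):
        """Handle one incoming message [identity, command, arg].

        Returns the frames to send back, or None if no reply is needed.
        """
        if command == HELLO:
            self.peers[identity] = Peer()
            return [identity, HELLO]
        if command == GOODBYE:
            # Closing handshake: forget the peer first, then confirm so it can exit.
            self.peers.pop(identity, None)
            return [identity, GOODBYE]
        if command == ACK and identity in self.peers:
            # Batched ACK: everything up to and including seq is confirmed.
            seq = int(arg)
            peer = self.peers[identity]
            for s in [s for s in peer.unacked if s <= seq]:
                del peer.unacked[s]
        return None

    def send(self, identity, payload):
        """Build a DATA message for a known peer; refuse if the peer is gone."""
        peer = self.peers.get(identity)
        if peer is None:
            return None  # the caller finds out now instead of losing the message
        seq = peer.next_seq
        peer.next_seq += 1
        peer.unacked[seq] = payload
        return [identity, DATA, str(seq).encode(), payload]
```

In a real node the returned frame lists would be passed to a multipart
send on the router socket, and anything left in unacked when a peer
stops responding could be resent or reported to the application.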

-Pieter

On Mon, Mar 21, 2016 at 6:02 PM, pulkit <pulkit at cs.duke.edu> wrote:
> Hi,
>
> I am building a Replicated State Machine (RSM) protocol and using
> ZeroMQ's router sockets for NxN communication between the nodes. All the
> nodes have pre-determined identities, so I use 1 router socket per node
> for all communication. Regarding the version, I am using ZeroMQ version
> 4.2.0 and Ubuntu 14.04.4 OS.
>
> I am having trouble with lost messages during failover. It doesn't
> always happen, but a few times when I bring down a primary node (a clean
> exit which unbinds and disconnects from all endpoints), the backup nodes
> are unable to talk to each other to initiate the failover. A zmq_msg_send
> call at the source doesn't return any error but the destination never
> receives the message. I would appreciate any insight on what could be
> going wrong.
>
> -Pulkit
