[zeromq-dev] Server Socket Initiate Communication

Bill Torpey wallstprog at gmail.com
Sat Feb 10 18:26:34 CET 2018

Let me see if I can explain this from the perspective of pub/sub messaging, as that is what I am familiar with.

With pub/sub messaging “clients” subscribe to topics and “servers” publish messages on those topics.  (The use of “client” and “server” here is somewhat arbitrary, but I believe it’s in line with the way most people think of these roles).

As of ZeroMQ version 3, filtering of pub/sub messages is done by the publisher.  (Prior to that, filtering was done by the subscriber, i.e., “servers” published everything to all “clients”.  As you may imagine, that turned out to be inefficient in the case where a particular subscriber was only interested in a small subset of messages).
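
To make the difference concrete, here is a minimal sketch in plain Python (no ZeroMQ calls). It assumes only that a subscription is a prefix match on the message bytes, which is how SUB sockets filter. The names matches and messages_on_wire and the sample topics are made up for illustration.

```python
def matches(subscriptions, message):
    """Return True if the message starts with any subscribed prefix."""
    return any(message.startswith(prefix) for prefix in subscriptions)


def messages_on_wire(messages, subscriptions, filter_at_publisher):
    """Count the messages that cross the network to a single subscriber."""
    if filter_at_publisher:
        # Publisher-side filtering: only matching messages are sent.
        return sum(1 for m in messages if matches(subscriptions, m))
    # Subscriber-side filtering: everything is sent, then discarded locally.
    return len(messages)


# Hypothetical sample data.
messages = [
    b"weather.nyc 41F",
    b"stocks.IBM 150.2",
    b"stocks.AAPL 156.4",
    b"sports.nba final",
]
subscriptions = {b"stocks.IBM"}

# The subscriber sees the same messages either way...
delivered = [m for m in messages if matches(subscriptions, m)]
# ...but the network traffic differs.
sent_subscriber_side = messages_on_wire(messages, subscriptions, False)
sent_publisher_side = messages_on_wire(messages, subscriptions, True)
```

The application-level result is identical in both modes; what changes is how much data is pushed to a subscriber that only wants a small subset.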

Where this gets confusing is when we consider which end of the conversation does the bind and which does the connect.   Theoretically it should make no difference, but because of an implementation choice in ZeroMQ, it in fact does make a difference, and here’s how.

When a subscriber connects to a publisher, the subscriber’s subscription information (the topics that it subscribes to) is sent to the publisher as part of the handshake that ZeroMQ uses to establish the connection.  So, as soon as the subscriber is connected, the publisher has all the information it needs to filter messages for that subscriber.

By contrast, when a publisher connects to a subscriber, an additional step is necessary: the initial handshake and connect takes place as above, and then a separate exchange of subscription information happens between the subscriber and the publisher.  The effect of this is that any messages published between those two points that would have been sent to the subscriber are not, because the publisher doesn’t (yet) know that the subscriber is interested in them.  In practice, this means that the subscriber may “miss” the first “n” messages that were published after the connection was established.
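
The window described above can be illustrated with a toy model. This is not how libzmq is implemented internally; ToyPublisher, run and the subscription_arrives_before parameter are hypothetical names, and the only assumption is that the publisher drops any message for which it has no matching subscription on record yet.

```python
class ToyPublisher:
    """Toy publisher that filters against the subscriptions it knows about."""

    def __init__(self):
        self.peers = {}  # peer name -> set of subscribed prefixes known so far

    def on_connected(self, peer):
        # Connection is up, but no subscription information has arrived yet.
        self.peers[peer] = set()

    def on_subscription(self, peer, prefix):
        self.peers[peer].add(prefix)

    def publish(self, message, inboxes):
        for peer, prefixes in self.peers.items():
            if any(message.startswith(p) for p in prefixes):
                inboxes[peer].append(message)
            # Otherwise the message is silently dropped for this peer.


def run(subscription_arrives_before):
    """Publish five messages; the subscription lands just before message N."""
    pub = ToyPublisher()
    inboxes = {"sub": []}
    pub.on_connected("sub")
    for i in range(5):
        if i == subscription_arrives_before:
            pub.on_subscription("sub", b"tick")
        pub.publish(b"tick %d" % i, inboxes)
    return inboxes["sub"]


# Subscription known as soon as the connection is up: nothing is missed.
immediate = run(0)
# Subscription arrives two messages late: "tick 0" and "tick 1" are lost.
delayed = run(2)
```

In the second run the subscriber has no indication that anything was lost; it simply sees the stream start at "tick 2".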

There is some discussion on GitHub (e.g., https://github.com/zeromq/libzmq/issues/2267) about calling zmq_poll on the subscriber to trigger the exchange of subscription information, but that suggestion is a red herring in most cases.  The subscriber needs to call zmq_poll after the publisher connects to it, but in most real-world cases the subscriber has no way to know when that happens.  The suggested workaround (https://gist.github.com/hintjens/7344533) is an artificial example where the publisher and subscriber are in the same process, but in the real world that is almost never the case.

Short version: if the subscriber connects to the publisher, then everything is fine.  

OTOH if the publisher connects to the subscriber, it is likely that the subscriber will “miss” some number of initial messages that it would reasonably expect to receive. In many situations that is not a big deal — the subscriber is just tapping into the stream of messages at a point somewhat later than it might expect.  

However, if it is a big deal for your application you’ll need to ensure that subscribers connect and publishers bind, or you’ll need to deal with the fact that your subscriber may miss some messages at connect time (e.g., by implementing some kind of application-specific handshaking on top of ZeroMQ).
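
One possible shape for such application-level handling is sequence numbers plus gap filling, sketched below in plain Python. This is only one of several approaches, and everything in it is hypothetical (SequencedPublisher, GapFillingSubscriber, the replay method). In a real system the replay request would travel over a separate channel (for example a request/reply socket pair) rather than a direct method call.

```python
class SequencedPublisher:
    """Stamps each message with a sequence number and keeps a history."""

    def __init__(self):
        self.history = []

    def publish(self, payload):
        seq = len(self.history)
        self.history.append(payload)
        return (seq, payload)

    def replay(self, first, last):
        """Return the messages with first <= seq < last."""
        return [(seq, self.history[seq]) for seq in range(first, last)]


class GapFillingSubscriber:
    """Detects a gap in sequence numbers and asks the publisher to fill it."""

    def __init__(self, publisher):
        self.publisher = publisher  # stand-in for a separate request channel
        self.next_seq = 0
        self.received = []

    def on_message(self, seq, payload):
        if seq > self.next_seq:
            # Messages next_seq..seq-1 were missed; fetch them first.
            self.received.extend(self.publisher.replay(self.next_seq, seq))
        if seq >= self.next_seq:
            self.received.append((seq, payload))
            self.next_seq = seq + 1
        # seq < next_seq means a duplicate, which is ignored.


pub = SequencedPublisher()
sub = GapFillingSubscriber(pub)

pub.publish("a")                      # published before the subscription was known
pub.publish("b")                      # likewise never delivered
sub.on_message(*pub.publish("c"))     # first message the subscriber actually sees
sub.on_message(*pub.publish("d"))
```

After the first delivered message the subscriber notices it expected sequence 0 but got 2, fetches the two missed messages, and ends up with the complete stream in order. The cost is that the publisher has to retain history, so a real design would need to bound it.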

There’s been discussion off and on about fixing this in the ZeroMQ code, but it sounds like a major change and so far no one has stepped up to the plate.

Hope this helps...

> On Feb 10, 2018, at 10:29 AM, Matthew Harrigan <harrigan.matthew at gmail.com> wrote:
> I apologize in advance if this is a dumb question, but would still appreciate knowing why. So, why can't a server socket initiate communication? I am focused on this documentation http://api.zeromq.org/4-2:zmq-socket. AFAICT the only state change that happens in a server socket when a client socket connects to it is that the map that relates addresses to routing IDs gets updated. Suppose a machine with a server socket knows what address it wants to communicate with, whether hard coded, from a beacon, service locator, whatever. Why can't it just tell the server socket to make a routing ID for that address and then start communicating? Note I am working on a toy educational project where a gaggle of peers uses Raft to get a leader and formally join or leave (different from crashing or network partitions). In such a peer-to-peer network the symmetry of each peer binding a server socket to itself and talking to other peers seems appealing. Thank you
