[zeromq-dev] PUB/SUB missed initial message even using monitor events (possibly just #2267, but not sure)

Bill Torpey wallstprog at gmail.com
Tue Feb 7 22:38:55 CET 2023

Hey Jason:

Btw, I’m assuming connection-oriented (e.g., TCP) transport here.  Semantics could be very different w/other mechanisms.

> On Feb 6, 2023, at 8:10 PM, Jason Heeris <jason.heeris at gmail.com> wrote:
> Because my pub sockets always bind, and my sub sockets always
> connect.

Then everything should be fine (but it’s not).

> But in these tests, the "sub" process makes its Socket::connect() call
> earlier in time than the "pub" process makes its Socket::bind() call, to the
> same endpoint.

So, the initial connect attempt should fail, correct?

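I’m pretty sure that if the connect happens before the bind, then the underlying TCP connect attempt will simply fail with ECONNREFUSED, which indicates that nothing is listening on the specified port.  (The zmq connect call itself returns right away; the actual connection is made in the background.)  By default, zmq will keep retrying the connection to that port every 100ms (the ZMQ_RECONNECT_IVL default).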
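To make the refuse-and-retry sequence concrete, here is a minimal sketch using plain TCP sockets from the Rust standard library.  This is not libzmq code and not your test code: connect_with_retry is a hypothetical helper, and the fixed 100ms interval only mirrors the default reconnect interval mentioned above.  It assumes a loopback port with no listener refuses the connection immediately, which is the usual behaviour on Linux.

```rust
use std::io::ErrorKind;
use std::net::{TcpListener, TcpStream};
use std::thread;
use std::time::Duration;

/// Hypothetical helper: keep trying to connect to `addr`, sleeping `interval`
/// after every refused attempt, for at most `max_attempts` attempts in total.
/// On success, returns the stream plus the number of refused attempts.
fn connect_with_retry(
    addr: &str,
    interval: Duration,
    max_attempts: u32,
) -> std::io::Result<(TcpStream, u32)> {
    let mut refused = 0;
    loop {
        match TcpStream::connect(addr) {
            Ok(stream) => return Ok((stream, refused)),
            // ECONNREFUSED: nothing is listening on the port yet, so retry.
            Err(e) if e.kind() == ErrorKind::ConnectionRefused && refused + 1 < max_attempts => {
                refused += 1;
                thread::sleep(interval);
            }
            // Any other error, or attempts exhausted: give up.
            Err(e) => return Err(e),
        }
    }
}

fn main() -> std::io::Result<()> {
    // Pick a free loopback port, then release it so nothing is listening there.
    let addr = {
        let probe = TcpListener::bind("127.0.0.1:0")?;
        probe.local_addr()?.to_string()
    };

    // The "pub" side binds late, after the "sub" side has started connecting.
    let bind_addr = addr.clone();
    let server = thread::spawn(move || -> std::io::Result<()> {
        thread::sleep(Duration::from_millis(350));
        let listener = TcpListener::bind(&bind_addr)?;
        listener.accept().map(|_| ())
    });

    // The "sub" side connects early and retries every 100ms until the bind happens.
    let (_stream, refused) = connect_with_retry(&addr, Duration::from_millis(100), 50)?;
    println!("connected after {refused} refused attempt(s)");

    server.join().expect("server thread panicked")?;
    Ok(())
}
```

The point of the sketch is only that the early connect is not lost: it is refused and retried until the bind appears.  Anything the publisher sends before one of those retries succeeds (and before the subscription has been passed along) has no peer to go to.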

It looks like your code calls socket_monitor *after* the bind/connect calls.  It’s better to start monitoring immediately after creating the socket, so that you see every event generated by the connect/bind calls.  I’m not a “rustacean” myself, but given the way you sequence the calls to monitor, it looks like you’re missing some events.

You could also try using zmtpdump (https://github.com/zeromq/zmtpdump) to see what is going on at the packet level.  It hasn’t been updated in a while, but I still find it useful.  Or use tcpdump and wireshark (which has a ZMTP decoder).

BTW, I know this doesn’t answer your question as to why this is happening, but a very helpful feature in zmq is the “welcome” msg: see here (https://web.archive.org/web/20160208000728/http://somdoron.com/2015/09/reliable-pubsub/) and here (https://github.com/somdoron/ReliablePubSub).  OZ uses this to know for sure when a sub is connected to a pub.  You might also find some of this info helpful: https://github.com/nyfix/OZ/blob/master/doc/Reconnects-Heartbeats.md.

Hope this helps.


