[zeromq-dev] Question about zmq_socket_monitor()

Luca Boccassi luca.boccassi at gmail.com
Mon Jan 7 13:17:52 CET 2019


On Mon, 2019-01-07 at 02:49 +0000, Yan, Liming (NSB - CN/Hangzhou)
wrote:
> Hi,
>   I also got this error some months ago. I’m indeed using a monitor
> socket to maintain the node state by monitoring connect/disconnect
> events, although the author said ‘that socket monitoring was intended
> to be used for logging/troubleshooting, not for state/control
> flow.’  These 3 steps are similar to what I have done.
> 1) zmq_socket_monitor(my_xsub_sock, "inproc://starship-enterprise",
> ZMQ_EVENT_ALL);
> 2) zmq_socket_monitor(my_xsub_sock, NULL, 0);
> 3) zmq_socket_monitor(my_xsub_sock, "inproc://starship-enterprise",
> ZMQ_EVENT_ALL);
>   As described, the close() is asynchronous.  The pair socket is not
> freed yet. I think it needs some event to trigger the release in
> process_command().  Just try to send an empty message to the parent
> socket (here it’s the XPUB) after step 2. I did this and it seems to
> have solved the issue.  Of course you need to drop the empty message
> on the XPUB somewhere.
>    BTW, step 2 is not officially documented. I didn’t see a
> description such as ‘With endpoint NULL, it will stop the previous
> monitor socket anyway’. I just checked the code and found this.  Not
> sure if this is officially supported?
> int zmq::socket_base_t::monitor (const char *addr_, int events_)
> {
>     //  Support deregistering monitoring endpoints as well
>     if (addr_ == NULL) {
>         stop_monitor ();
>         return 0;
>     }
> 
> BR
> Yan Limin
> 
> From: zeromq-dev [mailto:zeromq-dev-bounces at lists.zeromq.org] On
> Behalf Of Bill Torpey
> Sent: Friday, December 14, 2018 2:24 AM
> To: ZeroMQ development list <zeromq-dev at lists.zeromq.org>
> Subject: Re: [zeromq-dev] Question about zmq_socket_monitor()
> 
> In my experience, ZMQ_PAIR sockets exhibit some problems that I
> previously documented here:
> https://github.com/zeromq/libzmq/issues/2759
> specifically around disconnecting and reconnecting.
> 
> In particular, see this reply from Simon:
> https://github.com/zeromq/libzmq/issues/2759#issuecomment-389057453
> 
> I now think that this might be problematic because disconnect works
> asynchronously, and a new connect would be allowed only after the
> disconnect completed. You could add a socket monitor to wait for the
> ZMQ_EVENT_DISCONNECTED event to synchronize this.
> 
> This jibes with Luca’s explanation.
> 
> (The bit about using socket monitor is of course not going to help in
> this case, since you’re trying to troubleshoot the socket monitor
> itself ;-)  On top of that, I’ve had conversations with the original
> implementor of socket monitoring, who has said that socket monitoring
> was intended to be used for logging/troubleshooting, not for
> state/control flow.  This makes sense when you understand that
> monitor events are themselves asynchronous with respect to the
> underlying socket operations — for instance see the caveats in the
> doc:
> ZMQ_EVENT_CONNECTED
> The socket has successfully connected to a remote peer. The event
> value is the file descriptor (FD) of the underlying network socket.
> Warning: there is no guarantee that the FD is still valid by the time
> your code receives this event.
> I don’t know what you’re trying to accomplish with socket monitor,
> but it may not be the best choice.
> 
> 
> On Dec 12, 2018, at 8:50 AM, Luca Boccassi
> <luca.boccassi at gmail.com> wrote:
> 
> On Tue, 2018-12-11 at 22:07 +0000, Martin.Belanger at dell.com wrote:
> 
> Hi,
> 
> I'm experimenting with zmq_socket_monitor(). I have a XSUB socket
> that I'm monitoring. The monitoring works fine. I'm just trying to
> understand how I can enable/disable monitoring back-and-forth by
> calling zmq_socket_monitor().
> 
> I tried to enable/disable/re-enable monitoring by invoking
> zmq_socket_monitor() as shown below, but I get the error "Address
> already in use" on the 3rd invocation.
> 
> 1) zmq_socket_monitor(my_xsub_sock, "inproc://starship-enterprise",
> ZMQ_EVENT_ALL);
> 2) zmq_socket_monitor(my_xsub_sock, NULL, 0);
> 3) zmq_socket_monitor(my_xsub_sock, "inproc://starship-enterprise",
> ZMQ_EVENT_ALL); <- ERROR
> 
> I followed the source code for zmq_socket_monitor() to a method
> called zmq::socket_base_t::monitor(). That function eventually calls
> zmq_bind() where I believe the error occurs.
> 
> 1) The first time zmq_socket_monitor() is called a ZMQ_PAIR socket is
> created (_monitor_socket) and bound to address
> "inproc://starship-enterprise". Note that this socket is created with
> ZMQ_LINGER=0.
> 
> 2) The second time zmq_socket_monitor() is invoked, to disable
> monitoring, _monitor_socket is closed, and because ZMQ_LINGER=0 it
> should go out of existence right away. Right?
> 
> 3) The third time zmq_socket_monitor() is called I get "Address
> already in use" as if the old socket is still there. How can that be?
> 
> Regards,
> Martin
> 
> Without looking with gdb, my best guess is a race: the close is
> asynchronous and non-blocking, so the third call might try to re-use
> the same endpoint that is still technically in use.
> IIRC inproc socket events are processed from the "connecting"
> socket's thread, so try to run something (e.g. zmq_getsockopt with
> ZMQ_EVENTS) on the connecting inproc socket before the third call and
> see if that helps.

It's not documented because, as you've seen, there are some issues with
doing that. Your guess is indeed correct.

The simplest and most fool-proof solution is to use a different
endpoint name. Or simply expect the possible error and retry later.
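
As a rough illustration of those two options, here is a minimal sketch
in C. The helper names (next_monitor_endpoint, start_monitor,
fake_monitor) are hypothetical and not part of libzmq. The only thing
taken from libzmq is the shape of zmq_socket_monitor(socket, endpoint,
events), and the sketch assumes a failed re-bind reports EADDRINUSE
("Address already in use"), as seen in this thread. fake_monitor is a
stand-in so the sketch compiles without libzmq.

```c
#include <errno.h>
#include <stdio.h>

/* Same shape as zmq_socket_monitor(void *socket, const char *endpoint,
 * int events), so the real function can be passed in directly. */
typedef int (*monitor_fn)(void *socket, const char *endpoint, int events);

/* Hypothetical helper: writes "inproc://<base>-<n>" into buf with a fresh
 * n on every call, so a re-enabled monitor never reuses an endpoint whose
 * old socket may still be closing. Not thread-safe (static counter).
 * Returns 0 on success, -1 if buf is too small. */
static int next_monitor_endpoint(char *buf, size_t len, const char *base)
{
    static unsigned long counter = 0;
    int n = snprintf(buf, len, "inproc://%s-%lu", base, ++counter);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: tries to start a monitor up to max_attempts times,
 * choosing a new endpoint name after each EADDRINUSE failure. On success
 * returns 0 and leaves the endpoint that worked in `endpoint`. */
static int start_monitor(monitor_fn monitor, void *socket, int events,
                         const char *base, int max_attempts,
                         char *endpoint, size_t len)
{
    for (int i = 0; i < max_attempts; i++) {
        if (next_monitor_endpoint(endpoint, len, base) != 0) {
            errno = ENAMETOOLONG;
            return -1;
        }
        if (monitor(socket, endpoint, events) == 0)
            return 0;
        if (errno != EADDRINUSE)
            return -1;          /* unrelated error: give up */
    }
    return -1;                  /* still "in use" after all attempts */
}

/* Stand-in for zmq_socket_monitor, for trying the sketch without libzmq:
 * fails twice with EADDRINUSE, then succeeds. */
static int fake_failures_left = 2;

static int fake_monitor(void *socket, const char *endpoint, int events)
{
    (void)socket; (void)endpoint; (void)events;
    if (fake_failures_left > 0) {
        fake_failures_left--;
        errno = EADDRINUSE;
        return -1;
    }
    return 0;
}
```

With libzmq, the call would presumably look like
start_monitor(zmq_socket_monitor, my_xsub_sock, ZMQ_EVENT_ALL,
"starship-enterprise", 3, endpoint, sizeof endpoint), after which the
monitoring ZMQ_PAIR socket connects to whatever name ended up in
endpoint.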

Also I _think_ running getsockopt once or twice should be enough to let
the application thread process commands, rather than sending messages.
But your mileage might vary.

-- 
Kind regards,
Luca Boccassi