[zeromq-dev] Questions about socket errors

Brian Granger ellisonbg at gmail.com
Wed Feb 10 23:48:33 CET 2010


Mato,

Thanks for the quick reply.  It helps to understand the overall philosophy.

> Actually, this behaviour is normal and is not unique to the Python
> bindings.
>
> Calling zmq_connect() (which is what s.connect() does) just means "please
> try to connect asynchronously, now or later". You will only get an error if
> the endpoint is *invalid* (e.g. Host doesn't resolve, etc.), not if the
> other end is not present.

OK, that makes sense - different from what I am used to but probably OK...
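
For reference, the connect behaviour described above can be checked with a
minimal sketch. It assumes the pyzmq bindings (import zmq); the helper name
try_connect and port 5999 are made up for illustration, and nothing is
assumed to be listening on that port. An unsupported transport stands in for
the "invalid endpoint" case, since whether an unresolvable host name is
rejected at connect time may depend on the library version.

```python
import zmq


def try_connect(endpoint):
    """Return None if connect() was accepted, else the error text.

    Hypothetical helper for illustration; not part of pyzmq.
    """
    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUSH)
    try:
        sock.connect(endpoint)  # only queues an asynchronous connection attempt
        return None
    except zmq.ZMQError as exc:
        return str(exc)
    finally:
        sock.close(linger=0)  # do not wait for anything on shutdown
        ctx.term()


# No listener is needed on this port for connect() to be accepted.
print(try_connect("tcp://127.0.0.1:5999"))
# An unsupported transport is rejected immediately.
print(try_connect("bogus://127.0.0.1:5999"))
```

The first call reports no error even though no peer is present; only the
malformed endpoint is refused.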

> Same goes for recv/send -- 0MQ does autoreconnect and both recv/send are
> entirely asynchronous. So if the other end goes away your data will get
> sent once it comes back.

What if the other end never comes back?  Is there a way of clearing the
queue of messages that would have been delivered to that endpoint?  I
guess it depends on the type of socket, right?  I would imagine that a
socket type that distributes round-robin to a set of endpoints would
just skip any endpoint that disconnects?  What about request/reply
queues or multicast?

> We realise that there are many use cases where people do want to know if
> a peer is present at least for those transports where it makes sense but
> the implications of doing this properly (which means e.g. synchronous
> zmq_send() which defeats queueing and batching, etc.) need more thought.

OK, I think I see why you are thinking of a synchronous send now.  This
is pretty subtle, as we definitely want things to be asynchronous.  What
I am thinking of is a sort of "delivery confirmation" that is itself
asynchronous.  Imagine send having a callback that would be called upon
message delivery or failure.  Or it could return an object like a
Deferred
(http://twistedmatrix.com/documents/current/core/howto/defer.html).
At some level, the underlying networking code does hit errors in these
cases, and those errors are asynchronous.  The question is how to
represent them in the calling code (the code that does recv/send).
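
A rough sketch of what such an asynchronous delivery confirmation could look
like follows. This is not a 0MQ API: the class ConfirmingSender, its methods,
and the idea that the peer acknowledges each message by id are all
assumptions made for illustration. It uses concurrent.futures.Future from the
Python standard library in place of a Twisted Deferred.

```python
import time
from concurrent.futures import Future


class ConfirmingSender:
    """Hypothetical wrapper: send() returns a Future per message."""

    def __init__(self, transport_send, timeout=5.0, clock=time.monotonic):
        self._transport_send = transport_send  # callable(msg_id, payload)
        self._timeout = timeout                # seconds to wait for an ack
        self._clock = clock
        self._pending = {}                     # msg_id -> (Future, deadline)
        self._next_id = 0

    def send(self, payload):
        """Hand the message to the transport and return immediately."""
        msg_id = self._next_id
        self._next_id += 1
        fut = Future()
        self._pending[msg_id] = (fut, self._clock() + self._timeout)
        self._transport_send(msg_id, payload)
        return fut

    def on_ack(self, msg_id):
        """Call when the peer acknowledges msg_id (assumed protocol)."""
        entry = self._pending.pop(msg_id, None)
        if entry is not None:
            entry[0].set_result(msg_id)

    def expire(self):
        """Fail every pending message whose deadline has passed."""
        now = self._clock()
        for msg_id, (fut, deadline) in list(self._pending.items()):
            if now >= deadline:
                del self._pending[msg_id]
                fut.set_exception(
                    TimeoutError(f"message {msg_id} not confirmed"))
```

The caller stays asynchronous: it can attach fut.add_done_callback(...) to be
notified of delivery or failure, or poll fut.done() from its own event loop.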

> I would suggest implementing a "ping" function at the application level.
> Send a message every X seconds and if you don't get a reply within Y
> seconds then take evasive action.

Yes, with low latency, this might be a great option.  But there still
has to be some way of handling messages that won't ever be sent because
the receiving endpoint has truly died.
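
The ping idea and the "evasive action" could be combined at the application
level roughly as below. This is only a sketch under stated assumptions: the
class PeerMonitor and its method names are hypothetical, the outbox is a
queue kept by the application (not 0MQ's internal queue), and X and Y from
the suggestion above appear as ping_interval and reply_timeout.

```python
import time
from collections import deque


class PeerMonitor:
    """Hypothetical application-level liveness tracker for one peer."""

    def __init__(self, ping_interval, reply_timeout, clock=time.monotonic):
        self.ping_interval = ping_interval  # X: seconds between pings
        self.reply_timeout = reply_timeout  # Y: seconds to wait for a reply
        self._clock = clock
        self._last_ping = None              # time the last ping was sent
        self._awaiting_since = None         # set while a ping is unanswered
        self.outbox = deque()               # messages not yet handed to 0MQ
        self.alive = True

    def queue(self, msg):
        self.outbox.append(msg)

    def should_ping(self):
        """True when no ping is outstanding and the interval has elapsed."""
        if self._awaiting_since is not None:
            return False
        if self._last_ping is None:
            return True
        return self._clock() - self._last_ping >= self.ping_interval

    def ping_sent(self):
        now = self._clock()
        self._last_ping = now
        self._awaiting_since = now

    def pong_received(self):
        self._awaiting_since = None
        self.alive = True

    def check(self):
        """If the reply is overdue, mark the peer dead and drain the outbox.

        Returns the dropped messages so the caller can reroute or log them.
        """
        if (self._awaiting_since is not None
                and self._clock() - self._awaiting_since > self.reply_timeout):
            self.alive = False
            self._awaiting_since = None
            dropped = list(self.outbox)
            self.outbox.clear()
            return dropped
        return []
```

The application would call should_ping() and check() from its main loop. This
only covers messages the application has not yet passed to send(); anything
already queued inside 0MQ is outside its reach, which is the gap raised above.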

> Oh, and yes, this needs to be explained much better in the documentation.
> I'm working on that...

Great!

Cheers,

Brian

> Cheers,
>
> -mato
>



-- 
Brian E. Granger, Ph.D.
Assistant Professor of Physics
Cal Poly State University, San Luis Obispo
bgranger at calpoly.edu
ellisonbg at gmail.com


