[zeromq-dev] Handling disconnections; was:Questions about socket errors

Martin Sustrik sustrik at 250bpm.com
Thu Feb 11 19:35:23 CET 2010

Hi Brian,

> Thanks, replies inline below...
>> Exactly. Specifying message type determines the algorithm used to handle
>> broken connections. In case of PUB/SUB the messages are simply dropped once
>> the queue overflows. In case of REQ/REP inaccessible connections are
>> skipped. Once there's no accessible connection and queue limits are reached,
>> the send function will block.
> OK, this clarifies the behavior for the different types of sockets.
> But isn't it a bit dangerous to have send block when the queue fills up?
> I would think there needs to be some way for the application logic to
> learn about this and remedy the situation.

There is a way. Do a non-blocking send. If the queue is full, it'll 
return EAGAIN.
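
As a rough illustration of that pattern, here is a minimal Python sketch. It
uses a plain OS socket pair from the standard library rather than the 0MQ API,
and the helper names (try_send, fill_until_full) are made up for this example.
The point is only that a non-blocking send against a full queue fails
immediately with EAGAIN, so the application gets to decide what happens next.

```python
import errno
import socket


def try_send(sock, payload):
    """Attempt a non-blocking send on a plain OS socket (not a 0MQ socket).

    Returns True if the OS accepted at least part of the payload, and
    False if the outgoing buffer is full (EAGAIN / EWOULDBLOCK).
    """
    try:
        sock.send(payload)
        return True
    except BlockingIOError as exc:
        # Queue is full: report it instead of blocking the caller.
        if exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return False
        raise


def fill_until_full(chunk=b"x" * 4096, limit=100_000):
    """Send to a peer that never reads until a send would block.

    Returns the number of chunks accepted before the buffer filled up.
    """
    sender, receiver = socket.socketpair()
    sender.setblocking(False)
    try:
        accepted = 0
        while accepted < limit and try_send(sender, chunk):
            accepted += 1
        return accepted
    finally:
        sender.close()
        receiver.close()
```

When try_send returns False the application can drop the message, retry
later, or raise an alert, rather than being stuck inside a blocking call.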

>>> I would imagine that a socket type that round-robin distributes to a
>>> set of endpoints would just skip any endpoint that disconnects? What
>>> about reply/request queues or multicast?
>> REQ/REP doesn't work over multicast right now. I haven't seen a compelling
>> use case for the functionality by the way. If you have one, please do share
>> it.
> Sorry, I didn't mean REQ/REP over multicast.  I don't currently see a
> use case for this.
>> 1. It should be made clear what 'disconnection' means. On the networking
>> level there are no disconnections. There are only packets either getting
>> through or not getting through. Disconnection can mean various things:
>> a.) I've sent a packet and haven't got ACK for N seconds.
>> b.) I've sent a message and haven't got ACK for N seconds.
>> c.) I've sent a message and the peer application hasn't acknowledged that
>> it has processed the transaction for N seconds.
>> d.) There were no data received from the peer for N seconds (heartbeats).
>> etc.
> I am thinking more at the level of TCP, where the various socket calls
> can return a range of error codes that indicate something went wrong
> with the connection. Obviously zeromq is handling those error codes
> underneath it all.
>> 2. When should the disconnection notification be delivered?
>> a.) Immediately when it happens.
>> b.) It should be stored and delivered on next 0mq function call.
>> c.) It should be placed into the queue and delivered just after the last
>> message we've got before the disconnection.
> Another approach, other than "notification", would be to provide a set
> of functions for querying and manipulating the state of a queue. If
> application logic could see how many messages are queued and how long
> they have been queued, it could adjust how things are being handled.
> For example, if my application saw that messages with topic "foo" were
> not being recv'd by anyone, it could handle that situation. As it
> currently stands, the application doesn't really have any way of
> handling these types of things.

The current PUB/SUB pattern is based on the idea of a never-ending stream
of messages from the publisher. Individual subscribers join the stream,
receive messages, then leave the stream. In such a pattern, the
publisher is not interested in whether a particular message is delivered
to anyone, in the same way that a TV transmitter is not interested in
whether there's at least one TV set receiving the signal.
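
To make the analogy concrete, here is a toy model in Python. It is a sketch
only and not how 0MQ is implemented: ToyPublisher, its methods and the
queue_limit parameter are hypothetical names, and per-subscriber queues are
an assumption made for illustration. The publisher never learns whether
anyone received a message, and messages are simply dropped once a
subscriber's queue overflows.

```python
from collections import deque


class ToyPublisher:
    """Toy publisher: fire-and-forget delivery to whoever is subscribed."""

    def __init__(self, queue_limit):
        self.queue_limit = queue_limit
        self.subscribers = {}  # subscriber name -> queue of pending messages

    def subscribe(self, name):
        """Join the stream; returns the queue the subscriber reads from."""
        self.subscribers[name] = deque()
        return self.subscribers[name]

    def unsubscribe(self, name):
        """Leave the stream; the publisher does not care."""
        self.subscribers.pop(name, None)

    def publish(self, message):
        """Offer the message to every current subscriber.

        Returns nothing: the publisher gets no feedback about delivery.
        """
        for pending in self.subscribers.values():
            if len(pending) < self.queue_limit:
                pending.append(message)
            # Otherwise the queue has overflowed and the message is dropped.
```

Publishing with no subscribers at all is not an error in this model; the
message just goes nowhere, like a broadcast with no TV sets switched on.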

Maybe you have a different messaging pattern in mind? Describing the use 
case would be helpful.

>> 3. Each 0MQ socket handles N "connections". Supposing the connections are
>> anonymous, the disconnection notification would simply state "one of the
>> connections was broken", which is not of much use aside from keeping track
>> of the number of open connections. What's the use case here?
> I am coming from more of an RPC style of thinking, so thinking in terms
> of messages is different for me. In an RPC context, it is typically
> perfectly clear which connection was broken and when. It probably
> doesn't make sense to track individual connections being broken.

In current messaging systems this problem is being solved by a "dead
letter queue" - i.e. a queue where undeliverable messages are written.
Maybe that would be a good point to start thinking about disconnection issues?
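
A minimal sketch of the dead letter queue idea, in Python with the standard
library only. This is not a 0MQ feature; DeadLetterSender and its attributes
are hypothetical names, and "undeliverable" is taken here to mean "the
outbound queue is full", which is just one possible definition.

```python
import queue


class DeadLetterSender:
    """Bounded outbound queue with a dead letter queue for overflow."""

    def __init__(self, limit):
        self.outbound = queue.Queue(maxsize=limit)
        self.dead_letters = []  # messages that could not be queued

    def send(self, message):
        """Queue the message without blocking.

        Returns True if it was queued for delivery, False if it was
        written to the dead letter queue instead.
        """
        try:
            self.outbound.put_nowait(message)
            return True
        except queue.Full:
            self.dead_letters.append(message)
            return False
```

The application can then inspect dead_letters at its own pace and decide
whether to resend, log, or discard those messages.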

>> 4. With multicast transports, sender is not even aware of all the receivers
>> (though receiver is aware of the senders) and thus it is certainly not aware
>> of receiver "disconnections". How does this fit into a bigger picture?
>> 5. If there's a middlebox on the path from sender to receiver (say
>> zmq_forwarder) this way A->B->C, when does the disconnection have to be
>> reported to A? If the A-B connection breaks? What about a B-C disconnection?
>> It prevents passage of messages in the same way as an A-B disconnection
>> does. How should the event be passed back to A?
>> Overall, my feeling is that disconnection notifications are an inherently
>> flawed concept (please, do argue with the point).
> I need to think about this more, but I do agree that it is difficult
> to see how an actual notification mechanism would work in a messaging
> context. However, I think I have some use cases that are not covered
> fully by the current design. I will post a new thread describing some
> of these use cases.

Yes, please. Use cases are the most valuable contribution you can make!

