[zeromq-dev] Handling disconnections; was:Questions about socket errors
Martin Sustrik
sustrik at 250bpm.com
Thu Feb 11 19:35:23 CET 2010
Hi Brian,
> Thanks, replies inline below...
>
>> Exactly. Specifying message type determines the algorithm used to handle
>> broken connections. In case of PUB/SUB the messages are simply dropped once
>> the queue overflows. In case of REQ/REP inaccessible connections are
>> skipped. Once there's no accessible connection and queue limits are reached,
>> send function will block.
>
> OK, this clarifies the behavior for the different types of sockets.
> But, isn't it a bit
> dangerous to have send block when the queue fills up? I would
> think there needs to
> be some way for the application logic to learn about this and remedy
> the situation.
There is a way. Do a non-blocking send. If the queue is full, it'll
return EAGAIN.
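
As a rough illustration of that idiom, here is a minimal Python sketch. It
deliberately uses a plain socket pair from the standard library rather than
the 0MQ API, so it only shows the general pattern: keep sending without
blocking, and treat EAGAIN as the signal that the queue is full. The helper
name fill_until_eagain is made up for this example.

```python
import errno
import socket


def fill_until_eagain(sock, chunk=b"x" * 4096):
    """Send chunks on a non-blocking socket until the buffer is full.

    Returns (bytes_sent, errno_seen). This is an analogy built on plain
    sockets, not the 0MQ API; the function name is hypothetical.
    """
    sock.setblocking(False)
    sent = 0
    while True:
        try:
            sent += sock.send(chunk)
        except BlockingIOError as exc:
            # The queue is full: instead of blocking, the call fails and
            # the application decides what to do (drop, retry later, ...).
            return sent, exc.errno


if __name__ == "__main__":
    writer, reader = socket.socketpair()
    try:
        sent, err = fill_until_eagain(writer)
        print("queued", sent, "bytes before", errno.errorcode[err])
    finally:
        writer.close()
        reader.close()
```

The point of the design is that the caller, not the library, decides what a
full queue means: it can drop the message, retry later, or alert an operator.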
>>> I would imagine that a socket type that round robin distributes to a
>>> set of endpoints, would just skip
>>> any endpoint that disconnects? What about reply/request queues or
>>> multicast?
>> REQ/REP doesn't work over multicast right now. I haven't seen a compelling
>> use case for the functionality by the way. If you have one, please do share
>> it.
>
> Sorry, I didn't mean REQ/REP over multicast. I don't currently see a
> use case for this.
>
>> 1. It should be made clear what 'disconnection' means. On networking level
>> there are no disconnections. There are only packets either getting through
>> or not getting through. Disconnection can mean various things:
>>
>> a.) I've sent a packet and haven't got ACK for N seconds.
>> b.) I've sent a message and haven't got ACK for N seconds.
>> c.) I've sent a message and the peer application hasn't acknowledged that
>> it has processed the transaction for N seconds.
>> d.) There were no data received from the peer for N seconds (heartbeats).
>> etc.
>
> I am more thinking at the level of tcp, where the various socket calls
> can return a range
> of error codes that indicate something went wrong with the
> connection. Obviously
> zeromq is handling those error codes underneath it all.
>
>> 2. When should the disconnection notification be delivered?
>>
>> a.) Immediately when it happens.
>> b.) It should be stored and delivered on next 0mq function call.
>> c.) It should be placed into the queue and delivered just after the last
>> message we've got before the disconnection.
>
> Another approach other than "notification" would be to provide a set
> of functions
> for querying and manipulating the state of a queue. If application logic could
> see how many messages are queued and how long they have been queued,
> it could adjust how things are being handled.
>
> For example, if my application saw that messages with topic "foo" were not being
> recv'd by anyone, it could handle that situation. As it currently
> stands, the application
> doesn't really have any way of handling these types of things.
The current PUB/SUB pattern is based on the idea of never-ending stream
of messages from the publisher. Individual subscribers join the stream,
receive messages, then leave the stream. In such a pattern, the
publisher is not interested in whether a particular message is delivered
to anyone - the same way a TV transmitter is not interested in whether
there's at least one TV set receiving the signal.
Maybe you have a different messaging pattern in mind? Describing the use
case would be helpful.
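
The stream model described above can be sketched as a toy in-process
publisher in Python. This is only an illustration of the idea, not how 0MQ
is implemented: the class name ToyPublisher, the queue_limit parameter and
the per-subscriber queues are all assumptions made for the example.

```python
from collections import deque


class ToyPublisher:
    """In-process model of a publisher that never waits for subscribers."""

    def __init__(self, queue_limit):
        self.queue_limit = queue_limit   # hypothetical per-subscriber limit
        self.queues = {}                 # subscriber name -> pending messages
        self.dropped = 0

    def join(self, name):
        self.queues[name] = deque()

    def leave(self, name):
        self.queues.pop(name, None)

    def publish(self, message):
        # With no subscribers the message simply goes nowhere.
        for queue in self.queues.values():
            if len(queue) < self.queue_limit:
                queue.append(message)
            else:
                self.dropped += 1  # queue overflow: drop, never block


if __name__ == "__main__":
    pub = ToyPublisher(queue_limit=2)
    pub.publish("nobody hears this")
    pub.join("tv-1")
    for i in range(3):
        pub.publish(f"frame {i}")
    print(list(pub.queues["tv-1"]), "dropped:", pub.dropped)
```

Note that publish() never reports anything back to the caller, which mirrors
the TV transmitter analogy: the sender keeps going whether or not anyone is
listening.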
>> 3. Each 0MQ socket handles N "connections". Supposing the connections are
>> anonymous the disconnection notification would simply state "one of the
>> connections was broken" - which is not of much use aside from keeping track of
>> the number of open connections. What's the use case here?
>
> I am coming from more of an RPC style of thinking so thinking in terms
> of messages
> is different for me. In an RPC context, it is typically perfectly
> clear which connection
> was broken and when. It probably doesn't make sense to track
> individual connections
> being broken.
In current messaging systems this problem is being solved by a "dead
letter queue" - i.e. a queue where undeliverable messages are written.
Maybe that would be a good point to start thinking about disconnection issues?
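
A minimal sketch of the dead letter queue idea, in Python. Nothing here is a
0MQ feature or any real broker's API: the deliver function, the send callable
and the dead_letters container are hypothetical names chosen to show the
shape of the mechanism.

```python
from collections import deque


def deliver(messages, send, dead_letters):
    """Try to send each message; park the ones that fail.

    `send` is any callable that returns True on success. It and the
    other names here are placeholders, not part of any real broker API.
    """
    delivered = 0
    for message in messages:
        if send(message):
            delivered += 1
        else:
            dead_letters.append(message)  # keep it for later inspection
    return delivered


if __name__ == "__main__":
    dead = deque()
    # Pretend that only even-numbered messages can reach a peer.
    count = deliver(range(5), lambda m: m % 2 == 0, dead)
    print("delivered:", count, "dead letters:", list(dead))
```

The application can then inspect the dead letter queue at its own pace
instead of reacting to a per-connection notification.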
>> 4. With multicast transports, sender is not even aware of all the receivers
>> (though receiver is aware of the senders) and thus it is certainly not aware
>> of receiver "disconnections". How does this fit into a bigger picture?
>
>
>> 5. If there's a middlebox on the path from sender to receiver (say
>> zmq_forwarder) this way A->B->C, when does the disconnection have to be
>> reported to A? If A-B connection breaks? What about B-C disconnection? It
>> prevents passage of messages in the same way as A-B disconnection does. How
>> should the event be passed back to A?
>>
>> Overall, my feeling is that disconnection notifications are an inherently
>> flawed concept (please, do argue with the point).
>
> I need to think about this more, but I do agree that it is difficult
> to see how an actual
> notification mechanism would work in a messaging context. However, I think
> I have some use cases that are not covered fully by the current
> design. I will post
> a new thread describing some of these usage cases.
Yes, please. Use cases are the most valuable contribution you can make!
Martin