[zeromq-dev] Feedback on new PATCH socket

Martin Sustrik sustrik at 250bpm.com
Mon May 9 07:18:43 CEST 2011

Hi Fabien,

>> This won't work. The peer can be a device rather than an endpoint.
>> In such case you should expect to get arbitrary number of responses
>> from a single connection.
> My bad...  The code should loop on the RCVMORE condition and send a
> full message. Also, take note that it only discard a socket from the
> wait queue once a END signal is received, forwarding all preceding
> messages.

I meant an arbitrary number of messages rather than an arbitrary number of
message parts.

Imagine that the socket is connected to a device. The socket has no idea 
how many nodes are connected to the device. So it sends a request to the 
device which in turn forwards it to all the connected nodes. The replies 
from individual endpoints go back the same route to the original 
requester. So, if there are 10 nodes connected to the device you'll get 
10 replies on that connection. If there are 20 nodes, you'll get 20 replies.

Now consider that nodes can come and go at any time. There's no way to
find out the total number of repliers in advance.
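To make the point concrete, here is a toy simulation in plain Python. It
is not ZeroMQ code: the Device class and its attach/detach/request methods
are made up for illustration. It only shows that the number of replies
seen by the requester follows whatever is connected to the device at the
moment.

```python
class Device:
    """Hypothetical stand-in for a device that fans a request out to nodes."""

    def __init__(self):
        self.nodes = {}  # node name -> reply handler

    def attach(self, name, handler):
        self.nodes[name] = handler

    def detach(self, name):
        self.nodes.pop(name, None)

    def request(self, payload):
        # Forward the request to every connected node and pass all the
        # replies back over the single connection to the requester.
        return [handler(payload) for handler in self.nodes.values()]


def make_handler(name):
    # Each node answers with its own name prepended to the payload.
    return lambda payload: name + ":" + payload


device = Device()
for i in range(10):
    device.attach("node%d" % i, make_handler("node%d" % i))
replies_10 = device.request("ping")   # one reply per connected node

for i in range(10, 20):
    device.attach("node%d" % i, make_handler("node%d" % i))
replies_20 = device.request("ping")   # same connection, twice the replies

device.detach("node3")                # nodes can leave at any time
replies_19 = device.request("ping")
```

The requester only ever talks to the device, so nothing on its side
changes between the three calls apart from the reply count.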

If what you meant was propagating an "END" message gradually through the
topology, there's still the problem of dead/misbehaving nodes. If a node
receives a request and never sends an END message, it'll deadlock the
whole topology.

> Letting the endpoint socket set the timeout required the socket to
> know quite information about the topology, like how many messages it
> should expect.

Why so? You simply wait until the timeout, and whatever you have received
by then is considered the full reply set. Any subsequent replies are
discarded.
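A minimal sketch of that rule, using Python's standard queue.Queue as a
stand-in for the socket's incoming pipe (an assumption made to keep the
example self-contained; with a real socket you would poll with the
remaining time instead). The function name is hypothetical.

```python
import queue
import time


def collect_until_deadline(replies, timeout_s):
    """Return every reply that arrives before the deadline expires."""
    deadline = time.monotonic() + timeout_s
    collected = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # time is up: what we have is the full reply set
        try:
            collected.append(replies.get(timeout=remaining))
        except queue.Empty:
            break  # nothing more arrived before the deadline
    return collected


incoming = queue.Queue()
for reply in ("a", "b", "c"):
    incoming.put(reply)

reply_set = collect_until_deadline(incoming, timeout_s=0.05)

# A reply that shows up after the deadline is not part of the set.
incoming.put("late")
```

Note that the call always takes the full timeout, since there is no way
to tell that "c" was the last reply.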

>  If too long also, it will cap the socket capacity to
> this period.  It also can be done quite easily using a poller,
> without any modification.
> By putting the timeout in the polling of the PATCH socket, this will
> allow the sockets to go to their maximum capacity, ending a call as
> soon as it terminates, returning an error if no answer were return
> after a reasonable amount of time.

To end before the timeout you need to know how many replies you'll get,
which is not the case.

There's always the option of an optimisation: periodically distribute the
number of connected nodes at each branch of the graph, to be used as a
heuristic by the requester, with fallback to the timeout when the
heuristic turns out to be incorrect. The nasty thing about that model is
that repliers who send their reply in time may still get ignored because
of a broken heuristic.
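A rough sketch of the heuristic with timeout fallback, again with
queue.Queue standing in for the incoming pipe. The function name and the
expected_count parameter are hypothetical; how the count gets distributed
through the graph is left out entirely.

```python
import queue
import time


def collect_with_hint(replies, expected_count, timeout_s):
    """Stop at expected_count replies, or at the deadline if fewer arrive."""
    deadline = time.monotonic() + timeout_s
    collected = []
    while len(collected) < expected_count:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # hint was too high: fall back to the timeout
        try:
            collected.append(replies.get(timeout=remaining))
        except queue.Empty:
            break
    return collected


# Stale hint: three nodes replied in time, but the requester was told two.
pipe = queue.Queue()
for reply in ("r1", "r2", "r3"):
    pipe.put(reply)
early = collect_with_hint(pipe, expected_count=2, timeout_s=0.05)
ignored = pipe.qsize()  # "r3" arrived in time but is never read

# Hint too high: the call waits out the whole timeout and returns less.
pipe2 = queue.Queue()
pipe2.put("only")
short = collect_with_hint(pipe2, expected_count=5, timeout_s=0.05)
```

The first case is the nasty one: the call returns early and a perfectly
timely reply is dropped.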

A side note: Having the timeout in the socket rather than on top of it
allows for algorithm optimisation. The messages can for example contain
a TTL field saying "you can drop this request as the timeout has expired
and thus your reply will be ignored anyway".
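One possible shape for such a TTL, sketched under the assumption that the
request carries its remaining time budget in milliseconds and that every
hop subtracts the time it held the message (which avoids needing
synchronised clocks). The Request type, its fields and after_hop are all
hypothetical names, not an existing wire format.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Request:
    body: bytes
    ttl_ms: int  # time budget left before the requester stops listening


def after_hop(req: Request, elapsed_ms: int) -> Optional[Request]:
    """Charge a hop's holding time against the TTL; None means drop it."""
    remaining = req.ttl_ms - elapsed_ms
    if remaining <= 0:
        # The requester's timeout has expired, so any reply would be
        # discarded anyway. Skip the work.
        return None
    return replace(req, ttl_ms=remaining)


original = Request(body=b"query", ttl_ms=100)
forwarded = after_hop(original, elapsed_ms=30)    # still worth processing
expired = after_hop(forwarded, elapsed_ms=70)     # budget used up: dropped
```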

> However, I still think that using the PATCH as is can be useful.
> Calls that required only a subset of the nodes to answer or that
> required an asynchronous pattern (like mongrel2 handlers) cannot be
> done with such strict collecting policy (one point for your proposal
> on this case ;) ).

