[zeromq-dev] Multiplexing on top of TCP and subports

Martin Sustrik sustrik at 250bpm.com
Thu Oct 6 13:53:31 CEST 2011


On 10/06/2011 01:34 PM, Ian Barber wrote:
> On Thu, Oct 6, 2011 at 12:05 PM, Martin Sustrik<sustrik at 250bpm.com>  wrote:

>>> 1. How do we communicate the "agreed upon message rate" throughout the
>>> distribution tree?

>>> 2. How do we notify the applications that they are not adhering to the
>>> message rate contract?

>> 3. The network itself takes part in the contract, ie. it should be able to
>> transfer the messages at the agreed upon rate. Who are we going to notify if
>> it's not keeping up with the rate? Network admin presumably... But how
>> exactly?

> For point 1, that depends on the question of where the message rate
> comes from - do you have the rate controlled at the publisher (as with
> PGM, I guess), with the clients' responsibility to keep up? That results
> in something pretty similar to the current setup, but with the
> addition of some sort of 'expected rate' message that could be
> propagated to leaves of the distribution tree. Or do you have clients
> advertise the rates at which they consume, and push that
> responsibility back up? That was roughly where I was going with my
> initial thought, but it is effectively just setting a window size
> which would then need a flow of what are basically ACKs back up for
> every message to add sending credit, which seems like a lot of
> needless traffic even if you aggregate at each node.

My feeling about this is that in any distribution tree there is a single 
centralised authority to declare the rate: the publisher.

This obviously works only with true trees. If there are multiple 
publishers involved you have no guarantee about the rate.

That's one of the more profound reasons why I've proposed, in the SP
working group, that the PUB/SUB specification should limit itself to
distribution trees. See point 3 in the following message:

http://groups.google.com/group/sp-discuss-group/msg/06d6d7d26e7694c8

> For point 2, I think the behavior of blocking on HWM would be better
> in the 'rate exceeded' case for publishers. For consumers, I'm not sure
> how you could signal that they were consuming at too slow a rate,
> unless you had explicit HWM hit messages - which itself gets back to
> the problems that we discussed back at the unconf in Brussels.

Ack. "Rate exceeded" error on the publisher would be nice.

On the consumer you can identify the problem if you encounter missing
messages. However, that doesn't necessarily mean the problem is you
being too slow to consume. The messages may have been lost somewhere up
the tree. More thinking to be done...
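
A minimal sketch of that detection step in Python, assuming every
message carries a monotonically increasing sequence number (as in the
NACK scheme quoted below). GapDetector and observe() are hypothetical
names, not part of any 0MQ API:

```python
class GapDetector:
    """Tracks the sequence numbers seen by one consumer (hypothetical helper)."""

    def __init__(self):
        self.expected = None  # unknown until the first message arrives

    def observe(self, seq):
        """Return the sequence numbers that were skipped before `seq`."""
        if self.expected is None or seq < self.expected:
            # First message, or a late/duplicate one: nothing new is missing.
            missing = []
        else:
            missing = list(range(self.expected, seq))
        if self.expected is None or seq >= self.expected:
            self.expected = seq + 1
        return missing


detector = GapDetector()
gaps = [detector.observe(seq) for seq in (5, 6, 9, 8, 10)]
```

Note that a non-empty result only says that messages went missing. It
says nothing about whether they were dropped locally or higher up the
tree.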

> For point 3, the basic problem doesn't seem dissimilar to the TCP
> congestion issue - you agree a maximum rate for flow control (window
> size, which is really receiver controlled), but then actually have to
> scale the window size up to that maximum during the communication in
> order to test what the network can actually handle - if there's packet
> loss, you scale back. That has the advantage that the network is
> treated as a dumb resource - which would probably be the right general
> behavior for PUB/SUB as well. This would need a method of adjusting
> the rate throughout the distribution tree dynamically though!
>
> I guess, if you had a separate nack channel (as in some of Pieter's
> guide patterns), you could use somewhat similar logic:
>
> Publishers have a publishing channel, a NACK channel, and a predefined
> minimum rate, maximum rate, a rate increment, and clock interval
> Publishers start publishing on the minimum rate
> After each clock interval passes, publishers double their rate, until
> the rate reaches maximum
>
> Consumers read messages
> New consumers read from the first message they receive - they do not
> request old messages with the NACK channel
> If a consumer does not get contiguous sequence numbers in messages, it
> sends a NACK to the publisher
>
> When the publisher receives a NACK, it halves the message rate
> After a clock interval with no NACK, having previously received one,
> it increases the rate by one rate increment per clock interval
>
> NACKs could be sent up the chain to allow centrally adjusting rate.
>
> This could also be built easily on top of existing zmq pubsub.

Yes. But think of the slow/dead/malevolent consumer issue: It will 
eventually stop the whole distribution tree by gradually decreasing the 
message rate until it grinds to a halt.

As the size of the distribution tree increases, the downtime of the 
system is going to increase as well until at some point it will reach 100%.
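
To make the failure mode concrete, here is a rough sketch of the quoted
scheme in Python. RateController, tick() and all the numbers are made up
for illustration, and two things are assumptions: NACKs are evaluated
once per clock interval, and the "predefined minimum rate" acts as a
floor. Without that floor the rate would decay towards zero.

```python
class RateController:
    """Publisher-side rate logic for the quoted NACK scheme (hypothetical)."""

    def __init__(self, min_rate, max_rate, increment):
        self.min_rate = min_rate
        self.max_rate = max_rate
        self.increment = increment
        self.rate = min_rate      # publishers start at the minimum rate
        self.seen_nack = False    # switches doubling to additive increase

    def tick(self, nack_received):
        """Advance one clock interval and return the new rate."""
        if nack_received:
            # Halve on NACK (assumption: never drop below min_rate).
            self.seen_nack = True
            self.rate = max(self.min_rate, self.rate / 2)
        elif self.seen_nack:
            # After a NACK, recover by one increment per quiet interval.
            self.rate = min(self.max_rate, self.rate + self.increment)
        else:
            # No NACK seen yet: double until the maximum is reached.
            self.rate = min(self.max_rate, self.rate * 2)
        return self.rate


ctl = RateController(min_rate=100, max_rate=1600, increment=100)
ramp_up = [ctl.tick(False) for _ in range(5)]   # healthy tree
stalled = [ctl.tick(True) for _ in range(6)]    # one consumer NACKs every interval
recovery = [ctl.tick(False) for _ in range(2)]  # the bad consumer goes away
```

A single consumer that NACKs every interval is enough to hold the whole
tree at the floor, and recovery afterwards is only additive.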

Martin


