[zeromq-dev] Proposal: timeout based overload protection zeromq recv queues

Thijs Terlouw thijsterlouw at gmail.com
Thu Mar 10 15:10:58 CET 2011


I am considering writing a patch to add extra overload protection to
ZeroMQ. I believe that, under certain circumstances, an application based
on ZeroMQ will be unable to recover from a full (receive) queue if the time
it takes to handle a request is close to the timeout in the client.

Example: in a request-reply setup, one request takes very long to process
because of some problem, and other requests queue up behind it. We then
start on the head of the queue, but by the time we have processed that
request, it has already timed out in the client. The processing time we
just wasted makes the next item late as well. Then we process the one after
that, also too late, and so on. There must be a professional term for it,
but I cannot think of it :) Snowball effect?
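
To make the effect concrete, here is a rough simulation of a single worker
reading from a FIFO queue. All numbers are made up for illustration: one
request per second, 900 ms service time, a 1000 ms client timeout, and a
single 5000 ms stall on the first request. The `simulate` function and its
parameters are hypothetical names, not anything in ZeroMQ.

```python
def simulate(ttl_ms=None, n=20, interval_ms=1000, service_ms=900,
             stall_ms=5000, timeout_ms=1000):
    """Return how many of n requests are answered within the client timeout.

    If ttl_ms is given, a request that waited longer than ttl_ms in the
    queue is discarded instead of processed.
    """
    clock = 0      # current time at the worker, in ms
    on_time = 0
    for i in range(n):
        arrival = i * interval_ms
        clock = max(clock, arrival)        # worker idles until work arrives
        waited = clock - arrival           # time this request sat in the queue
        if ttl_ms is not None and waited > ttl_ms:
            continue                       # stale: drop without processing
        clock += stall_ms if i == 0 else service_ms
        if clock - arrival <= timeout_ms:
            on_time += 1
    return on_time


print(simulate(ttl_ms=None))   # 0: every reply is late, the queue never catches up
print(simulate(ttl_ms=100))    # 15: stale requests are dropped, the rest are on time
```

Without a TTL the backlog shrinks by only 100 ms per request, so with these
numbers every one of the 20 replies arrives after the client gave up.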

In the servers we normally write, we add a timestamp to each item when it
is added to the queue. When a worker picks up an item from the queue, it
first checks the timestamp to make sure the item has not expired yet. If it
has expired, we discard it and move on to the next one. This way we clean
up the queue and make sure we only process items that are still worth
processing (not timed out).
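
A minimal sketch of that pattern, using only the Python standard library.
The class name `TimestampedQueue` and the method `get_fresh` are invented
for this example; the injectable `clock` argument is an assumption that
makes the behaviour easy to test.

```python
import queue
import time


class TimestampedQueue:
    """FIFO queue that stamps items on entry and skips expired ones on exit."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._q = queue.Queue()
        self._ttl = ttl_seconds
        self._clock = clock

    def put(self, item):
        # Record when the item entered the queue, alongside the item itself.
        self._q.put((self._clock(), item))

    def get_fresh(self):
        """Return the next item that has not expired, or None if none is left."""
        while True:
            try:
                enqueued_at, item = self._q.get_nowait()
            except queue.Empty:
                return None
            if self._clock() - enqueued_at <= self._ttl:
                return item
            # Expired: discard it and look at the next item.
```

A worker would call `get_fresh()` instead of a plain `get()`, so time is
only spent on requests whose client may still be waiting.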

ZMQ_SWAP puts messages on disk (so they are not lost) and ZMQ_HWM makes
sure the queue doesn't grow too large. But there is no protection that
removes requests that are "too old". I propose adding a ZMQ_QUEUE_TTL
socket option (in ms). If a dequeued item is older than ZMQ_QUEUE_TTL, it
is discarded immediately. I am thinking of the receive-side queue that
holds messages before the application calls zmq_recv(). Before I implement
this, I would like to hear your opinions.

Putting the timestamp in the message at the application level is not an
option, I believe, because we're interested in the time the message spends
in the queue. I also realize that with several devices in the middle, a
simple approach would ignore the time spent in the previous queues. For my
use case that would be no problem though. Another approach could be to
drain the ZeroMQ queue immediately in the app and fill an app-level queue,
but I think that duplicates work. Any other ideas?
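
For reference, the drain-into-an-app-level-queue alternative could look
roughly like this. `recv_nowait` is a hypothetical callable supplied by the
application that returns one message, or None when nothing is waiting (for
example a wrapper around a non-blocking zmq_recv() that reports EAGAIN).
The function name `drain` is also just for illustration.

```python
import time
from collections import deque


def drain(recv_nowait, backlog, clock=time.monotonic):
    """Move every message that is available right now into backlog.

    Each message is stored as (timestamp, message). The timestamp is the
    time of draining, so it only approximates the arrival time if drain()
    is called often. Returns the number of messages moved.
    """
    moved = 0
    while True:
        msg = recv_nowait()
        if msg is None:
            return moved
        backlog.append((clock(), msg))
        moved += 1
```

The worker then applies the same timestamp check as in our own servers to
the entries in `backlog`, which is exactly the bookkeeping I would rather
not repeat in every application.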

-- 
Thijs Terlouw,
Shenzhen, China
http://www.startinchina.com