[zeromq-dev] Handling network failures in parallel pipeline ventilator?

Andrew Hume andrew at research.att.com
Thu Aug 23 23:16:39 CEST 2012


repeat after me:

	a load-balancing fair-share message routing system is NOT a job scheduler.

it can do a closely related thing, but it is not a job scheduler.
there are several ways to do this; i'm sure the Guide covers a couple.
i would normally handle this in one of two ways:

a) if tasks are expensive, then don't push tasks around. have workers ask for a task
	(e.g. using REQ/REP) one at a time. (a minimal worker sketch for this follows the list.)
b) if tasks are inexpensive (and efficiency matters), then shovel tasks via PUSH
	at the workers (who PULL), using a modest HWM. if one worker gets 1000 tasks
	and another gets 10 (because there were only 1010), who cares? the tasks are cheap.
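
for (a), the worker side is only a dozen lines. since you're on java/jzmq, here is a minimal sketch; the endpoints, the "READY" request and the doWork() helper are made up for illustration, not anything the Guide prescribes:

import org.zeromq.ZMQ;

public class PullWorker {
    public static void main(String[] args) {
        ZMQ.Context ctx = ZMQ.context(1);

        ZMQ.Socket tasks = ctx.socket(ZMQ.REQ);   // ask the scheduler for work
        tasks.connect("tcp://scheduler:5555");    // hypothetical task endpoint

        ZMQ.Socket acks = ctx.socket(ZMQ.PUSH);   // report completions back
        acks.connect("tcp://scheduler:5556");     // hypothetical ack endpoint

        while (true) {
            tasks.send("READY".getBytes(), 0);    // request exactly one task
            byte[] task = tasks.recv(0);          // blocks until the scheduler replies
            byte[] taskId = doWork(task);         // placeholder work function returning the task id
            acks.send(taskId, 0);                 // lets the scheduler mark this task done
        }
    }

    private static byte[] doWork(byte[] task) {
        // real work goes here
        return task;                              // placeholder: echo the task as its own id
    }
}

since a worker only ever holds the one task it asked for, unplugging its cable strands at most one task, and the timeout described below recovers that.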

in each case, when a task gets sent, it gets a timestamp and a timeout. workers PUSH back an
acknowledgement when they complete a task, and the scheduler process marks that task done.
when a task times out without an acknowledgement, the scheduler simply sends it out again.
this should be enough to do what you describe.
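
the bookkeeping is plain java, nothing zeromq-specific. a rough sketch (the string task ids, byte[] payloads and the method names are just assumptions):

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// tracks which tasks are waiting, which are in flight, and when to resend
public class TaskTable {
    private final Map<String, Long> sentAt = new HashMap<String, Long>();       // taskId -> send time
    private final Map<String, byte[]> payloads = new HashMap<String, byte[]>(); // taskId -> task body
    private final Deque<String> ready = new ArrayDeque<String>();               // waiting to be handed out
    private final long timeoutMs;

    public TaskTable(long timeoutMs) { this.timeoutMs = timeoutMs; }

    public void submit(String taskId, byte[] body) {   // new work enters here
        payloads.put(taskId, body);
        ready.addLast(taskId);
    }

    public String next() { return ready.pollFirst(); } // null when nothing is waiting

    public byte[] handOut(String taskId) {             // call when a worker takes a task
        sentAt.put(taskId, System.currentTimeMillis());
        return payloads.get(taskId);
    }

    public void acked(String taskId) {                 // call when a worker reports completion
        sentAt.remove(taskId);
        payloads.remove(taskId);
    }

    public void requeueTimedOut() {                    // call every second or so
        long now = System.currentTimeMillis();
        Iterator<Map.Entry<String, Long>> it = sentAt.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> e = it.next();
            if (now - e.getValue() > timeoutMs) {
                ready.addLast(e.getKey());             // schedule it again
                it.remove();
            }
        }
    }
}

the scheduler loop calls next()/handOut() when it hands work to a worker, acked() when an acknowledgement arrives on its PULL socket, and requeueTimedOut() periodically. note that a resent task can occasionally be executed twice, which you said you can live with.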

admittedly, the scheduler is a single point of failure, but that is the price of simplicity.
(and with frequent checkpointing, you can mitigate the effects of a scheduler failure.)
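
checkpointing can be as crude as serializing the pending-task map to disk every few seconds; a restarted scheduler reloads the file and resends whatever was in flight. a sketch, with the file layout purely an assumption (use whatever you already persist with):

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;

public class Checkpoint {
    // write the pending tasks to path, atomically enough for this purpose
    public static void write(Map<String, byte[]> pending, String path) throws IOException {
        String tmp = path + ".tmp";
        ObjectOutputStream out = new ObjectOutputStream(new FileOutputStream(tmp));
        try {
            out.writeObject(new HashMap<String, byte[]>(pending)); // copy, then serialize
        } finally {
            out.close();
        }
        // the rename is atomic on posix, so a crash mid-write leaves the old checkpoint intact
        new File(tmp).renameTo(new File(path));
    }
}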

	andrew

On Aug 23, 2012, at 2:01 PM, Joe Planisky wrote:

> I'm a little stumped about how to handle network failures in a system that uses PUSH/PULL sockets as in the Parallel Pipeline case in chapter 2 of The Guide.
> 
> As in the Guide, suppose my ventilator is pushing tasks to 3 workers.  It doesn't matter which task gets pushed to which worker, but it's very important that all tasks eventually get sent to a worker.  
> 
> Everything is working fine; tasks are being load balanced to workers, workers are doing their thing and sending the results on to a sink. Now suppose there's a network failure between the ventilator and one of the workers, say the Ethernet cable to one of the worker machines gets unplugged.
> 
> Based on what we've seen in practice, the ventilator socket will still attempt to push some number of tasks to the now disconnected worker before realizing there's a problem.  Tasks intended for that worker start backing up, presumably in ZMQ buffers and/or in buffers in the underlying OS (Ubuntu 10.04 in our case).  Eventually, the PUSH socket figures out that something is wrong and stops trying to send additional tasks to that worker. All new tasks are then load balanced to the remaining workers.  
> 
> However, the tasks that are queued up for the disconnected worker are stuck and are never sent anywhere unless or until the original worker comes back online.  If the original worker never comes back, those tasks never get executed. (If it does come back, it gets a burst of all the backed up tasks and the PUSH socket resumes load balancing new tasks to all 3 workers.)
> 
> We'd like to prevent this backup from happening or at least minimize the number of tasks that get stuck.  We've tried setting high water marks, send and receive timeouts, and send and receive buffer sizes in ZMQ to small values (e.g. 1) hoping that it would cause the PUSH socket to notice the problem sooner, but at best we still get several dozen task messages backed up before the socket notices the problem and stops trying.  (Our task messages are small, about 520 bytes each.)
> 
> If we have to, we can deal with the same task getting sent to more than one worker on an occasional basis, but we'd like to avoid that if possible.
> 
> We're using ZMQ 2.2.0, but are also investigating the 3.2.0 release candidate.  If it matters, we're accessing ZMQ with Java using the jzmq bindings.  The underlying OS is Ubuntu 10.04.
> 
> Any suggestions for how to deal with this?
> 
> --
> Joe
> 
> 


------------------
Andrew Hume  (best -> Telework) +1 623-551-2845
andrew at research.att.com  (Work) +1 973-236-2014
AT&T Labs - Research; member of USENIX and LOPSA



