[zeromq-dev] Load balancing algorithm

Matt Weinstein matt_weinstein at yahoo.com
Mon Jun 28 20:44:47 CEST 2010


We had a similar problem with a SCSI bus years ago with floppy drives  
(!) and hard disks hanging off it, and the floppy controller didn't  
release the bus until it was done (!)

I implemented an n-dimensional DDA(*) to balance the bus, so e.g. if  
the floppy took 10x a hard disk, it would only get the bus every tenth  
cycle.  This required measuring the duration of requests, etc., and  
handling empty work queues by decaying the error term.

Provided you can slap a metric on the requests, this approach could  
work.

Best,
Matt

(*) http://en.wikipedia.org/wiki/Digital_Differential_Analyzer_(graphics_algorithm)

HINT: you don't need to use floating point :-)

On Jun 28, 2010, at 12:08 PM, Brian Granger wrote:

> Hi,
>
> The current load balancing algorithm used by the XREQ socket is
> defined in lb.cpp.  Requests are simply round robin'ed through the
> active pipes:
>
> current = (current + 1) % active
>
> The problem with this approach is that it has a potential severe
> bottleneck.  Let's say every N request takes a very long time to
> process, but all others are quick to process.  Let's say there are A
> active clients to the XREQ sockets that will do the processing.
>
> If A==N (or they are integer multiples), then the same client will
> always get the request that takes a long time to request.  Here is the
> simple example with A=N=2:
>
> requests = [short, long, short, long, short, long, ...]
>
> client 1 will process = [short, short, short, ...]
> client 2 will process = [long, long, long, long,...]
>
> I am wondering if we can do better than this.  My idea is to pick a
> client at random, rather that walking through them sequentially.  Here
> is a patch, that I think might implement this:
>
> diff --git a/src/lb.cpp b/src/lb.cpp
> index ca93ba2..f88fa18 100644
> --- a/src/lb.cpp
> +++ b/src/lb.cpp
> @@ -22,6 +22,7 @@
> #include "lb.hpp"
> #include "pipe.hpp"
> #include "err.hpp"
> +#include <stdlib.h>
>
> zmq::lb_t::lb_t () :
>     active (0),
> @@ -90,7 +91,7 @@ int zmq::lb_t::send (zmq_msg_t *msg_, int flags_)
>     //  continue round-robinning (load balance).
>     if (!more) {
>         pipes [current]->flush ();
> -        current = (current + 1) % active;
> +        current = random() % active;
>     }
>
>     //  Detach the message from the data buffer.
>
> I don't fully understand how the active sockets and flushing works, so
> this might not be a full implementation.  But, I wanted to get
> feedback on this idea.  There are some issues that might appear
> though:
>
> * You probably want to make sure that current changes with each  
> iteration.
> * For small numbers of active clients, and small number of requests,
> statistical fluctuations could lead to unequal load balancing.
>
> Thoughts?
>
> Cheers,
>
> Brian
>
> -- 
> Brian E. Granger, Ph.D.
> Assistant Professor of Physics
> Cal Poly State University, San Luis Obispo
> bgranger at calpoly.edu
> ellisonbg at gmail.com
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev




More information about the zeromq-dev mailing list