[zeromq-dev] Parallel sort
Oliver Smith
oliver at kfs.org
Mon Aug 2 04:22:59 CEST 2010
So, I decided to try and figure out how to implement a parallel sort
with ZeroMQ message passing.
At this point, I have more fundamental issues than choosing a good
parallel sort algorithm so -- for now, forget that is the final goal.
Issue 1: The worker threads be created before any work is dispatched, so
they can be re-used.
- Multiple threads attempting to bind() or connect() ZMQ_REP on an
inproc:// socket fail with address in use or connection refused. (D'oh)
Issue 2: Ability to "join" pairs of work for divide and conquer.
- Yeah... I may just have been looking at the screen too long today, but
I can't figure an elegant (and scalable) way to "join" two completed
sub-tasks.
Approach 1 (I'm trying to do this in C++, but explaining in Python)
def divideAndConquer1(work):
if len(work) <= minsize:
return directWork(work)
splitAt = len(work) / 2
dispatch(divideAndConquer1, work[:splitAt]) # Dispatch right
sub-unit
divideAndConquer1(work[:splitAt]) # Process left sub-unit myself
... join ...
postProcess(work)
return work
Approach 2
def divideAndConquer2(work):
if len(work) <= minsize:
return directWork(work)
splitAt = len(work) / 2
dispatch(divideAndConquer2, work[:splitAt]) #
Dispatch both workloads...
dispatch(divideAndConquer2, work[splitAt:])
... join ...
postProcess(work)
return work
I also experimented with a third approach, where the top-level
dispatching routine starts from the bottom up, i.e. dispatching all the
"minsize" workloads for directWork, waiting for them all to complete,
and then combining those work-units into pairs and dispatching these
super-units to the level-above processing routine.
But I couldn't figure out how to have a single sender dispatch to a
multiplexed in-socket capable of sending back a reply; I could use a
pair of down/up stream sockets like I do in AsyncWorker, but that would
mean the function could only be used from the master thread (i.e it
would be thread-unsafe).
Appreciate any feedback or advice on any of the above methods, or if
someone has actually implemented a pipelined parallel sort (bitonic or
otherwise) that they don't mind sharing....
- Oliver
More information about the zeromq-dev
mailing list