[zeromq-dev] Efficiency for very few short messages

dan smith dan25311 at gmail.com
Mon Jan 21 03:21:04 CET 2013


One example: solving the equation 8000 times in serial mode takes
5976036 microseconds.
Solving 8000 equations in parallel mode takes 1421224 microseconds. I
distribute the work equally among the threads; the speedup is good, ~4.

If I solve the same equation 800 times, the numbers are 622463 microseconds
and 202536 microseconds; the speedup decreases to ~3.

80 times: 64614 microseconds and 134429 microseconds; the serial version is
already faster.

Going down to 8: 6345 microseconds and 328286 microseconds...
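
The arithmetic behind these figures can be sketched in a few lines of C++. This is only an illustration built from the four measurements quoted above. The names Run, kRuns, speedup, serial_us_per_solve, excess_over_ideal_us and print_report are made up for the sketch, and the `cores` parameter is an assumption (4 would match the quad-core machine).

```cpp
#include <cstdio>

// One measurement from the message: number of solves and both wall times.
struct Run {
    int solves;
    double serial_us;
    double parallel_us;
};

// The four measurements quoted above, in microseconds.
static const Run kRuns[] = {
    {8000, 5976036.0, 1421224.0},
    {800, 622463.0, 202536.0},
    {80, 64614.0, 134429.0},
    {8, 6345.0, 328286.0},
};

// Speedup = serial time / parallel time; a value below 1 means serial wins.
inline double speedup(const Run& r) { return r.serial_us / r.parallel_us; }

// Average serial cost of a single solve.
inline double serial_us_per_solve(const Run& r) { return r.serial_us / r.solves; }

// Parallel time beyond the ideal serial_us / cores.
// `cores` is an assumption of this sketch, not a measured quantity.
inline double excess_over_ideal_us(const Run& r, int cores) {
    return r.parallel_us - r.serial_us / cores;
}

// Prints one summary line per measurement.
inline void print_report(int cores) {
    for (const Run& r : kRuns) {
        std::printf("%5d solves: speedup %6.3f, %7.1f us/solve serial, "
                    "%10.0f us over ideal\n",
                    r.solves, speedup(r), serial_us_per_solve(r),
                    excess_over_ideal_us(r, cores));
    }
}
```

Calling print_report(4) from main gives speedups of roughly 4.2, 3.1, 0.48 and 0.02, and a serial cost of about 750 to 810 microseconds per solve. The excess comes out negative for the 8000 case because the measured speedup there is slightly above the assumed 4 cores.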

I used zmq_stopwatch_start/stop.

Context and socket setup are not included; I create them before
launching the parallel part, so they do not count toward the measured time.

The very same code runs in serial mode and in the worker threads. It is a
direct equation solver from LAPACK (dgbtrf).

In parallel mode the code just creates a C++ instance and gets its data
from a pool, so there is not even a memory allocation. I send the pointer to
a thread that is already in a pool and waiting for a message. When the
message arrives, the thread runs the solver on the object the pointer
refers to.

To me this is reminiscent of the inproc_lat tests, which show high latency
when roundtrip_count is low, but I am not sure; I have no idea what the
reason could be.

Sending or receiving a message takes about 1 microsecond or less, I think,
when I send a large number of messages; I tested that.


On Sun, Jan 20, 2013 at 7:25 PM, Claudio Carbone <erupter at libero.it> wrote:

> Hi Dan.
> How long does it take to solve a single equation?
> It all depends on how you coded your app: there is a fixed cost that
> applies only to the multithreaded version; fixed means that it doesn't
> scale with the number of iterations.
> So when there are many iterations, this fixed time barely affects the
> total; when there are few, it exceeds the actual calculation time.
> Have you measured how long it takes to set up the zmq context and
> sockets? How long does it take to get ready to churn numbers?
> Maybe that part becomes the dominant cost when you process few
> equations.
> Claudio
> dan smith <dan25311 at gmail.com> wrote:
>> Dear All,
>> I have a multithreaded application. It uses a pool of threads. Each
>> thread in the pool communicates with the main thread via ZMQ_PAIR sockets,
>> one pair for each thread. The length of the messages is 8 bytes (a pointer
>> to a C++ object). On a quad-core machine I use 8 threads. Each thread in
>> the pool solves a linear equation system (a small one, but the solution
>> time is still much larger than the time needed for the communication
>> between the main thread and a worker).
>> The speedup is perfect... if a large number of equations are solved.
>> However, the requirement is to solve just 8 equations in parallel, one in
>> each thread at the same time. If I decrease the number of equations below
>> ~100, the serial solution becomes much faster than the multi-core solution.
>> Why is that, and how can this problem be solved? How can a multi-core
>> application be made efficient for small problems too?
>> Thank you very much in advance,
>> Danny
>> ------------------------------
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev