[zeromq-dev] zeromq performance

Charles Remes lists at chuckremes.com
Mon Jun 9 15:18:28 CEST 2014


Yes, the default is one I/O thread per context, and that thread handles all of your IPC and/or TCP traffic. inproc traffic never goes through the I/O threads, so if a context is used only for inproc you can set ZMQ_IO_THREADS to zero and skip creating any I/O thread at all.
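For example (a minimal sketch, assuming libzmq 4.x), the count is adjustable on the context with zmq_ctx_set(), as long as you do it before creating any sockets:

    #include <zmq.h>

    int main (void)
    {
        void *ctx = zmq_ctx_new ();

        /* Takes effect only for sockets created after this call. */
        zmq_ctx_set (ctx, ZMQ_IO_THREADS, 2);    /* e.g. two I/O threads */

        /* For an inproc-only context, zero I/O threads also works:     */
        /* zmq_ctx_set (ctx, ZMQ_IO_THREADS, 0);                        */

        zmq_ctx_term (ctx);
        return 0;
    }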

So, to duplicate your test using the ZMQ_AFFINITY socket option, you would want to specify that the sockets at all endpoints use the same I/O thread.
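Something like this would do it at each endpoint (a minimal sketch, assuming libzmq 4.x and a two-thread context; the REQ socket type and the endpoint string are just illustrative). Note that ZMQ_AFFINITY selects among the context's I/O threads, not CPU cores, and it must be set before zmq_connect()/zmq_bind():

    #include <zmq.h>
    #include <stdint.h>

    int main (void)
    {
        void *ctx = zmq_ctx_new ();
        zmq_ctx_set (ctx, ZMQ_IO_THREADS, 2);    /* I/O threads 0 and 1 */

        void *sock = zmq_socket (ctx, ZMQ_REQ);

        /* Bitmask: bit N routes subsequent connections to I/O thread N. */
        /* A mask of 1 means thread 0 only; set the same mask at both    */
        /* endpoints so they share one I/O thread.                       */
        uint64_t affinity = 1;
        zmq_setsockopt (sock, ZMQ_AFFINITY, &affinity, sizeof affinity);

        zmq_connect (sock, "ipc://testport2");

        zmq_close (sock);
        zmq_ctx_term (ctx);
        return 0;
    }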

cr

On Jun 7, 2014, at 12:00 PM, Paul Phillips <paul at marketgridsystems.com> wrote:

> I can look at this.  However, at this point, I am calling zmq_ctx_new() and not setting any options, so I assume that for each of my processes I only have one zmq I/O thread for all its sockets and connections.  Is that correct?
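> (For reference, one quick way to confirm is to read the value back with zmq_ctx_get(); a minimal sketch, assuming libzmq 4.x:)
> 
>     #include <zmq.h>
>     #include <stdio.h>
> 
>     int main (void)
>     {
>         void *ctx = zmq_ctx_new ();
>         /* Prints 1 unless ZMQ_IO_THREADS has been changed on this context. */
>         printf ("io threads: %d\n", zmq_ctx_get (ctx, ZMQ_IO_THREADS));
>         zmq_ctx_term (ctx);
>         return 0;
>     }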
> 
> Regards, Paul Phillips
> Director, MarketGrid Systems Pty Ltd
> t: +61 419 048 874
> e: paul at marketgridsystems.com
> 
> 
> 
> On 8 Jun 2014, at 2:55 am, Charles Remes <lists at chuckremes.com> wrote:
> 
>> This is a very interesting result. It’s probably worthwhile to add this to the FAQ.
>> 
>> It would be interesting to test this scenario with the ZMQ_AFFINITY option for zmq_setsockopt. If you have a chance, could you let us know if setting the I/O thread affinity results in a similar performance difference?
>> 
>> cr
>> 
>> On Jun 7, 2014, at 7:44 AM, Paul Phillips <paul at marketgridsystems.com> wrote:
>> 
>>> I have tracked down the source of this “problem”.  It turns out that running multiple local/remote pairs is not the key; what matters is which processor core each process runs on.  If local_lat and remote_lat run on the same core, the comms are very fast (whether ipc or tcp).  If they are on separate cores, the comms are slower.  When I ran multiple pairs, it just turned out that, for some reason, the second local/remote pair landed on the same core.  I can replicate the whole thing by using taskset to force local_lat and remote_lat onto either different cores or the same core.
>>> 
>>> I have pasted my results from two runs below, the first using separate cores and the second using the same core; the results are very different: 25 micros latency vs 7 micros latency.  (I have printed a couple of extra things at the end and renamed the executables to xlocal_lat and xremote_lat, but they are the same code from the perf directory in the 4.0.4 distribution.)
>>> 
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ taskset -c 0 ./xlocal_lat ipc://testport2 100 100000 &
>>> [1] 3566
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ taskset -c 1 ./xremote_lat ipc://testport2 100 100000 &
>>> [2] 3578
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ message size: 100 [B]
>>> roundtrip count: 100000
>>> average latency: 25.421 [us]
>>> elapsed: 5084170
>>> throughput: 19668
>>> 
>>> [1]-  Done                    taskset -c 0 ./xlocal_lat ipc://testport2 100 100000
>>> [2]+  Done                    taskset -c 1 ./xremote_lat ipc://testport2 100 100000
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ 
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ 
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ taskset -c 0 ./xlocal_lat ipc://testport2 100 100000 &
>>> [1] 3581
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ taskset -c 0 ./xremote_lat ipc://testport2 100 100000 &
>>> [2] 3584
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ message size: 100 [B]
>>> roundtrip count: 100000
>>> average latency: 7.016 [us]
>>> elapsed: 1403130
>>> throughput: 71269
>>> 
>>> [1]-  Done                    taskset -c 0 ./xlocal_lat ipc://testport2 100 100000
>>> [2]+  Done                    taskset -c 0 ./xremote_lat ipc://testport2 100 100000
>>> (MarketGrid)[Paul at CentOS65-Dev tmp]$ 
>>> 
>>> Regards, Paul Phillips
>>> Director, MarketGrid Systems Pty Ltd
>>> t: +61 419 048 874
>>> e: paul at marketgridsystems.com
>>> 
>>> 
>>> 
>>> On 7 Jun 2014, at 6:17 pm, Pieter Hintjens <ph at imatix.com> wrote:
>>> 
>>>> How many messages are you sending? There will be a start-up cost that
>>>> can be disproportionate if you send only a few messages.
>>>> 
>>>> On Sat, Jun 7, 2014 at 6:54 AM, Paul Phillips
>>>> <paul at marketgridsystems.com> wrote:
>>>>> Hi.  I have an interesting scenario when testing zeromq 4.0.4 on CentOS 6.5.
>>>>> When I run local_lat and remote_lat using ipc, I get a latency of around 30
>>>>> micros.  However, if I first start a pair in the background with a large
>>>>> round-trip count (so they keep running for a long time) and then run a
>>>>> second pair, the second pair always reports a latency of around 7 micros.
>>>>> Basically, once I have one pair running in the background, subsequent runs
>>>>> always seem to be much faster.  Is there any known explanation for this
>>>>> behaviour?
>>>>> 
>>>>> Regards, Paul Phillips
>>>>> 
>>>>> Director, MarketGrid Systems Pty Ltd
>>>>> t: +61 419 048 874
>>>>> e: paul at marketgridsystems.com
