[zeromq-dev] Weird behavior with zmq_proxy performance

Pieter Hintjens ph at imatix.com
Tue Nov 19 13:28:25 CET 2013


I'll put together a test case in C and see if I can reproduce the problem...

On Tue, Nov 19, 2013 at 1:13 PM, Bruno D. Rodrigues
<bruno.rodrigues at litux.org> wrote:
> as requested I’ve created a ticket and updated the branch with the latest
> code and a perf/README.txt explaining how to run it (basically the
> instructions below)
>
> https://github.com/zeromq/libzmq/issues/757
>
>
> On Nov 10, 2013, at 13:08, Bruno D. Rodrigues <bruno.rodrigues at litux.org>
> wrote:
>
> I’ve branched the code to add the proxy code for testing:
> https://github.com/davipt/libzmq/tree/fix-002-proxy_lat_thr
>
> This now allows me:
>
> 1. current PUSH/PULL end-to-end test:
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 &
> local_thr bind-to=tcp://127.0.0.1:5555 message-size=500
> message-count=10000000 type=0 check=0 connect=0
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 &
> remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500
> message-count=10000000 type=0 check=0
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 1380100 [msg/s]
> mean throughput: 5520.400 [Mb/s]
>
> 2. PUB/SUB end-to-end test:
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:5555 500 10000000 1 &
> local_thr bind-to=tcp://127.0.0.1:5555 message-size=500
> message-count=10000000 type=1 check=0 connect=0
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:5555 500 10000000 1 &
> remote_thr connect-to=tcp://127.0.0.1:5555 message-size=500
> message-count=10000000 type=1 check=0
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 971666 [msg/s]
> mean throughput: 3886.664 [Mb/s]
>
> 3. same test via zmq_proxy, by switching local_thr from bind to connect:
>
> idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 &
> Proxy type=PULL|PUSH in=tcp://*:8881 out=tcp://*:8882
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 32 &
> local_thr bind-to=tcp://127.0.0.1:8882 message-size=500 message-count=100000
> type=32 check=0 connect=32
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 &
> remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500
> message-count=10000000 type=0 check=0
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 92974 [msg/s]
> mean throughput: 371.896 [Mb/s]
>
> 4. same test via proxy and PUB/SUB, including checking if every message
> arrives (*)
>
> idavi:perf bruno$ ./proxy tcp://*:8881 tcp://*:8882 1 &
> Proxy type=XSUB|XPUB in=tcp://*:8881 out=tcp://*:8882
>
> idavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 10000000 49 &
> local_thr bind-to=tcp://127.0.0.1:8882 message-size=500
> message-count=10000000 type=49 check=16 connect=32
>
> idavi:perf bruno$ ./remote_thr tcp://127.0.0.1:8881 500 10000000 17 &
> remote_thr connect-to=tcp://127.0.0.1:8881 message-size=500
> message-count=10000000 type=17 check=16
>
> message size: 500 [B]
> message count: 10000000
> mean throughput: 88721 [msg/s]
> mean throughput: 354.884 [Mb/s]
>
> (*) if check is enabled on the remote_thr, the message, if size>16, will
> contain a counter. The local_thr then verifies that the counters arrive in
> the expected order and that no message is lost. This is why the
> remote_thr needs to increase the HWM and sleep for one second in case of
> PUB/SUB.
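
The counter check described in the footnote could look roughly like the sketch below. This is only an illustration: the 8-byte big-endian layout at the start of the payload and the names stamp_counter, read_counter and check_counter are assumptions made here, not taken from the perf tools in the branch. The only detail taken from the post is that a counter is used when the message size is above 16 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout (hypothetical): 8-byte big-endian counter at offset 0. */
#define COUNTER_BYTES 8
/* From the post: a counter is only embedded when size > 16. */
#define MIN_CHECKED_SIZE 16

/* Sender side: write the counter into the payload.
 * Returns 1 if the message was stamped, 0 if it is too small to carry one. */
int stamp_counter(unsigned char *buf, size_t size, uint64_t counter)
{
    assert(buf != NULL);
    if (size <= MIN_CHECKED_SIZE)
        return 0;
    for (int i = 0; i < COUNTER_BYTES; i++)
        buf[i] = (unsigned char)(counter >> (8 * (COUNTER_BYTES - 1 - i)));
    return 1;
}

/* Decode the big-endian counter from the start of the payload. */
uint64_t read_counter(const unsigned char *buf)
{
    uint64_t value = 0;
    for (int i = 0; i < COUNTER_BYTES; i++)
        value = (value << 8) | buf[i];
    return value;
}

/* Receiver side: compare the received counter with the expected one.
 * Returns -1 if the counter is below the expected value (duplicate or
 * reordered message); otherwise returns the number of skipped messages
 * (0 means in order) and advances *expected past the received counter.
 * Messages too small to carry a counter are accepted unchecked. */
long check_counter(const unsigned char *buf, size_t size, uint64_t *expected)
{
    assert(buf != NULL && expected != NULL);
    if (size <= MIN_CHECKED_SIZE)
        return 0;
    uint64_t got = read_counter(buf);
    if (got < *expected)
        return -1;
    long skipped = (long)(got - *expected);
    *expected = got + 1;
    return skipped;
}
```

With PUB/SUB a non-zero return from check_counter would point at messages dropped before the subscription was established or at the HWM, which is the case the footnote guards against.
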
>
>
> So, then again, what is happening with the zmq_proxy?
>
>
>
>
> On Nov 7, 2013, at 22:15, Bruno D. Rodrigues <bruno.rodrigues at litux.org>
> wrote:
>
> I’ve been testing a lot of combinations of ZeroMQ over Java, between the
> pure-Java jeromq and jzmq, the JNI binding to the libzmq C code. My
> impression so far is that jeromq is way faster than the binding: not that
> the binding code isn’t great, but my feeling is that the JNI jump slows
> everything down. At a certain point I felt the need for a simple zmq_proxy
> network node, and I was pretty sure that the C code must be faster than
> jeromq. I have some ideas that could improve the jeromq proxy code, but it
> felt easier to just compile the zmq_proxy code from the book.
>
> Unfortunately something went completely wrong on my side so I need your help
> to understand what is happening here.
>
> Context:
> MacOSX Mavericks fully updated, MBPro i7 4x2 CPU 2.2GHz 16GB
> libzmq from git head
> (same for jeromq and libzmq, albeit I’m using my own fork so I can send
> pulls back)
> my data are JSON lines that range from about 100 bytes to some multi-MB
> exceptions, but the average of those million messages is about 500 bytes.
>
> Test 1: pure local_thr and remote_thr:
>
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8881 500 1000000 &
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> real 0m0.732s
> user 0m0.516s
> sys 0m0.394s
> message size: 500 [B]
> message count: 1000000
> mean throughput: 1418029 [msg/s]
> mean throughput: 5672.116 [Mb/s]
>
> Test 2: change local_thr to perform connect instead of bind, and put a proxy
> in the middle.
> The proxy is the first C code example from the book, available here
> https://gist.github.com/davipt/7361477
> iDavi:c bruno$ gcc -o proxy proxy.c -I /usr/local/include/ -L
> /usr/local/lib/ -lzmq
> iDavi:c bruno$ ./proxy tcp://*:8881 tcp://*:8882 1
> Proxy type=PULL/PUSH in=tcp://*:8881 out=tcp://*:8882
>
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> iDavi:perf bruno$ message size: 500 [B]
> message count: 1000000
> mean throughput: 74764 [msg/s]
> mean throughput: 299.056 [Mb/s]
>
> real 0m10.358s
> user 0m0.668s
> sys 0m0.508s
>
>
> Test 3: use the jeromq equivalent of the proxy:
> https://gist.github.com/davipt/7361623
>
> iDavi:perf bruno$ ./local_thr tcp://127.0.0.1:8882 500 1000000 &
> [1] 15816
> iDavi:perf bruno$ time ./remote_thr tcp://127.0.0.1:8881 500 1000000 &
> [2] 15830
> iDavi:perf bruno$
> real 0m3.429s
> user 0m0.654s
> sys 0m0.509s
> message size: 500 [B]
> message count: 1000000
> mean throughput: 293532 [msg/s]
> mean throughput: 1174.128 [Mb/s]
>
> This performance coming out of Java is okish, it’s here just for comparison,
> and I’ll spend some time looking at it.
>
> The core question is the C proxy: why is it more than 10 times slower than
> the no-proxy version?
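
For reference, the arithmetic behind the reported figures can be sketched as below. The function names throughput_mbps and slowdown are made up for this illustration; the formula (messages per second times message size times 8, divided by 10^6) is inferred from the msg/s and Mb/s pairs printed above, which it reproduces. Plugging in the numbers from Test 1 and Test 2 gives a throughput gap of roughly 19x between the direct run and the C proxy run, and a bit under 5x for the jeromq proxy.

```c
#include <stddef.h>

/* Megabits per second as printed in the "[Mb/s]" lines:
 * messages/s * bytes per message * 8 bits / 1e6. */
double throughput_mbps(double msgs_per_sec, size_t msg_size_bytes)
{
    return msgs_per_sec * (double)msg_size_bytes * 8.0 / 1e6;
}

/* How many times slower `measured` is compared with `baseline`,
 * both given in messages per second. */
double slowdown(double baseline_msgs_per_sec, double measured_msgs_per_sec)
{
    return baseline_msgs_per_sec / measured_msgs_per_sec;
}
```

For example, throughput_mbps(74764, 500) reproduces the 299.056 Mb/s line of the C proxy run, and slowdown(1418029, 74764) is the direct-versus-proxy ratio.
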
>
> One thing I noticed, by coincidence, is that on the upper side of the proxy,
> both with the C “producer” as well as the Java one, tcpdump shows me
> consistently packets of 16332 (or the MTU size if using ethernet, 1438 I
> think). This value is consistent for the 4 combinations of producers and
> proxies (jeromq vs c).
>
> But on the other side of the proxy, the result is completely different. With
> the jeromq proxy, I see packets of 8192 bytes, but with the C code I see
> packets of either 509 or 1010. It feels like the proxy is sending the
> messages one by one. Again, this value is consistent with the PULL consumer
> after the proxy, be it C or Java.
>
> So this is something on the proxy “backend” socket side of the zmq_proxy.
>
> Also, I see quite similar behavior with a PUB - [XSUB+Proxy+XPUB] - SUB
> version.
>
> What do I need to tweak in proxy.c?
>
> Thanks in advance
>
>
>
>



-- 
-
Pieter Hintjens
CEO of iMatix.com
Founder of ZeroMQ community
blog: http://hintjens.com


