[zeromq-dev] High CPU load for simple forwarding reactor

Auer, Jens jens.auer at cgi.com
Fri May 8 11:04:32 CEST 2015


Hi,

I am prototyping an application with ZeroMQ whose main functionality is to process a data stream of ~30,000 1 kB packets per second. Since the application also integrates other sources, e.g. sockets with lower bandwidth, timers etc., I thought a reactor would be a good architecture, with the data broadcast on a PUB socket. As a demo, I have created a small system with two applications:
1. A sender application sending 10,000 packets per second on a PUB socket (it sends 1000 packets every 100 milliseconds)
2. A reactor using zmq_poll to wait for data on a SUB socket
I've attached the source code for both programs. Basically, the reactor uses zmq_poll to wait for activity and then reads from the active socket in non-blocking mode until all available data is consumed. The messages are just forwarded to a PUB socket. A second socket is monitored, and when there is activity on this socket, the program is stopped.
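
In outline, the reactor loop looks like this (endpoints, the stop-socket type and the buffer size are simplified here; the attached reactor.cpp has the actual details):

#include <zmq.h>

int main ()
{
    void *ctx = zmq_ctx_new ();

    void *sub = zmq_socket (ctx, ZMQ_SUB);
    zmq_setsockopt (sub, ZMQ_SUBSCRIBE, "", 0);  //  subscribe to everything
    zmq_bind (sub, "tcp://*:5555");              //  senders connect here

    void *pub = zmq_socket (ctx, ZMQ_PUB);       //  forwarded stream
    zmq_bind (pub, "tcp://*:5556");

    void *stop = zmq_socket (ctx, ZMQ_PULL);     //  any message here stops us
    zmq_bind (stop, "tcp://*:5557");

    zmq_pollitem_t items [] = {
        { sub,  0, ZMQ_POLLIN, 0 },
        { stop, 0, ZMQ_POLLIN, 0 }
    };

    bool running = true;
    char buf [1024];                             //  sized for this demo's 1 kB packets
    while (running) {
        zmq_poll (items, 2, -1);                 //  block until activity

        if (items [0].revents & ZMQ_POLLIN) {
            //  Drain the SUB socket: non-blocking reads until EAGAIN
            int n;
            while ((n = zmq_recv (sub, buf, sizeof buf, ZMQ_DONTWAIT)) >= 0)
                zmq_send (pub, buf, n, 0);       //  forward unchanged
        }
        if (items [1].revents & ZMQ_POLLIN)
            running = false;
    }

    zmq_close (stop);
    zmq_close (pub);
    zmq_close (sub);
    zmq_ctx_term (ctx);
    return 0;
}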

When running three senders connecting to one reactor, the CPU load on my system (a VM, maybe that is important?) reaches 20%, which seems quite high. I first suspected that this was due to memory allocations/deallocations, because the program is not doing any other processing. So I plugged tcmalloc in to see if its caching changes anything. The improvement is minimal; CPU load now reaches 18%.
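
(For anyone reproducing this: tcmalloc can be plugged in without relinking by preloading it, e.g. LD_PRELOAD=/usr/lib/libtcmalloc.so ./reactor; the exact library path is system-specific.)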

For further analysis, I used the gperftools CPU profiler to see where the time is spent. After running the reactor for a little while, the output consistently is:
Total: 384 samples
     301  78.4%  78.4%      301  78.4% zmq::clock_t::rdtsc
      67  17.4%  95.8%       67  17.4% __poll_nocancel
       4   1.0%  96.9%        4   1.0% __lll_lock_wait_private
       4   1.0%  97.9%        4   1.0% zmq::clock_t::rdtsc (inline)
       3   0.8%  98.7%        3   0.8% __write_nocancel
       1   0.3%  99.0%        1   0.3% 0x00007fff353ba998
       1   0.3%  99.2%        1   0.3% __GI_madvise
       1   0.3%  99.5%        1   0.3% __lll_unlock_wake_private
       1   0.3%  99.7%        1   0.3% __nanosleep_nocancel
       1   0.3% 100.0%        2   0.5% _int_free

With tcmalloc, the result is:
     613  81.6%  81.6%      613  81.6% zmq::clock_t::rdtsc
     126  16.8%  98.4%      126  16.8% __poll_nocancel
       2   0.3%  98.7%        2   0.3% 0x00007fff215c7998
       2   0.3%  98.9%        2   0.3% __GI_madvise
       2   0.3%  99.2%        2   0.3% __write_nocancel
       1   0.1%  99.3%        1   0.1% PackedCache::KeyMatch (inline)
       1   0.1%  99.5%        1   0.1% __nanosleep_nocancel
       1   0.1%  99.6%        1   0.1% _init
       1   0.1%  99.7%        4   0.5% tc_free
       1   0.1%  99.9%        1   0.1% zmq::clock_t::rdtsc (inline)
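
(Both profiles were collected with the gperftools CPU profiler by preloading libprofiler and setting CPUPROFILE, then dumping the text report with pprof, roughly:

  LD_PRELOAD=/usr/lib/libprofiler.so CPUPROFILE=/tmp/reactor.prof ./reactor
  pprof --text ./reactor /tmp/reactor.prof

the library and output paths are system-specific.)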

I am surprised that one hotspot is __poll_nocancel, but what strikes me most is the ~80% of the time spent in zmq::clock_t::rdtsc. Analyzing the call information, zmq::clock_t::rdtsc is called from zmq::socket_base_t::process_commands, which is called from zmq::poll, zmq::send and zmq::recv. I looked at the function, and it is an inline assembler instruction that reads a CPU counter. This should be quite fast.
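
For reference, on x86 with gcc the function boils down to a single rdtsc instruction (from libzmq's src/clock.cpp, roughly):

uint64_t zmq::clock_t::rdtsc ()
{
    uint32_t low, high;
    __asm__ volatile ("rdtsc" : "=a" (low), "=d" (high));
    return (uint64_t) high << 32 | low;
}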

I am kind of lost with this output and would appreciate any help or hints on how to optimize the logic to use less CPU time. As a short test, I created larger packets to reduce the number of calls to send/recv and poll. This reduces the load significantly, but I am really wondering why the hotspot is clock_t::rdtsc().
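
Concretely, the batching test changed the sender along these lines (the batch factor of ten and the endpoint are illustrative, and the reactor's receive buffer has to grow to match):

#include <zmq.h>
#include <cstring>
#include <unistd.h>

int main ()
{
    void *ctx = zmq_ctx_new ();
    void *pub = zmq_socket (ctx, ZMQ_PUB);
    zmq_connect (pub, "tcp://localhost:5555");

    char batch [10 * 1024];                      //  ten 1 kB packets per message
    memset (batch, 'x', sizeof batch);

    for (;;) {
        //  Same bandwidth as 1000 single packets per 100 ms,
        //  but a tenth of the send/recv/poll calls.
        for (int i = 0; i != 100; i++)
            zmq_send (pub, batch, sizeof batch, 0);
        usleep (100 * 1000);
    }
}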

Best wishes,
  Jens

--
Jens Auer | CGI | Software-Engineer
CGI (Germany) GmbH & Co. KG
Rheinstraße 95 | 64295 Darmstadt | Germany
T: +49 6151 36860 154
jens.auer at cgi.com
Our mandatory disclosures pursuant to § 35a GmbHG / §§ 161, 125a HGB can be found at de.cgi.com/pflichtangaben.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: reactor.cpp
Type: text/x-c++src
Size: 3165 bytes
Desc: reactor.cpp
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20150508/da8d4881/attachment.cpp>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sender.cpp
Type: text/x-c++src
Size: 1256 bytes
Desc: sender.cpp
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20150508/da8d4881/attachment-0001.cpp>

