[zeromq-dev] Inefficient TCP connection for my PUB-SUB zmq communication

Jim Melton jim at melton.space
Sat Mar 27 05:00:57 CET 2021

Small TCP packets will never achieve maximum throughput. This is independent of ZMQ. Every TCP segment carries fixed header and per-packet processing overhead, and the receiver has to acknowledge the data it gets.

For a 20 Gbps network, you need a larger MTU to achieve close to theoretical bandwidth, and each packet needs to be close to MTU. Jumbo MTU is typically 9000 bytes. The TCP ACK packets will kill your throughput, though.
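
To put rough numbers on the point about packet size, here is a minimal sketch of the per-packet arithmetic. It is only an illustration: the 66-byte header figure is an assumption (Ethernet 14 + IPv4 20 + TCP 20 + 12 bytes of timestamp options, which happens to match the 66B ACK size seen in the capture), the sample frame sizes are hypothetical points inside the reported ranges, and the function names are made up for this example.

```python
# Assumed per-frame header overhead (not stated in the thread):
# Ethernet 14 B + IPv4 20 B + TCP 20 B + 12 B of TCP timestamp options.
HEADER_BYTES = 14 + 20 + 20 + 12  # 66 bytes


def payload_efficiency(frame_bytes: int) -> float:
    """Fraction of a captured frame that is application payload."""
    if frame_bytes < HEADER_BYTES:
        raise ValueError("frame is smaller than its headers")
    return (frame_bytes - HEADER_BYTES) / frame_bytes


def packets_per_second(payload_bps: float, frame_bytes: int) -> float:
    """Data packets per second needed to carry payload_bps of payload."""
    payload_bytes = frame_bytes - HEADER_BYTES
    if payload_bytes <= 0:
        raise ValueError("frame carries no payload")
    return payload_bps / 8 / payload_bytes


if __name__ == "__main__":
    rate_bps = 1.1e9  # 1.1 Gbps, the per-socket rate quoted in the thread
    # 240/480/960 are hypothetical sizes inside the reported ranges;
    # 1514 and 9014 are full frames for MTU 1500 and MTU 9000.
    for frame in (240, 480, 960, 1514, 9014):
        eff = payload_efficiency(frame)
        pps = packets_per_second(rate_bps, frame)
        print(f"{frame:5d} B frame: {eff:6.1%} payload, {pps:10.0f} packets/s")
```

Under these assumptions the sender has to push several times more packets per second at a few hundred bytes per frame than at full MTU, and each of those packets costs CPU time in the kernel regardless of its size.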

Jim Melton
(303) 829-0447
jim at melton.space

> On Mar 26, 2021, at 4:17 PM, Francesco <francesco.montorsi at gmail.com> wrote:
> Hi all,
> I'm using ZMQ in a product that moves a lot of data using TCP as transport and PUB-SUB as communication pattern. "A lot" here means around 1Gbps. The software is actually a unidirectional chain of small components, each linked to the previous one with a SUB socket (to receive data) and a PUB socket (to send data to the next stage).
> I'm debugging an issue with one of these components receiving 1.1Gbps from its SUB socket and sending out 1.1Gbps on its PUB socket (no wonder the two numbers match, since the component does no aggregation whatsoever).
> The "problem" is that we are currently using 16 ZMQ background threads to move a total of 2.2Gbps for that software component (note the physical links can carry up to 20Gbps, so we're far from saturating the link). IIRC the "golden rule" for sizing the number of ZMQ background threads is 1Gbps = 1 thread.
> As you can see we're very far from this golden rule, and that's what I'm trying to debug.
> The ZMQ background threads have a CPU usage ranging from 80% to 98%.
> Using "strace" I see that most of the time for these threads is spent in the "sendto" syscall.
> So I started digging into the quality of the TX side of the TCP connection, recording a short trace of the traffic outgoing from the software component.
> Analyzing the traffic with Wireshark, it turns out that the TCP packets for the PUB connection are pretty small:
> * 50% of them are 66B long; these are the TCP ACK packets (incoming)
> * 21% of them are in the range 160B-320B 
> * 18% in the range 320B-640B
> * 5% in range 640B-1280B
> * just 3% reach the MTU equal to 1500B
> * [a fraction below 1% even exceeds the link MTU of 1500B; I'm not sure how that is possible]
> My belief is that having fewer packets, all close to the MTU of the link, should greatly improve performance. Would you agree with that?
> Is there any configuration I can apply on the PUB socket to force the Linux TCP stack to generate fewer but larger TCP segments on the wire?
> Thanks for any hint,
> Francesco
