[zeromq-dev] Dropped multipart messages?

Jim Melton jim at melton.space
Mon May 26 18:50:38 CEST 2025


Nick,

I think it’s best to think of multipart messages as a serialization technique (like fields in Protocol Buffers) rather than as separate messages. Because they are handled atomically at the API layer, a multipart message is an all-or-nothing thing. So when thinking about your design, don’t worry about whether something is multipart; just treat each quantum of communication between your nodes as a “message” (which may be a ZMQ multipart message).
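
To make the "serialization technique" view concrete, here is a minimal sketch in plain Python (standard library only). The functions pack_message and unpack_message are hypothetical names of mine, and the length-prefixed layout is an assumption chosen for illustration; it is not ZeroMQ's wire format. The point is only that a list of frames round-trips as one unit, or fails as one unit.

```python
import struct

def pack_message(frames):
    """Serialize a list of byte frames into one blob.

    Illustrative layout (an assumption, not ZeroMQ's wire format):
    a 4-byte frame count, then each frame as a 4-byte length plus body.
    """
    parts = [struct.pack("!I", len(frames))]
    for frame in frames:
        parts.append(struct.pack("!I", len(frame)))
        parts.append(frame)
    return b"".join(parts)

def unpack_message(blob):
    """Recover every frame, or raise; a partial message is never returned."""
    (count,) = struct.unpack_from("!I", blob, 0)
    offset = 4
    frames = []
    for _ in range(count):
        (size,) = struct.unpack_from("!I", blob, offset)
        offset += 4
        if offset + size > len(blob):
            raise ValueError("truncated message")
        frames.append(blob[offset:offset + size])
        offset += size
    if offset != len(blob):
        raise ValueError("trailing bytes after last frame")
    return frames

# A three-part message treated as a single quantum of communication.
message = [b"routing-key", b"", b"payload bytes"]
blob = pack_message(message)
restored = unpack_message(blob)
```

The receiver either gets all three frames back or gets an error; there is no state in which it holds only the first two.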

Your testing demonstrates expected behavior to me. With a very low high-water mark and a high data rate, you should expect to hit the limit. As the Socket API <https://zeromq.org/socket-api/#high-water-mark> says, 

> If this limit has been reached the socket enters an exceptional state and depending on the socket type, ZeroMQ will take appropriate action such as blocking or dropping sent messages. Refer to the individual socket descriptions below for details on the exact action taken for each socket type.


In your case, the “appropriate action” was dropping a message.
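
A toy model of that behavior, using the numbers from your test (groups of 3 frames, a limit of 10), might look like the following. This is a sketch under assumptions: BoundedPipe is a hypothetical class of mine, the limit is counted in whole messages, and the "drop" versus "block" choice is a constructor argument here, whereas in ZeroMQ it depends on the socket type as the documentation quoted above says. It does not reproduce libzmq's internals.

```python
from collections import deque

class BoundedPipe:
    """Toy queue with a high-water mark, counted in whole messages.

    on_full is "drop" or "block". Blocking is only signalled by a
    return value, so the sketch stays single-threaded.
    """

    def __init__(self, hwm, on_full):
        self.hwm = hwm
        self.on_full = on_full
        self.queue = deque()
        self.dropped = 0

    def send(self, frames):
        if len(self.queue) >= self.hwm:
            if self.on_full == "drop":
                self.dropped += 1      # the whole message is discarded
                return "dropped"
            return "would_block"       # caller would have to wait and retry
        self.queue.append(tuple(frames))  # all frames enter together
        return "queued"

    def recv(self):
        return list(self.queue.popleft()) if self.queue else None

# Fast sender, receiver that never drains: 15 three-part messages, limit 10.
pipe = BoundedPipe(hwm=10, on_full="drop")
results = [pipe.send([b"id", b"seq-%d" % i, b"data"]) for i in range(15)]

# Same load against a pipe that applies back-pressure instead.
blocking = BoundedPipe(hwm=10, on_full="block")
blocking_results = [blocking.send([b"id", b"seq-%d" % i, b"data"]) for i in range(15)]
```

In the dropping case the first 10 messages are queued and the remaining 5 are lost, each one entirely; no queued message ever has fewer than 3 frames.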

Stephen’s explanation of packet management was very good, and should help inform your design. You either need appropriately sized buffers, or some sort of flow-control technique.
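
As one possible shape for "some sort of flow-control technique", here is a sketch of credit-based flow control. To be clear, this is a technique I am substituting for illustration; it is neither what Stephen's system did nor something ZeroMQ does for you. CreditSender and simulate are hypothetical names, and the capacity and processing rate are made-up parameters. The receiver hands out one credit per free buffer slot, and the sender transmits only while it holds credit.

```python
from collections import deque

class CreditSender:
    """Sends only while it holds credit granted by the receiver."""

    def __init__(self):
        self.credit = 0
        self.pending = deque()

    def grant(self, n):
        self.credit += n

    def submit(self, message):
        self.pending.append(message)

    def pump(self, link):
        """Move pending messages onto the link, one credit per message."""
        sent = 0
        while self.pending and self.credit > 0:
            link.append(self.pending.popleft())
            self.credit -= 1
            sent += 1
        return sent

def simulate(total, capacity, per_round):
    """Slow receiver handling per_round messages per step; returns
    (delivered messages, deepest receiver buffer observed)."""
    if per_round <= 0:
        raise ValueError("per_round must be positive")
    receiver_buffer = deque()
    sender = CreditSender()
    sender.grant(capacity)             # initial credit equals buffer size
    for i in range(total):
        sender.submit([b"id", b"seq-%d" % i, b"data"])

    delivered, max_depth = [], 0
    while sender.pending or receiver_buffer:
        sender.pump(receiver_buffer)
        max_depth = max(max_depth, len(receiver_buffer))
        processed = 0
        while receiver_buffer and processed < per_round:
            delivered.append(receiver_buffer.popleft())
            processed += 1
        sender.grant(processed)        # return credit for each freed slot
    return delivered, max_depth

delivered, max_depth = simulate(total=35, capacity=10, per_round=3)
```

In this model credit plus buffered messages always equals the capacity, so the receiver's buffer cannot overflow and nothing is dropped; the cost is that the sender's own backlog grows while it waits for credit.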

—
Pragmatics must take precedence over elegance
For Nature is not impressed

Jim Melton, 303-829-0447
http://blogs.melton.space/pharisee/

> On May 23, 2025, at 21:39, Stephen Gray <stephen at forgottensound.net> wrote:
> 
> Hi. I'm reading this in the context of having worked with deeply embedded message routing in closed but WAN packet systems (as opposed to TDM) in subsea.
> 
> What I see in your problem description are the essential elements of a problem I encountered before. 
> Our routing nodes were FPGA based and design constrained by ultra low power limits.
> In the previous TDM solution, all packets had an allotted time, so the packet arrival rate and interval were predictable and stable. When we moved to a packet solution, we had random peak traffic rates to contend with, even though the average traffic rate had not increased.
> The solution now had to have larger buffers in each node. In order not to lose any packets, which was a requirement of the system, we needed large enough buffers to also handle the statistically unlikely highest peak rates, or else we observed actual packet loss.
> We looked at mechanisms which would evaluate traffic arrival density and take local or back-propagated action to throttle back previous nodes, but we found that there would still be rare cases where, by the time we detected buffer overflow, we had already lost at least one packet.
> 
> It's a conceptual view .. but by experimentation we ended up with handling the average traffic rate by throttling gradually when the peak arrival rate exceeded say 80% capacity, essentially relying on buffer space in preceding routing nodes, and handling the statistically rare losses with a retry mechanism.
> 
> As I see it, if there is a requirement for zero data loss, then a system of nodes with asynchronous send patterns needs much, much more buffer space.
> This is where routing mechanisms like ATM had advantages.
> 
> We had got too far in development to go back to the TDM solution .. but the optimal solution would have been TDM for the isochronous traffic and a residual buffer system for asynchronous packets in the remaining intervals.
> 
> Hope this perspective .. while a little general .. may help.
> 
> Best Regards
> Stephen
> 
> 
> On 23 May 2025 at 19:26, Nicholas Long via zeromq-dev <zeromq-dev at lists.zeromq.org> wrote:
> 
> 
> Hi, 
> 
> I am new to zmq, so this may be a bit of a dumb question.
> 
> I am using a dealer-router, where the dealer is sending a lot of multipart messages to the router. It seems like if I have some weird processing pattern on the router, where I do not service the messages correctly, then I get dropped multipart messages.
> 
> This is admittedly intermittent, and goes away if I increase the receive high water mark on the router. But I am able to reliably reproduce it if I send multipart messages (a group of 3) and set the receive high water mark to 10.
> 
> So it seems like there is some case where the router's zmq will silently drop multipart messages when there is not enough room for the whole message in the internal buffers. This is counter to my expectation that when there is not enough room in zmq's internal buffers, the copy from the socket to the zmq internal buffer would block, this blocking would then propagate back across the tcp socket, and that would cause the dealer's send function to block.
> 
> Am I missing something?
> 
> Thanks,
> Nick
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
> 
