[zeromq-dev] Why ZMQ drop messages?

Emmanuel TAUREL taurel at esrf.fr
Fri Nov 25 15:37:51 CET 2011


Hello everybody,

I have investigated further the problem that I recently reported on
this list.

On 23/11/2011 17:16, Emmanuel TAUREL wrote:
> Hello all,
>
> I am using ZMQ 3.0.x on Linux boxes with the PUB/SUB pattern.
> I have only one subscriber, which is very slow: it needs 1 second every
> time a message is read.
> I have a HWM on the publisher side set to 10.
>
> In my message, I have a counter which is incremented for each message.
> My messages are relatively small (150 bytes).
> The publisher prints the date each time it sends a message
> (gettimeofday).
> On the same host where the publisher is running, I also have a wireshark
> tool which captures network packets.
>
> With wireshark, I see that ZMQ drops messages number 11 to 39. I don't
> understand why.
> All the previous messages (number 1 to 10) have been sent on the network
> because I see them on wireshark.
> The time reported by wireshark is consistent with the time printed by the
> publisher.
>
> Message 8 sent by publisher at xxx623,205547
> Message 8 seen by wireshark at xxx623,205556
>
> Message 9 sent by publisher at xxx623,205575
> Message 9 seen by wireshark at xxx623,205584
>
> Message 10 sent by publisher at xxx623,205603
> Message 10 seen by wireshark at xxx623,205611
>
> Message 11 sent by publisher at xxx623,205629
> Message 12 sent by publisher at xxx623,205654
> Message 13 sent by publisher at xxx623,205704
> Message 14 sent by publisher at xxx623,205729
> ....
>
> These messages are not seen by wireshark because, I guess, ZMQ took the
> decision to drop them.
> But why did it take that decision? I don't think there are messages in the
> queue because I have seen them on
> the wire!
>
> Is there something I have missed?
> Any explanations are welcome.
>
> Thanks in advance
>
> Emmanuel
>
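
As a quick check of the figures quoted above, the following small Python
sketch computes the delay between each publisher print and the matching
wireshark capture, and the gap between consecutive sends. It is only an
illustration and not part of the test program; the names SAMPLES,
wire_delays and send_gaps are made up for this example.

```python
# Microsecond part of the timestamps quoted above; the seconds part
# (xxx623) is identical on every line, so it is left out.
# Each entry: (message number, publisher send time, wireshark time or None).
SAMPLES = [
    (8, 205547, 205556),
    (9, 205575, 205584),
    (10, 205603, 205611),
    (11, 205629, None),   # not seen by wireshark
    (12, 205654, None),
    (13, 205704, None),
    (14, 205729, None),
]


def wire_delays(samples):
    """Microseconds between the publisher print and the wireshark capture."""
    return {n: wire - sent for n, sent, wire in samples if wire is not None}


def send_gaps(samples):
    """Microseconds elapsed between consecutive sends by the publisher."""
    return [b[1] - a[1] for a, b in zip(samples, samples[1:])]


if __name__ == "__main__":
    print("wire delays (us):", wire_delays(SAMPLES))
    print("send gaps (us):  ", send_gaps(SAMPLES))
```

For the quoted lines, messages 8, 9 and 10 reach the wire 9, 9 and 8
microseconds after the publisher print, and the publisher sends a new
message every 25 to 50 microseconds.
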
From what I have understood, the problem is the following.
The pipe used for communication between my application publisher
thread and the zmq I/O thread is indeed marked as
full (msgs_written - peers_msgs_read == uint64_t (hwm) in the check_write
method of pipe.cpp) even though I have seen my messages on the wire (shown
by wireshark).
The I/O thread has indeed sent the messages on the wire and it has
sent the "activate_write" command to the pipe. When the publisher
thread sends a message, it processes the pipe command list (method
socket_base_t::process_commands()), but the "activate_write" commands
(there are several) are not executed immediately.
This is due to the code in this socket_base_t::process_commands() method
just before the loop processing the
commands. There is some code to optimize command processing which
decides to return from this
method before the commands are processed. If I comment out this part of
the code, things work much better and I do not notice dropped messages.
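
To make the mechanism concrete, here is a toy model in Python. It is not
the libzmq code: ToyPipe and simulate are hypothetical names, all times
are arbitrary values in microseconds, and the model assumes that the
early return in process_commands() is based on the time elapsed since
commands were last processed and that each "activate_write" command
carries the number of messages read by the I/O thread.

```python
from collections import deque


class ToyPipe:
    """Toy model of the writer side of a pipe with a high-water mark."""

    def __init__(self, hwm):
        self.hwm = hwm
        self.msgs_written = 0     # counted by the sending thread
        self.peers_msgs_read = 0  # last value learned through a command

    def check_write(self):
        # The pipe is full when the difference between the two counters
        # reaches the HWM, as in the condition quoted above.
        return self.msgs_written - self.peers_msgs_read < self.hwm


def simulate(n_msgs, hwm, send_interval_us, io_delay_us,
             max_command_delay_us, throttle=True):
    """Return the numbers of the messages dropped by the toy model."""
    pipe = ToyPipe(hwm)
    commands = deque()        # pending (ready_time_us, msgs_read) commands
    last_processed_us = None  # last time the command list was processed
    dropped = []

    for n in range(1, n_msgs + 1):
        now = (n - 1) * send_interval_us

        # Assumed optimization: skip command processing when it was
        # already done less than max_command_delay_us ago.
        skip = (throttle and last_processed_us is not None
                and now - last_processed_us <= max_command_delay_us)
        if not skip:
            last_processed_us = now
            while commands and commands[0][0] <= now:
                _, msgs_read = commands.popleft()
                pipe.peers_msgs_read = msgs_read  # "activate_write"

        if pipe.check_write():
            pipe.msgs_written += 1
            # The I/O thread reads the message io_delay_us later and
            # posts a command carrying its read counter.
            commands.append((now + io_delay_us, pipe.msgs_written))
        else:
            dropped.append(n)

    return dropped


if __name__ == "__main__":
    args = dict(n_msgs=30, hwm=10, send_interval_us=25, io_delay_us=9,
                max_command_delay_us=500)
    print("with early return:   ", simulate(throttle=True, **args))
    print("without early return:", simulate(throttle=False, **args))
```

With these made-up values the model drops messages 11 to 21 when the
early return is enabled and none when it is disabled, although every
written message has already been read by the I/O side. The exact range
depends entirely on the chosen delays, so it should not be compared
directly with the numbers I observed.
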

What do you think?

Thanks for your help

Emmanuel



