[zeromq-dev] 0MQ/2.0 Intermittent missing replies in REQ/REP

Ben Dyer Ben.Dyer at taguchimail.com
Wed Nov 11 01:54:01 CET 2009


Martin,

Absolutely -- I've sent you a GitHub pull request for the change.

Regards,
Ben

On 11/11/2009, at 09:53, Martin Sustrik wrote:

>
> Hi Ben,
>
> You are definitely right! What a stupid bug :(
>
> Are you OK with submitting the fix under the MIT license?
>
> Thanks.
> Martin
>
> On 10/11/2009, "Ben Dyer" <Ben.Dyer at taguchimail.com> wrote:
>
>> Hi,
>>
>> In testing a REQ/REP setup with multiple requesters connected to one
>> server, I've noticed that occasionally the final requester never
>> receives a reply, even though the server application is sending them
>> immediately. Aside from that single requester not receiving its reply,
>> everything else continues to function normally and the replier stays
>> active and handles other requests correctly.
>>
>> This only seems to happen under heavy load (involving many concurrent
>> requests from multiple sources), and using tcpdump I've determined
>> that the reply isn't actually being sent by the replier (at least not
>> to the correct requester).
>>
>> I haven't been able to create a setup which reproduces the issue
>> consistently outside of our application -- the problem is also
>> dependent on system load and possibly other factors.
>>
>> However, reviewing src/rep.cpp I noticed the following code in
>> rep_t::xrecv:
>>
>> //  Round-robin over the pipes to get next message.
>> for (int count = active; count != 0; count--) {
>>    bool fetched = in_pipes [current]->read (msg_);
>>    current++;
>>    if (current >= active)
>>        current = 0;
>>    if (fetched) {
>>        reply_pipe = out_pipes [current];
>>        waiting_for_reply = true;
>>        return 0;
>>    }
>> }
>>
>> This appears to set reply_pipe incorrectly in the event that
>> current >= active, so if there are multiple active pipes and a request
>> is received from the last, the reply to that request will be delivered
>> to the out_pipe for the first in_pipe, *not* the out_pipe matching the
>> in_pipe from which the request was read. I believe this is causing the
>> issue I'm seeing, but have not yet been able to prove it conclusively.
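
The index mismatch described above can be traced with a small standalone model. This is only a sketch, not the code in src/rep.cpp: selection_t, select_as_written and the ready vector (which stands in for in_pipes [i]->read () succeeding) are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//  Result of one receive attempt in this model (hypothetical type).
struct selection_t {
    int read_from;  //  index of the in-pipe that yielded a message, -1 if none
    int reply_to;   //  index used to pick the out-pipe for the reply, -1 if none
};

//  Model of the loop as quoted: 'current' is advanced (and possibly
//  wrapped) before it is used to choose the reply pipe.
//  ready [i] stands in for in_pipes [i]->read () returning true.
selection_t select_as_written (int active, int &current,
                               const std::vector<bool> &ready)
{
    assert (active > 0 && ready.size () == static_cast<std::size_t> (active));
    for (int count = active; count != 0; count--) {
        int read_from = current;
        bool fetched = ready [current];
        current++;
        if (current >= active)
            current = 0;
        if (fetched)
            return selection_t {read_from, current};
    }
    //  No pipe had a message.
    return selection_t {-1, -1};
}
```

With three pipes, current starting at 0 and only the last pipe ready, this model reads from index 2 but picks index 0 for the reply, which matches the wrap-around case described above. In the model the reply index is always one past the index read, so with more than one active pipe the mismatch is not limited to the wrap-around case.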
>>
>> In any event, changing that code to:
>>
>> //  Round-robin over the pipes to get next message.
>> for (int count = active; count != 0; count--) {
>>    bool fetched = in_pipes [current]->read (msg_);
>>    if (fetched) {
>>        reply_pipe = out_pipes [current];
>>        waiting_for_reply = true;
>>    }
>>    current++;
>>    if (current >= active)
>>        current = 0;
>>    if (fetched)
>>        return 0;
>> }
>>
>> fixes the problem while preserving the round-robin ordering.
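
As a rough check of the reordered loop, the same kind of standalone model can confirm that the reply index now matches the index read and that successive calls still visit the pipes in round-robin order. Again a sketch under stated assumptions: reply_choice_t, select_reordered and the ready vector are hypothetical stand-ins, not part of 0MQ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

//  Result of one receive attempt in this model (hypothetical type).
struct reply_choice_t {
    int read_from;  //  index of the in-pipe that yielded a message, -1 if none
    int reply_to;   //  index used to pick the out-pipe for the reply, -1 if none
};

//  Model of the reordered loop: the reply index is captured before
//  'current' is advanced, and 'current' still moves on afterwards so
//  the next call starts at the following pipe.
//  ready [i] stands in for in_pipes [i]->read () returning true.
reply_choice_t select_reordered (int active, int &current,
                                 const std::vector<bool> &ready)
{
    assert (active > 0 && ready.size () == static_cast<std::size_t> (active));
    for (int count = active; count != 0; count--) {
        int read_from = current;
        int reply_to = -1;
        bool fetched = ready [current];
        if (fetched)
            reply_to = current;
        current++;
        if (current >= active)
            current = 0;
        if (fetched)
            return reply_choice_t {read_from, reply_to};
    }
    //  No pipe had a message.
    return reply_choice_t {-1, -1};
}
```

With three pipes that all have messages pending and current starting at 0, three successive calls read from indices 0, 1 and 2 in turn, each with reply_to equal to read_from, and current wraps back to 0 after the third call.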
>>
>> Regards,
>> Ben
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev



