[zeromq-dev] zmq crash when using PGM

纪明 jiming at yafco.com
Wed Aug 15 02:55:01 CEST 2018



On 2018-08-14 18:40, 纪明 wrote:
>
>
>
> On 2018-08-14 17:49, Luca Boccassi wrote:
>> On Mon, 2018-08-13 at 14:52 +0800, 纪明 wrote:
>>> Hi all:
>>>
>>>       We are using ZMQ to do some multicast work. The service keeps
>>> crashing, and we found that it has something to do with pgm_receiver.
>>>
>>>       Specifically, when zmq::pgm_receiver_t::restart_input()
>>> receives data, it calls the decoder to decode the message. On line
>>> 132, it checks whether the message size (insize) is greater than
>>> zero. If so, it calls process_input() to decode the message. However,
>>> inpos can be null even when insize is greater than zero. When this
>>> happens, zmq crashes when memcpy accesses the memory that inpos
>>> points to. This looks like a threading issue to us.
>>>
>>>       We would really appreciate it if anyone familiar with this part
>>> of zmq could point out a solution. We are using zmq in a real-time
>>> environment, where an occasional dropped message is more acceptable
>>> than crashing the service. We tried changing the source code
>>> slightly, from "if (insize > 0)" to "if (insize > 0 && inpos)", but
>>> that caused other problems.
>>>
>>> Thanks a lot in advance.
>>> Ming
>> Are you using a socket from multiple threads by any chance?
One thing we want to check is whether pgm_receiver_t::in_event() and
pgm_receiver_t::restart_input() could be called simultaneously. If so,
they both change the values of insize and inpos, which would cause problems.
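
To make the concern concrete, here is a minimal sketch of the invariant we think is being violated: pending bytes (insize > 0) should always come with a non-null inpos. This is not libzmq code. The struct input_state_t and its members assign, clear, consume and consistent are hypothetical names used only for illustration, and the sketch assumes single-threaded use.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the inpos/insize pair; not taken from libzmq.
struct input_state_t
{
    const unsigned char *inpos = nullptr;
    std::size_t insize = 0;

    // Update both fields in one place so they cannot drift apart.
    void assign (const unsigned char *data, std::size_t size)
    {
        inpos = size > 0 ? data : nullptr;
        insize = data != nullptr ? size : 0;
    }

    void clear () { assign (nullptr, 0); }

    // Advance past n consumed bytes; reset the state once it is drained.
    void consume (std::size_t n)
    {
        assert (n <= insize);
        if (n == insize) {
            clear ();
        } else {
            inpos += n;
            insize -= n;
        }
    }

    // The invariant: there are pending bytes exactly when inpos is non-null.
    bool consistent () const
    {
        return (insize == 0) == (inpos == nullptr);
    }
};
```

If two code paths wrote inpos and insize separately and could interleave, a check like consistent() placed at the top of each path would show which one first observes the broken state.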

> No, we only use a socket with the same IP in one thread. We
> suspect there is a threading issue inside zmq that causes inpos to
> become null unexpectedly. We made a quick fix to the process_input
> function, and the change seems to keep our system from crashing. We are
> worried that a message could now be processed only partially, which
> would also cause trouble for us. The change we made is:
>
> int zmq::pgm_receiver_t::process_input (v1_decoder_t *decoder)
> {
>     zmq_assert (session != NULL);
>
>     // Change that seems to prevent crashing: bail out if inpos is null.
>     if (inpos == nullptr)
>         return -1;
>
>     while (insize > 0) {
>         size_t n = 0;
>         int rc = decoder->decode (inpos, insize, n);
>         if (rc == -1)
>             return -1;
>         inpos += n;
>         insize -= n;
>         if (rc == 0)
>             break;
>         rc = session->push_msg (decoder->msg ());
>         if (rc == -1) {
>             errno_assert (errno == EAGAIN);
>             return -1;
>         }
>     }
>     return 0;
> }
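
On the partial-message worry, the following sketch shows one way to reason about it outside of zmq. It uses a hypothetical toy_decoder_t (a 1-byte length prefix followed by the payload) that keeps partial state between calls, and a free function process_input with the same loop shape as above. Whether the real v1_decoder_t behaves this way is an assumption that would need checking against the libzmq source. Note that this guard only rejects the inconsistent case (insize > 0 with a null inpos), which is slightly narrower than the change quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy decoder, not libzmq's v1_decoder_t.
// Wire format: one length byte, then that many payload bytes.
class toy_decoder_t
{
  public:
    // Returns 1 when a full message is ready, 0 when more input is needed.
    int decode (const unsigned char *data, std::size_t size,
                std::size_t &consumed)
    {
        consumed = 0;
        if (_done) { // previous message was handed out; start a new one
            _body.clear ();
            _have_len = false;
            _done = false;
        }
        while (consumed < size) {
            if (!_have_len) {
                _need = data[consumed++];
                _have_len = true;
            } else if (_body.size () < _need) {
                _body.push_back (static_cast<char> (data[consumed++]));
            }
            if (_have_len && _body.size () == _need) {
                _done = true;
                return 1;
            }
        }
        return 0;
    }

    const std::string &msg () const { return _body; }

  private:
    std::string _body;
    std::size_t _need = 0;
    bool _have_len = false;
    bool _done = false;
};

// Same loop shape as the quoted change; out stands in for session->push_msg.
int process_input (toy_decoder_t &decoder, const unsigned char *&inpos,
                   std::size_t &insize, std::vector<std::string> &out)
{
    if (insize > 0 && inpos == nullptr)
        return -1; // inconsistent state: decode nothing, consume nothing
    while (insize > 0) {
        std::size_t n = 0;
        const int rc = decoder.decode (inpos, insize, n);
        assert (n <= insize);
        inpos += n;
        insize -= n;
        if (rc == 0)
            break; // message incomplete; the decoder keeps the partial bytes
        out.push_back (decoder.msg ());
    }
    return 0;
}
```

In this toy model a message split across two buffers is still delivered whole, because the partial bytes live in the decoder rather than in inpos. The early return leaves insize untouched, so the caller has to decide whether to drop those bytes or retry; if it drops them in the middle of a message, the decoder's partial state would no longer match the stream.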
>
>>
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
>
>


