[zeromq-dev] zmq crash when using PGM
纪明
jiming at yafco.com
Wed Aug 15 02:55:01 CEST 2018
On 2018-08-14 18:40, 纪明 wrote:
>
>
>
> On 2018-08-14 17:49, Luca Boccassi wrote:
>> On Mon, 2018-08-13 at 14:52 +0800, 纪明 wrote:
>>> Hi all:
>>>
>>> We are using ZMQ to do some multicast work. The service keeps
>>> crashing, and we found it has something to do with pgm_receiver.
>>>
>>> Specifically, when zmq::pgm_receiver_t::restart_input() receives
>>> some data, it calls the decoder to decode the message. On line 132,
>>> it checks whether insize is greater than zero. If so, it calls
>>> process_input() to decode the message. However, inpos can be null
>>> even when insize is greater than zero. When this happens, zmq
>>> crashes in the memcpy call that accesses the memory inpos points to.
>>> This looks like a threading issue to us.
>>>
>>> We would really appreciate it if anyone familiar with zmq could
>>> point out a solution. We are using zmq in a real-time environment,
>>> so an occasional message drop is more acceptable than crashing the
>>> service. We tried changing the source code slightly, from
>>> "if (insize > 0)" to "if (insize > 0 && inpos)", but that caused
>>> other problems.
>>>
>>> Thanks a lot in advance.
>>> Ming
>> Are you using a socket from multiple threads by any chance?
One thing we want to check is whether pgm_receiver_t::in_event() and
pgm_receiver_t::restart_input() could be called simultaneously. If so,
they both change the values of insize and inpos, and this would cause
problems.
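
To make the concern concrete, here is a minimal sketch of the invariant at stake: a position/size pair that must only change together. It is not libzmq code. The names cursor_t and guarded_cursor_t are hypothetical stand-ins for inpos and insize, and the mutex only illustrates what would be needed if two threads really did touch the pair; whether that can happen inside pgm_receiver_t is exactly the open question.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for the receiver's read cursor (not libzmq code).
struct cursor_t
{
    const unsigned char *pos = nullptr; // plays the role of inpos
    std::size_t size = 0;               // plays the role of insize
};

// Keeps pos and size consistent by changing them only under one lock.
class guarded_cursor_t
{
  public:
    // Install a new buffer; an empty or null buffer clears both fields.
    void assign (const unsigned char *data, std::size_t len)
    {
        std::lock_guard<std::mutex> lock (_mutex);
        if (data == nullptr || len == 0) {
            _cursor.pos = nullptr;
            _cursor.size = 0;
        } else {
            _cursor.pos = data;
            _cursor.size = len;
        }
    }

    // Consume up to n bytes; returns the number actually consumed.
    std::size_t consume (std::size_t n)
    {
        std::lock_guard<std::mutex> lock (_mutex);
        if (n > _cursor.size)
            n = _cursor.size;
        _cursor.size -= n;
        if (_cursor.size == 0)
            _cursor.pos = nullptr;
        else
            _cursor.pos += n;
        return n;
    }

    // Invariant: a non-zero size implies a non-null position.
    bool consistent ()
    {
        std::lock_guard<std::mutex> lock (_mutex);
        return _cursor.size == 0 || _cursor.pos != nullptr;
    }

    // Copy of the current state, taken under the lock.
    cursor_t snapshot ()
    {
        std::lock_guard<std::mutex> lock (_mutex);
        return _cursor;
    }

  private:
    std::mutex _mutex;
    cursor_t _cursor;
};
```

If both fields were instead written by two callers without any synchronization, one caller could observe a new size together with an old (possibly null) position, which matches the symptom we see.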
> No, we are only using the socket with the same IP in one thread. We
> suspect there is a threading issue inside zmq that causes inpos to
> become null unexpectedly. We applied a quick fix to the process_input
> function, and the change seems to keep our system from crashing. We
> are worried that a message could now be processed partially, which
> would also cause us trouble. The change we made is:
>
> int zmq::pgm_receiver_t::process_input (v1_decoder_t *decoder)
> {
>     zmq_assert (session != NULL);
>
>     // Change that seems to prevent crashing
>     const void* pTmp = static_cast<const void*>(inpos);
>     if (pTmp == nullptr) {
>         return -1;
>     }
>     else {
>         while (insize > 0) {
>             size_t n = 0;
>             int rc = decoder->decode (inpos, insize, n);
>             if (rc == -1)
>                 return -1;
>             inpos += n;
>             insize -= n;
>             if (rc == 0)
>                 break;
>             rc = session->push_msg (decoder->msg ());
>             if (rc == -1) {
>                 errno_assert (errno == EAGAIN);
>                 return -1;
>             }
>         }
>     }
>     return 0;
> }
>
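
On the partial-message worry, here is a minimal sketch of a guard that reports the inconsistent state explicitly and only ever hands over complete frames. It is not libzmq code: drain(), drain_result, and the toy framing (one length byte followed by the payload) are hypothetical and chosen only for illustration. The real v1_decoder_t keeps its own state across calls, which this sketch does not model.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: distinguishes a clean pass from a dropped buffer.
enum class drain_result
{
    ok,
    dropped
};

// Toy framing (an assumption, not the ZMTP/PGM wire format):
// one length byte followed by that many payload bytes.
drain_result drain (const unsigned char *&pos,
                    std::size_t &size,
                    std::vector<std::string> &out)
{
    // Inconsistent state: bytes are claimed but there is nowhere to read from.
    // Discard the stale count so the caller does not retry with it.
    if (size > 0 && pos == nullptr) {
        size = 0;
        return drain_result::dropped;
    }

    while (size > 0) {
        const std::size_t len = pos[0];
        // Incomplete frame: leave it in place until more data arrives.
        if (size < 1 + len)
            break;
        out.emplace_back (reinterpret_cast<const char *> (pos + 1), len);
        pos += 1 + len;
        size -= 1 + len;
    }
    return drain_result::ok;
}
```

In this sketch a frame is either delivered whole or not at all, and a dropped buffer is visible to the caller instead of looking like an ordinary failure. Whether the same property holds for the patched process_input depends on how the decoder resynchronizes after skipped bytes, which we have not verified.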
>>
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> https://lists.zeromq.org/mailman/listinfo/zeromq-dev