[zeromq-dev] old issue

josh knox cz82mak at gmail.com
Wed Apr 20 23:29:25 CEST 2016


When I was having issues (using a gnarly pre-existing code base), I looked
at where each socket call was coming from and verified that each socket was
only being used on one thread. Basically, I just wrote a message to the
console for each socket call, printed the thread ID, and then analyzed the
output.
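
A minimal sketch of that kind of per-call logging, in C++. SocketCallLog is
a hypothetical helper, not part of libzmq or the original code base; it
assumes one instance is created per socket and that record() is called by
hand next to each socket call.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <set>
#include <sstream>
#include <thread>

// Hypothetical helper (not part of libzmq): create one per socket and call
// record() next to every call that uses that socket.
class SocketCallLog {
public:
    void record(const char* call) {
        assert(call != nullptr);
        // Build the whole line first so concurrent output is not interleaved.
        std::ostringstream line;
        line << "socket call=" << call
             << " thread=" << std::this_thread::get_id() << '\n';
        std::lock_guard<std::mutex> lock(mutex_);
        threads_.insert(std::this_thread::get_id());
        std::cerr << line.str();
    }

    // More than one distinct thread means the socket is being shared.
    std::size_t distinct_threads() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return threads_.size();
    }

private:
    mutable std::mutex mutex_;            // protects threads_ and the log output
    std::set<std::thread::id> threads_;   // every thread that has used the socket
};
```

If distinct_threads() ever returns more than 1 for a socket, the printed
lines show which calls came from which thread.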

In cases where refactoring to limit each socket to one thread would be
problematic, I was able to use a mutex to allow exclusive access. This
worked for me since the locking had no performance implications in my case.
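
A rough sketch of the mutex approach, in C++. LockedSocket and with_lock are
hypothetical names, not libzmq API; the sketch assumes every use of the
socket in the application goes through this wrapper, since any call that
bypasses it is not protected.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

// Hypothetical wrapper (not part of libzmq): funnels every use of one socket
// through a single mutex so only one thread touches it at a time.
template <typename Socket>
class LockedSocket {
public:
    explicit LockedSocket(Socket* socket) : socket_(socket) {
        assert(socket_ != nullptr);
    }

    // Runs fn(socket) while holding the lock and returns whatever fn returns.
    template <typename Fn>
    auto with_lock(Fn&& fn)
        -> decltype(std::forward<Fn>(fn)(std::declval<Socket&>())) {
        std::lock_guard<std::mutex> lock(mutex_);
        return std::forward<Fn>(fn)(*socket_);
    }

private:
    std::mutex mutex_;   // serializes all access to *socket_
    Socket* socket_;     // not owned; must outlive this wrapper
};
```

The real socket calls (for example zmq_send or zmq_recv) would go inside the
callable passed to with_lock. Holding the lock for the whole call serializes
the threads, so this only suits code where that cost is acceptable.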

If you're not certain, it might be worth confirming that things are indeed
thread safe.

FWIW, here's the thread where I muddled through this stuff previously:

http://lists.zeromq.org/pipermail/zeromq-dev/2015-December/029445.html

My crashes were also happening in encoder.hpp.

Josh



On Wed, Apr 20, 2016 at 4:58 PM, Joshua Strickon <strickon at media.mit.edu>
wrote:

> It's mostly single threaded but there could be multiple threads for
> different modules and dlls that it uses.  It is a bit of a mess and I don’t
> think the original developer fully tested it in the production
> environment.  I was hoping it would be something that upgrading to a later
> version of zmq addresses without having to dig into the application code.
>
> Thanks
>
> Josh
>
> On Apr 20, 2016, at 4:52 PM, josh knox <cz82mak at gmail.com> wrote:
>
> Hi Josh,
>
> Is your app multi-threaded? Could there be more than one thread hitting
> the socket?
>
> The times that I've had random memory errors with zmq were due to multiple
> threads using a socket.
>
> In my case, either isolating 1 thread per socket, or using other thread
> synchronization to prevent concurrent socket use has solved those issues for
> me.
>
>
> Josh
>
> On Wed, Apr 20, 2016 at 4:34 PM, Joshua Strickon <strickon at media.mit.edu>
> wrote:
>
>> I know this is old.  I am working on getting an old project up and
>> running for a client who
>> built it on 2.0.2 and we are seeing these same errors.  We are getting
>> access violation errors
>> and the app is crashing randomly.  The windows dump files are pointing to
>> these same lines of code as
>> described below.    What was the resolution on this issue?
>>
>> thanks
>>
>> Josh
>>
>> From: Martin Sustrik <sustrik <at> 250bpm.com>
>> Subject: Re: frequent ZeroMQ crashes - how to diagnose?
>> <http://news.gmane.org/find-root.php?message_id=4C1C6BF7.6080006%40250bpm.com>
>> Newsgroups: gmane.network.zeromq.devel
>> <http://news.gmane.org/gmane.network.zeromq.devel>
>> Date: 2010-06-19 07:04:23 GMT
>>
>> Nick,
>>
>> > ZeroMQ crashed today.
>> >
>> > This is a Win32 build of both ZMQ and myApp.
>> > myApp was running fine with several thousand messages, when the memcpy
>> > code line below threw the following exception.
>> >
>> > "Unhandled exception at 0x6404edd6 (msvcr90d.dll) in myApp.exe:
>> > 0xC0000005: Access violation reading location 0xfeeefeee."
>> >
>> > debugging shows the following values:
>> > -		buffer	0x00d9b570 "%"	unsigned char *
>> > 		pos	2	unsigned int
>> > +		write_pos	0xfeeefeee <Bad Ptr>	unsigned char *
>> > 		to_copy	8190	unsigned int
>> >
>> > looks like a bad pointer.
>> >
>> > encoder.hpp
>> >
>> >                 //  If there are no data in the buffer yet and we are able to
>> >                 //  fill whole buffer in a single go, let's use zero-copy.
>> >                 //  There's no disadvantage to it as we cannot stuck multiple
>> >                 //  messages into the buffer anyway. Note that subsequent
>> >                 //  write(s) are non-blocking, thus each single write writes
>> >                 //  at most SO_SNDBUF bytes at once not depending on how large
>> >                 //  is the chunk returned from here.
>> >                 //  As a consequence, large messages being sent won't block
>> >                 //  other engines running in the same I/O thread for excessive
>> >                 //  amounts of time.
>> >                 if (!pos && !*data_ && to_write >= buffersize) {
>> >                     *data_ = write_pos;
>> >                     *size_ = to_write;
>> >                     write_pos = NULL;
>> >                     to_write = 0;
>> >                     return;
>> >                 }
>> >
>> >                 //  Copy data to the buffer. If the buffer is full, return.
>> >                 size_t to_copy = std::min (to_write, buffersize - pos);
>> > =======>        memcpy (buffer + pos, write_pos, to_copy);
>> >                 pos += to_copy;
>> >                 write_pos += to_copy;
>> >                 to_write -= to_copy;
>> >                 if (pos == buffersize) {
>> >                     *data_ = buffer;
>> >                     *size_ = pos;
>> >                     return;
>> >                 }
>>
>> It looks like a memory overwrite either in 0MQ or the application. Do
>> you have a test program to reproduce the problem?
>>
>> > Let me know what the error was so that I can fix it in the trunk.
>>
>> Have you managed to find out what the error code is?
>>
>> Martin
>>
>>
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>