[zeromq-dev] Zmq Assert

Antonio Teixeira eagle.antonio at gmail.com
Thu Apr 19 10:39:20 CEST 2012


Good morning all :)
I have found the problem and a workaround (not a real solution for it).

I can't explain the reason, but using the logging module apparently causes
the parent to abort (SIGABRT), although the application (the children) still
runs on the same logging module perfectly fine using gevent.
I know that logging uses threading with some locking, but I can't pinpoint
why this aborts zmq.
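
The thread does not establish the cause. A ZeroMQ context does own background I/O threads, and threads are not carried across a fork, so one thing that could be tried is to make sure each process builds its own context and socket the first time it logs. The block below is only a sketch of that idea, under the assumption that the abort is related to state crossing the fork boundary. PerProcessHandler and make_sink are hypothetical names, and the sink here is a plain list so the sketch runs without pyzmq. In a real handler the factory could create a fresh zmq.Context() and PUB socket.

```python
import logging
import os


class PerProcessHandler(logging.Handler):
    """Hypothetical handler that rebuilds its sink whenever the PID changes."""

    def __init__(self, make_sink):
        logging.Handler.__init__(self)
        # make_sink is an assumed factory: it returns a callable taking one string.
        # With pyzmq it could build a new zmq.Context() and PUB socket instead.
        self._make_sink = make_sink
        self._sink = None
        self._pid = None

    def _sink_for_this_process(self):
        pid = os.getpid()
        if self._sink is None or self._pid != pid:
            # First emit in this process, or first emit after a fork:
            # build a new sink rather than reuse the inherited one.
            self._sink = self._make_sink()
            self._pid = pid
        return self._sink

    def emit(self, record):
        try:
            self._sink_for_this_process()(self.format(record))
        except Exception:
            self.handleError(record)


created = []  # one entry per sink built, for illustration only
lines = []    # stands in for the wire


def make_list_sink():
    created.append(os.getpid())
    return lines.append


demo_log = logging.getLogger("per_process_demo")
demo_log.setLevel(logging.INFO)
demo_log.propagate = False
demo_handler = PerProcessHandler(make_list_sink)
demo_log.addHandler(demo_handler)
demo_log.info("first")
demo_log.info("second")
```

Both messages go through a single sink, because the PID did not change between them. A child started after this point would build its own sink on its first emit.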

Workaround:
Simply don't log in the parent. I also made a small change to the logging
module:
http://pastebin.com/E36EixYR

It supports async / sync operation and shared contexts for log operations,
but there is still no improvement.
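
The "don't log on the parent" idea can be sketched with a standard logging filter. This is not the code from the pastebin above, just a minimal illustration. ChildOnlyFilter and ListHandler are hypothetical names, and it assumes the children are started through multiprocessing.Process, so the parent is the process that multiprocessing names "MainProcess".

```python
import logging
import multiprocessing


class ChildOnlyFilter(logging.Filter):
    """Hypothetical filter: drop every record emitted by the parent process."""

    def filter(self, record):
        # The process that started the program is named "MainProcess";
        # processes started via multiprocessing.Process get other names.
        return multiprocessing.current_process().name != "MainProcess"


class ListHandler(logging.Handler):
    """Stand-in for the real handler: collects messages in a list."""

    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


collector = ListHandler()
collector.addFilter(ChildOnlyFilter())

filter_log = logging.getLogger("child_only_demo")
filter_log.setLevel(logging.DEBUG)
filter_log.propagate = False
filter_log.addHandler(collector)

# Called from the parent, so the filter drops it before emit() runs.
filter_log.info("emitted from the parent")
```

Attaching the filter to the handler rather than to the logger keeps the calls in the parent code untouched, so they start working again as soon as the filter is removed.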

Anyway, if some of the devs want to take a look, feel free to mail me off
list, or if you want me to test a fix or something like that.
I'm leaving this as is and moving on to other parts of the application.

Regards
Antonio Teixeira



2012/4/16 Antonio Teixeira <eagle.antonio at gmail.com>

> Do you guys need more data? I will dedicate the afternoon to this problem
> :) so feel free to ask if you want me to experiment with anything.
>
>
>
> 2012/4/13 Antonio Teixeira <eagle.antonio at gmail.com>
>
>> For the pro personnel :)
>> I do have an strace, I don't know if it helps.
>>
>> Regards
>> A/T
>>
>>
>>
>> 2012/4/13 Antonio Teixeira <eagle.antonio at gmail.com>
>>
>>> Hello All.
>>>
>>> Recreating the problem is easy. The main issue is cutting it into parts,
>>> since the problem happens in our logging module, not in the application
>>> itself, and the logging module is based on the PUBHandler that pyzmq
>>> provides. But let's start:
>>>
>>> Python
>>> Python 2.6.5 (r265:79063, Apr 16 2010, 13:09:56)
>>>
>>> Gevent
>>> '1.0b2'
>>>
>>> PyZmq
>>> '2.1dev'
>>>
>>> ZMQ
>>> v2.2.0
>>>
>>> I'm using the zmq.green for the gevent compatible implementation.
>>>
>>> You can see a simple trace of the log here :
>>> http://pastebin.com/7W2KH82h
>>>
>>> Explanation
>>>
>>> *Parent*:
>>> Imports the logging module, this one:
>>> http://pastebin.com/hmbwUB1D
>>>
>>> You can see that it creates its own context on init.
>>>
>>> ** Calling It From The Parent
>>>
>>>         #
>>>         # Create A Network Context Shared , We already have a dedicated CPU socket.
>>>         #
>>>
>>>         self.networkContext = None
>>>
>>>         #
>>>         # Load Logging
>>>         #
>>>
>>>         self.logger = Logger(loggingSocket=self.Logging_Socket, networkContext=self.networkContext)
>>>         self.log = self.logger.logger
>>>
>>> BTW loggingSocket is a tuple (ip, port)
>>>
>>> OK, the parent logs properly.
>>>
>>> We then ask for the server modules to be raised
>>> ("Starting To Raise Server Modules ..." in the log).
>>>
>>> This makes a simple call to multiprocessing.Process:
>>> from multiprocessing import cpu_count, Process
>>>
>>>             task_api = Process(target=TaskAPI, args=(
>>>                                                     self.TaskAPI_Socket,
>>>                                                     self.TaskAPI_IPC,
>>>                                                     self.TaskAPI_Pool,
>>>                                                     self.Logging_Socket))
>>>             task_api.start()
>>>
>>> Once again
>>> self.Logging_Socket is ('127.0.0.1', 2222) just to be sure :D
>>>
>>> OK, TaskAPI once again imports the logging module, this one:
>>> http://pastebin.com/hmbwUB1D
>>>
>>> It starts dumping logs and it works, as you can see in the log file.
>>>
>>> and ...
>>>
>>> [<Process(Process-1, started)>]
>>>
>>> Assertion failed: ok (mailbox.cpp:84)
>>> Aborted
>>>
>>> The parent aborts and dies.
>>> Trying to pinpoint it, I added
>>>             # Log
>>>             for i in range(1000):
>>>                 self.log.info('All The Server Modules Were Raised For Task Processing - %s.' % i)
>>> after I start TaskAPI.
>>>
>>> It should make 1000 entries, but just 8 are printed, since the parent
>>> aborts while printing the rest.
>>>
>>> But I found that if I delay / comment out all the
>>> self.log.debug
>>> self.log.error
>>> entries in the TaskAPI, the parent keeps going. So it appears to be
>>> caused not when the logging module starts and creates the zmq
>>> context, but when we "emit" a message in the logging module.
>>> If there is no logging, everything keeps running.
>>>
>>> Hope you guys can give me a hand.
>>>
>>> Regards
>>>
>>>
>>>
>>>
>>>
>>>
>>> 2012/4/13 Martin Hurton <hurtonm at gmail.com>
>>>
>>>> Hi Antonio, how difficult is it to reproduce this failure?
>>>>
>>>> - Martin
>>>>
>>>> On Thu, Apr 12, 2012 at 5:45 PM, Antonio Teixeira
>>>> <eagle.antonio at gmail.com> wrote:
>>>> > Hello Friends.
>>>> >
>>>> > I have been hitting a wall since I updated to the latest ZMQ and PyZmq:
>>>> >
>>>> > Assertion failed: ok (mailbox.cpp:84)
>>>> >
>>>> > I have tried everything I can remember. Probably this is a problem in
>>>> > Zmq and not PyZmq, but I wanted to get the opinion of a more
>>>> > experienced member of the community first.
>>>> >
>>>> > Scenario :
>>>> >
>>>> > The parent forks a child.
>>>> > Both run the exact same code. An exception happens in the parent, which
>>>> > terminates, but the child survives.
>>>> >
>>>> > This is a simple PUB / SUB where the PUB connects and the SUB binds.
>>>> > The side that crashes is the publisher.
>>>> >
>>>> > Regards
>>>> > A/T
>>>> >
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > zeromq-dev mailing list
>>>> > zeromq-dev at lists.zeromq.org
>>>> > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>> >
>>>>
>>>
>>>
>>
>