[zeromq-dev] rc == 0 (./zmq/mutex.hpp:94)

Aamir M aamirjvm at gmail.com
Mon Jul 13 18:15:54 CEST 2009


Hello,

I made some further modifications to mutex.hpp. I changed the constructor to:

        inline mutex_t ()
        {
            int rc = pthread_mutex_init (&mutex, NULL);
            printf("0MQ MUTEX INIT: %p\n", (void*)&mutex);
            if (rc)
                posix_assert (rc);
        }

And the lock function to:

        inline void lock ()
        {
            int rc = pthread_mutex_lock (&mutex);
            if (rc)
            {
                printf("0MQ LOCK: %p\n", (void*)&mutex);
                posix_assert (rc);
            }
        }
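A possible explanation for the word "Success" in my first report, assuming
the original assertion printed strerror (errno) rather than strerror (rc)
(I have not checked this against the 0.6.1 source): pthread functions
return the error number directly and are not required to set errno, so
errno can still be 0 when the call has failed. The sketch below provokes a
well-defined failure (EBUSY from a second trylock) and formats the return
code itself. describe_pthread_rc, trylock_twice and demo_trylock are
hypothetical names for illustration, not part of 0MQ.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helpers for illustration; none of these are part of 0MQ.

// Formats a pthread return code. The code is passed to strerror directly
// because pthread_* functions return the error number and are not
// required to set errno.
inline std::string describe_pthread_rc (int rc)
{
    return std::string (strerror (rc));
}

// Provokes a well-defined failure: a second trylock on a mutex that is
// already locked returns EBUSY as its return value.
inline int trylock_twice ()
{
    pthread_mutex_t m;
    int rc = pthread_mutex_init (&m, NULL);
    if (rc)
        return rc;
    rc = pthread_mutex_trylock (&m);        // first attempt acquires the mutex
    if (rc == 0) {
        rc = pthread_mutex_trylock (&m);    // second attempt fails with EBUSY
        pthread_mutex_unlock (&m);
    }
    pthread_mutex_destroy (&m);
    return rc;
}

// Usage: the failure is visible only in the return value.
inline void demo_trylock ()
{
    int rc = trylock_twice ();
    assert (rc == EBUSY);
    fprintf (stderr, "%s\n", describe_pthread_rc (rc).c_str ());
}
```

With posix_assert (rc) in the modified lock function, the message comes
from the return code, which would explain why the attached output ends
with "Invalid argument" instead of "Success".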


I've attached the output. Notice that the mutex at address 0x100c9bc0
(the one whose lock failed) appears never to have been initialized,
since that address does not appear anywhere else in the output. Could
lock be getting called before pthread_mutex_init has completed?
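
To tell a mutex that was never initialized apart from one that was
initialized and later destroyed, or one reached through a stale or
corrupted pointer, it may help to log destruction and every failing call
as well. This is only a minimal sketch, assuming a wrapper shaped like
mutex_t; traced_mutex_t, check, try_lock and demo_traced_mutex are
hypothetical names, not part of 0MQ.

```cpp
#include <pthread.h>
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for mutex_t in zmq/mutex.hpp; not the real class.
class traced_mutex_t
{
public:
    traced_mutex_t ()
    {
        int rc = pthread_mutex_init (&mutex, NULL);
        // stderr is unbuffered, so the line survives a later abort ().
        fprintf (stderr, "MUTEX INIT: %p\n", (void*) &mutex);
        check (rc, "INIT");
    }

    ~traced_mutex_t ()
    {
        fprintf (stderr, "MUTEX DESTROY: %p\n", (void*) &mutex);
        check (pthread_mutex_destroy (&mutex), "DESTROY");
    }

    void lock () { check (pthread_mutex_lock (&mutex), "LOCK"); }
    void unlock () { check (pthread_mutex_unlock (&mutex), "UNLOCK"); }

    // Returns true if the mutex was acquired, false if it was already locked.
    bool try_lock ()
    {
        int rc = pthread_mutex_trylock (&mutex);
        if (rc == EBUSY)
            return false;
        check (rc, "TRYLOCK");
        return true;
    }

private:
    // Prints the failing operation, the address and the error text, then aborts.
    void check (int rc, const char *op)
    {
        if (rc) {
            fprintf (stderr, "MUTEX %s FAILED: %p: %s\n",
                     op, (void*) &mutex, strerror (rc));
            abort ();
        }
    }

    pthread_mutex_t mutex;

    // Non-copyable: using a copy of a pthread_mutex_t is undefined.
    traced_mutex_t (const traced_mutex_t&);
    void operator = (const traced_mutex_t&);
};

inline void demo_traced_mutex ()
{
    traced_mutex_t m;           // logs MUTEX INIT
    m.lock ();
    assert (!m.try_lock ());    // already locked, so try_lock reports false
    m.unlock ();
}                               // logs MUTEX DESTROY when m goes out of scope
```

If a failing address shows up with a DESTROY line but no later INIT, the
mutex was used after its owner was destroyed. If it shows up with neither,
as with 0x100c9bc0 here, the lock call is probably reaching memory that
never held a constructed mutex_t.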

One thing I don't understand: I thought 0MQ uses lock-free queues, so
why is 0MQ calling mutex_lock so frequently? Is the mutex a member of
each 0MQ message?

Thanks,
Aamir


On Mon, Jul 13, 2009 at 11:32 AM, Martin Hurton<hurtonm at gmail.com> wrote:
> Hi Aamir,
>
> Please apply the attached patch to 0.6.1 tree and let us know what's
> printed when the assertion fails.
>
> Regards,
> Martin
>
> On Mon, Jul 13, 2009 at 4:36 PM, Aamir M<aamirjvm at gmail.com> wrote:
>> Hello,
>>
>> We have a somewhat large/complex multi-threaded program that makes
>> heavy use of 0MQ for both process-scope and network-scope messaging.
>> Recently we implemented some changes and started seeing the following
>> error:
>>
>> Success
>> rc == 0 (./zmq/mutex.hpp:94)
>> Aborted
>>
>> 0MQ is asserting on ./zmq/mutex.hpp:94 and aborting the program.
>> Before the 0MQ assert occurs, some other function is causing the word
>> "Success" to be printed onto the screen.
>>
>> What could be causing this problem? It is proving very difficult to
>> debug this error because I have no idea which line triggers the
>> problem. Like any other bug related to a multi-threaded race
>> condition, the difficulty is compounded by the fact that the error
>> only occurs SOME of the time (i.e. it cannot be deterministically
>> reproduced).
>>
>> Does anyone have any ideas on how to isolate the offending code? When
>> does 0MQ use this pthread mutex and how could this assert happen while
>> sending / receiving messages?
>>
>> We have been careful to make sure that threads never share the same
>> zmq_api object ... each thread has its own instance of zmq_api, so I
>> don't think this could be the problem.
>>
>> Thanks,
>> Aamir
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
>
-------------- next part --------------
[ RUN      ] zmq.test
0MQ MUTEX INIT: 0x10030aa8
0MQ MUTEX INIT: 0x10030b50
0MQ MUTEX INIT: 0x10030bc8
0MQ MUTEX INIT: 0x10030c40
0MQ MUTEX INIT: 0x10030cb8
0MQ MUTEX INIT: 0x10030d30
0MQ MUTEX INIT: 0x10030da8
0MQ MUTEX INIT: 0x10030e20
0MQ MUTEX INIT: 0x10030e98
0MQ MUTEX INIT: 0x10030f10
0MQ MUTEX INIT: 0x10030f88
0MQ MUTEX INIT: 0x10031000
0MQ MUTEX INIT: 0x10031078
0MQ MUTEX INIT: 0x100310f0
0MQ MUTEX INIT: 0x10031168
0MQ MUTEX INIT: 0x100311e0
0MQ MUTEX INIT: 0x10031258
0MQ MUTEX INIT: 0x100312d0
0MQ MUTEX INIT: 0x10031348
0MQ MUTEX INIT: 0x100313c0
0MQ MUTEX INIT: 0x10031438
0MQ MUTEX INIT: 0x100314b0
0MQ MUTEX INIT: 0x10031528
0MQ MUTEX INIT: 0x100315a0
0MQ MUTEX INIT: 0x10031618
0MQ MUTEX INIT: 0x10031690
0MQ MUTEX INIT: 0x10031708
0MQ MUTEX INIT: 0x10031780
0MQ MUTEX INIT: 0x100317f8
0MQ MUTEX INIT: 0x10031870
0MQ MUTEX INIT: 0x100318e8
0MQ MUTEX INIT: 0x10031960
0MQ MUTEX INIT: 0x100319d8
0MQ MUTEX INIT: 0x10031a50
0MQ MUTEX INIT: 0x10031ac8
0MQ MUTEX INIT: 0x10031b40
0MQ MUTEX INIT: 0x10031bb8
0MQ MUTEX INIT: 0x10031c30
0MQ MUTEX INIT: 0x10031ca8
0MQ MUTEX INIT: 0x10031d20
0MQ MUTEX INIT: 0x10031d98
0MQ MUTEX INIT: 0x10031e10
0MQ MUTEX INIT: 0x10031e88
0MQ MUTEX INIT: 0x10031f00
0MQ MUTEX INIT: 0x10031f78
0MQ MUTEX INIT: 0x10031ff0
0MQ MUTEX INIT: 0x10032068
0MQ MUTEX INIT: 0x100320e0
0MQ MUTEX INIT: 0x10032158
0MQ MUTEX INIT: 0x100321d0
0MQ MUTEX INIT: 0x10032248
0MQ MUTEX INIT: 0x100322c0
0MQ MUTEX INIT: 0x10032338
0MQ MUTEX INIT: 0x100323b0
0MQ MUTEX INIT: 0x10032428
0MQ MUTEX INIT: 0x100324a0
0MQ MUTEX INIT: 0x10032518
0MQ MUTEX INIT: 0x10032590
0MQ MUTEX INIT: 0x10032608
0MQ MUTEX INIT: 0x10032680
0MQ MUTEX INIT: 0x100326f8
0MQ MUTEX INIT: 0x10032770
0MQ MUTEX INIT: 0x100327e8
0MQ MUTEX INIT: 0x10032860
0MQ MUTEX INIT: 0x100328d8
0MQ MUTEX INIT: 0x10032950
0MQ MUTEX INIT: 0x100329c8
0MQ MUTEX INIT: 0x10032a40
0MQ MUTEX INIT: 0x10032ab8
0MQ MUTEX INIT: 0x10032b30
0MQ MUTEX INIT: 0x10032ba8
0MQ MUTEX INIT: 0x10032c20
0MQ MUTEX INIT: 0x10032c98
0MQ MUTEX INIT: 0x10032d10
0MQ MUTEX INIT: 0x10032d88
0MQ MUTEX INIT: 0x10032e00
0MQ MUTEX INIT: 0x10032e78
0MQ MUTEX INIT: 0x10032ef0
0MQ MUTEX INIT: 0x10032f68
0MQ MUTEX INIT: 0x10032fe0
0MQ MUTEX INIT: 0x10033058
0MQ MUTEX INIT: 0x100330d0
0MQ MUTEX INIT: 0x10040228
0MQ MUTEX INIT: 0x10041148
0MQ MUTEX INIT: 0x10041328
0MQ MUTEX INIT: 0x10045018
0MQ MUTEX INIT: 0x100459b8
0MQ MUTEX INIT: 0x10049438
0MQ MUTEX INIT: 0x10049530
0MQ MUTEX INIT: 0x100495a8
0MQ MUTEX INIT: 0x10049620
0MQ MUTEX INIT: 0x10049698
0MQ MUTEX INIT: 0x10049710
0MQ MUTEX INIT: 0x10049788
0MQ MUTEX INIT: 0x10049800
0MQ MUTEX INIT: 0x10049878
0MQ MUTEX INIT: 0x100498f0
0MQ MUTEX INIT: 0x10049968
0MQ MUTEX INIT: 0x100499e0
0MQ MUTEX INIT: 0x10049a58
0MQ MUTEX INIT: 0x10049ad0
0MQ MUTEX INIT: 0x10049b48
0MQ MUTEX INIT: 0x10049bc0
0MQ MUTEX INIT: 0x10049c38
0MQ MUTEX INIT: 0x10049cb0
0MQ MUTEX INIT: 0x10049d28
0MQ MUTEX INIT: 0x10049da0
0MQ MUTEX INIT: 0x10049e18
0MQ MUTEX INIT: 0x10049e90
0MQ MUTEX INIT: 0x10049f08
0MQ MUTEX INIT: 0x10049f80
0MQ MUTEX INIT: 0x10049ff8
0MQ MUTEX INIT: 0x1004a070
0MQ MUTEX INIT: 0x1004a0e8
0MQ MUTEX INIT: 0x1004a160
0MQ MUTEX INIT: 0x1004a1d8
0MQ MUTEX INIT: 0x1004a250
0MQ MUTEX INIT: 0x1004a2c8
0MQ MUTEX INIT: 0x1004a340
0MQ MUTEX INIT: 0x1004a3b8
0MQ MUTEX INIT: 0x1004a430
0MQ MUTEX INIT: 0x1004a4a8
0MQ MUTEX INIT: 0x1004a520
0MQ MUTEX INIT: 0x1004a598
0MQ MUTEX INIT: 0x1004a610
0MQ MUTEX INIT: 0x1004a688
0MQ MUTEX INIT: 0x1004a700
0MQ MUTEX INIT: 0x1004a778
0MQ MUTEX INIT: 0x1004a7f0
0MQ MUTEX INIT: 0x1004a868
0MQ MUTEX INIT: 0x1004a8e0
0MQ MUTEX INIT: 0x1004a958
0MQ MUTEX INIT: 0x1004a9d0
0MQ MUTEX INIT: 0x1004aa48
0MQ MUTEX INIT: 0x1004aac0
0MQ MUTEX INIT: 0x1004ab38
0MQ MUTEX INIT: 0x1004abb0
0MQ MUTEX INIT: 0x1004ac28
0MQ MUTEX INIT: 0x1004aca0
0MQ MUTEX INIT: 0x1004ad18
0MQ MUTEX INIT: 0x1004ad90
0MQ MUTEX INIT: 0x1004ae08
0MQ MUTEX INIT: 0x1004ae80
0MQ MUTEX INIT: 0x1004aef8
0MQ MUTEX INIT: 0x1004af70
0MQ MUTEX INIT: 0x1004afe8
0MQ MUTEX INIT: 0x1004b060
0MQ MUTEX INIT: 0x1004b0d8
0MQ MUTEX INIT: 0x1004b150
0MQ MUTEX INIT: 0x1004b1c8
0MQ MUTEX INIT: 0x1004b240
0MQ MUTEX INIT: 0x1004b2b8
0MQ MUTEX INIT: 0x10055c68
0MQ MUTEX INIT: 0x100562f8
0MQ MUTEX INIT: 0x10062248
0MQ MUTEX INIT: 0x10065718
0MQ MUTEX INIT: 0x10069f28
0MQ MUTEX INIT: 0x10071078
0MQ MUTEX INIT: 0x10074228
0MQ MUTEX INIT: 0x10074438
0MQ MUTEX INIT: 0x10080d18
0MQ MUTEX INIT: 0x1008d728
0MQ MUTEX INIT: 0x1008d918
0MQ MUTEX INIT: 0x10090b18
0MQ MUTEX INIT: 0x100940e8
0MQ MUTEX INIT: 0x100941c0
0MQ MUTEX INIT: 0x10094238
0MQ MUTEX INIT: 0x100942b0
0MQ MUTEX INIT: 0x10094328
0MQ MUTEX INIT: 0x100943a0
0MQ MUTEX INIT: 0x10094418
0MQ MUTEX INIT: 0x10094490
0MQ MUTEX INIT: 0x10094508
0MQ MUTEX INIT: 0x10094580
0MQ MUTEX INIT: 0x100945f8
0MQ MUTEX INIT: 0x10094670
0MQ MUTEX INIT: 0x100946e8
0MQ MUTEX INIT: 0x10094760
0MQ MUTEX INIT: 0x100947d8
0MQ MUTEX INIT: 0x10094850
0MQ MUTEX INIT: 0x100948c8
0MQ MUTEX INIT: 0x10094940
0MQ MUTEX INIT: 0x100949b8
0MQ MUTEX INIT: 0x10094a30
0MQ MUTEX INIT: 0x10094aa8
0MQ MUTEX INIT: 0x10094b20
0MQ MUTEX INIT: 0x10094b98
0MQ MUTEX INIT: 0x10094c10
0MQ MUTEX INIT: 0x10094c88
0MQ MUTEX INIT: 0x10094d00
0MQ MUTEX INIT: 0x10093f98
0MQ MUTEX INIT: 0x40010000988
0MQ MUTEX INIT: 0x100a1598
0MQ MUTEX INIT: 0x100a8aa8
0MQ MUTEX INIT: 0x100afdd8
0MQ MUTEX INIT: 0x100bb3e8
0MQ MUTEX INIT: 0x40010005248
0MQ MUTEX INIT: 0x100c6ad8
0MQ MUTEX INIT: 0x400100083e8
0MQ LOCK: 0x100c9bc0
Invalid argument (./zmq/mutex.hpp:100)
Aborted

