[zeromq-dev] XPUB(or PUB) aborted by assertion failure (erased == 1 on mtrie.cpp)

Heungsub Lee sub at subl.ee
Mon Dec 12 20:03:04 CET 2016


Did my email go through?  I sent it yesterday, but I can't see it in
the archive.

On Mon, Dec 12, 2016 at 3:55 AM, Heungsub Lee <sub at subl.ee> wrote:

Hi folks, I'm Heungsub Lee.

I've been building a game server on top of ZeroMQ's Pub/Sub pattern.  I ran
into a critical problem with PUB/SUB sockets.  Sometimes my processes are
aborted by an assertion failure from ZeroMQ:

Assertion failed: erased == 1 (src/mtrie.cpp:297)

I'm using pyzmq-16.0.2 with libzmq-4.2.0.

In my case, a SUB socket binds to an address and then a PUB socket connects
to that address.  All PUB sockets and SUB sockets in a cluster connect to
each other.  They form a fully connected network among 500+ server
processes.

Each SUB socket frequently subscribes to and unsubscribes from topics.  The
number of topics in a cluster keeps growing after the cluster starts.  At one
point when I checked, a single SUB socket was subscribed to 3000+ topics.

I have seen 3 abort scenarios:

   1. When a SUB socket closes, some PUB sockets abort.  Perhaps it is a
   concurrency bug in the pyzmq that I'm using.  I reproduced it with a test
   case
   <https://github.com/what-studio/pyzmq/commit/5159ee563a571daccf1285aa74917bb875c774a7>,
   and I think I fixed it
   <https://github.com/what-studio/pyzmq/commit/94ab0a88dbef7d0f33b34cdf18e55487735dde01>.
   2. When a PUB socket joins a mature cluster, it aborts almost
   immediately.  A mature cluster means one that already has many subscribed
   topics and many subscribe/unsubscribe synchronization messages.
   3. A PUB socket on a weak host machine (e.g. AWS EC2 t2.medium)
   sometimes aborts.  I'm not sure what the cause is.

Unfortunately, I couldn't reproduce the last 2 scenarios with a small test
case, but my servers still keep aborting.

The assertion failure occurs when a PUB socket tries to remove a pipe to a
SUB socket but there's no matching pipe.  I'm wondering whether ZeroMQ
guarantees the consistency of subscribe/unsubscribe synchronization between
busy PUB and SUB sockets.
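
To illustrate the condition described above, here is a toy Python model of a
publisher-side table that maps each topic to the pipes subscribed to it.  It
is only a sketch: it is not libzmq's mtrie code, and the class, method, topic
and pipe names are hypothetical.  The assumption it encodes is that removing
a pipe should erase exactly one entry, so an unsubscribe with no matching
subscription erases nothing, which is the case that `erased == 1` rejects.

```python
class ToySubscriptionTable:
    """Toy stand-in for a publisher's topic -> pipes table (not libzmq code)."""

    def __init__(self):
        self._pipes_by_topic = {}

    def add(self, topic, pipe):
        # Record that `pipe` is subscribed to `topic`.
        self._pipes_by_topic.setdefault(topic, set()).add(pipe)

    def remove(self, topic, pipe):
        # Return the number of entries erased: 1 if the pipe was
        # registered for the topic, 0 if there was nothing to remove.
        pipes = self._pipes_by_topic.get(topic)
        if pipes is None or pipe not in pipes:
            return 0
        pipes.remove(pipe)
        if not pipes:
            del self._pipes_by_topic[topic]
        return 1


table = ToySubscriptionTable()
table.add(b"room.1", "pipe-A")

# First removal matches the earlier subscription.
first = table.remove(b"room.1", "pipe-A")
# Second removal has no matching entry; an `erased == 1` check would fail here.
second = table.remove(b"room.1", "pipe-A")
```

In this model the second call returns 0 instead of 1.  Whether the real
aborts come from a duplicated or reordered unsubscribe, or from something
else entirely, is not established by this sketch.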

Regards,
Heungsub