[zeromq-dev] Slow joiner syndrome solution for XSUB/XPUB broker based system
Chris Billington
chrisjbillington at gmail.com
Mon May 28 12:21:18 CEST 2018
I've handled this problem by avoiding using a PUB socket for the senders of
messages:
a) senders of messages send them on a PUSH socket and the broker forwards
from a PULL to an XPUB. This means that there is no slow joiner problem with
the senders starting up (PUSH won't drop messages), but has the downside
that the messages are *always* sent to the broker even if there are no
subscribers. They will instead be dropped by the XPUB if there are no
subscribers.
b) Subscribers request and wait for subscription confirmation messages from
the broker when they subscribe to a topic so calling code can be sure they
are subscribed before starting the senders.
See here for my Python project that implements this (the EventBroker and
Event classes):
https://bitbucket.org/cbillington/zprocess/src/default/zprocess/process_tree.py?at=default&fileviewer=file-view-default#process_tree.py-102
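
A minimal sketch of points a) and b) using pyzmq, in case it helps to see the
shape of the idea. This is not the zprocess code: the inproc endpoint names,
the SUBSCRIBED marker frame and the helper names (run_broker,
subscribe_and_wait, demo) are assumptions made for illustration. Everything
runs in one process over inproc so that it is self-contained; a real system
would use ipc or tcp endpoints and separate processes.

```python
import threading

import zmq

CONFIRM = b"SUBSCRIBED"  # hypothetical marker frame used for confirmations


def run_broker(pull, xpub, stop):
    """Forward PULL -> XPUB and confirm every subscription seen on the XPUB."""
    poller = zmq.Poller()
    poller.register(pull, zmq.POLLIN)
    poller.register(xpub, zmq.POLLIN)
    while not stop.is_set():
        events = dict(poller.poll(timeout=100))
        if xpub in events:
            msg = xpub.recv()
            if msg[:1] == b"\x01":  # subscribe request: 0x01 followed by topic
                # Publish the confirmation on the topic that was just subscribed.
                xpub.send_multipart([msg[1:], CONFIRM])
        if pull in events:
            # First frame is the topic; XPUB drops it if nobody is subscribed.
            xpub.send_multipart(pull.recv_multipart())
    pull.close(linger=0)
    xpub.close(linger=0)


def subscribe_and_wait(sub, topic, timeout_ms=2000):
    """Subscribe, then block until the broker confirms the subscription."""
    sub.setsockopt(zmq.SUBSCRIBE, topic)
    while True:
        if not sub.poll(timeout_ms):
            raise TimeoutError("no subscription confirmation from broker")
        # Data messages arriving before the confirmation are skipped here.
        if sub.recv_multipart() == [topic, CONFIRM]:
            return


def demo():
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)
    pull.bind("inproc://senders")
    xpub = ctx.socket(zmq.XPUB)
    # Pass every subscribe request up to the broker, not only the first one
    # per topic, so that each subscriber gets its own confirmation.
    xpub.setsockopt(zmq.XPUB_VERBOSE, 1)
    xpub.bind("inproc://subscribers")
    stop = threading.Event()
    broker = threading.Thread(
        target=run_broker, args=(pull, xpub, stop), daemon=True
    )
    broker.start()

    sub = ctx.socket(zmq.SUB)
    sub.connect("inproc://subscribers")
    subscribe_and_wait(sub, b"TEST")  # only now is it safe to start senders

    push = ctx.socket(zmq.PUSH)  # short-lived sender
    push.connect("inproc://senders")
    push.send_multipart([b"TEST", b"hello"])

    if not sub.poll(2000):
        raise TimeoutError("message was not delivered")
    received = sub.recv_multipart()

    push.close()
    sub.close()
    stop.set()
    broker.join()
    ctx.term()
    return received


if __name__ == "__main__":
    print(demo())
```

Confirming on the subscribed topic itself keeps the sketch short, but it means
every existing subscriber to that topic also sees the confirmation frame and
has to ignore it. A separate confirmation topic would avoid that at the cost
of a little more bookkeeping.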
On Mon, May 28, 2018 at 7:40 PM, Tomer Eliyahu <tomereliyahu1 at gmail.com>
wrote:
> Hi Gyorgy,
>
> Thank you - but assuming the subscriber connect and subscribe happen long
> before the publisher starts, is there still a risk for the slow joiner
> problem?
>
> Assume the following flow:
> broker:
> zmq_bind(frontend, "ipc:///tmp/publishers");
> zmq_bind(backend, "ipc:///tmp/subscribers");
> zmq_proxy(frontend, backend, NULL);
>
> <wait 2 seconds and start subscriber process>
>
> subscriber:
> zmq_connect(sub_socket, "ipc:///tmp/subscribers");
> <subscribe to "TEST" topic>
> <receive message from sub_socket - blocking>
>
> <wait 2 seconds and start publisher process>
>
> publisher:
> zmq_connect(pub_socket, "ipc:///tmp/publishers");
> zmq_connect(sub_socket, "ipc:///tmp/subscribers");
> <subscribe to "SYNC" topic>
> <sync - send DUMMY messages until received>
> <unsubscribe from "SYNC" topic>
> <send message with "TEST" topic through pub_socket>
> <terminate>
>
> Bottom line - is there some sort of synchronization done under the hood by
> ZMQ when the publisher first sends a message with the topic to which the
> subscriber subscribed? Or is this all handled between the broker and the
> subscriber?
>
> Thanks,
> Tomer
>
> On Mon, May 28, 2018 at 12:23 PM, Gyorgy Szekely <hoditohod at gmail.com>
> wrote:
>
>> Hi Tomer
>> As far as I know, the message from the publisher will reach the broker.
>> According to the docs, the PUB socket drops messages in mute state (HWM
>> reached), which is not the case here. The message will be sent as soon as
>> the connection is established, and socket termination blocks until the
>> send is complete, unless you set linger to zero.
>>
>> The slow joiner problem means that subscriptions may not be active by the
>> time the publisher sends the message, either because the subscriber is not
>> yet running or because the subscribe calls themselves are asynchronous (by
>> the time setsockopt(SUBSCRIBE) returns, the broker is not yet aware of
>> the subscription). The zmq guide shows mitigations for this problem in the
>> Advanced Publish Subscribe chapter.
>>
>> Regards,
>> Gyorgy
>>
>> On Mon, May 28, 2018 at 11:06 AM, Tomer Eliyahu <tomereliyahu1 at gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>>
>>>
>>> I know this topic was probably discussed before, but I couldn't find a
>>> proper solution, so I implemented something a bit different. I'm not sure
>>> if this solves all pitfalls; I'll be grateful for comments.
>>>
>>>
>>>
>>> We have a system with an XPUB-XSUB broker running as a separate process
>>> in the system (binds frontend to ipc:///tmp/publishers and backend to
>>> ipc:///tmp/subscribers).
>>>
>>>
>>>
>>> Clients of the broker have both a SUB socket for receiving messages and a
>>> PUB socket for sending messages. When a client boots, it connects both its
>>> PUB and SUB sockets to the broker's endpoints, and subscribes to the topic
>>> of interest.
>>>
>>>
>>> For the sake of simplicity, let's assume we have only the broker, a
>>> publisher and a subscriber process in the system:
>>>
>>> We make sure that the broker process starts first, then a subscriber
>>> which connects and subscribes to the topic, and only then start the
>>> publisher. The publisher then sends a single message and terminates.
>>>
>>> Obviously, the message is lost due to the slow joiner syndrome - I
>>> assume the reason is that the publisher process's zmq_connect() call is
>>> asynchronous, so the connect is not actually complete by the time we
>>> send the message.
>>>
>>>
>>>
>>> I thought of a possible solution for this - basically we want to
>>> synchronize the connect operation done by the publisher. Having both a PUB
>>> and a SUB socket, we can simply send a dummy message from PUB to SUB in the
>>> same publisher process until the first message is received, and then it is
>>> guaranteed that the connect is done and subsequent messages (now to "real"
>>> topics with actual subscribers) will not be lost.
>>>
>>>
>>>
>>> The only part I'm not sure about is the subscriber side - assuming the
>>> subscriber boots, connects and subscribes _before_ we start the publisher -
>>> is it guaranteed that no message will be lost (assuming of course the
>>> subscriber doesn't crash / unsubscribe / etc.)?
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Tomer
>>>
>>>
>>> _______________________________________________
>>> zeromq-dev mailing list
>>> zeromq-dev at lists.zeromq.org
>>> https://lists.zeromq.org/mailman/listinfo/zeromq-dev