[zeromq-dev] Single Point of Failure & Recovery

Nir Sharony nsharony at gmail.com
Fri May 29 21:23:27 CEST 2020


Hi,

While going over the guide, I ran across the section
<http://zguide.zeromq.org/page:all#Shared-Queue-DEALER-and-ROUTER-sockets>
that talks about managing a broker between clients and servers.
What I need is something similar to the pattern below, except that instead
of REQ clients, I am using PUB publishers and instead of REP services, I
would like to use PULL workers.
[image: Screen Shot 2020-05-29 at 21.48.25.png]
This design attempts to explain what I want to achieve.
The publisher broadcasts messages that are handled by two brokers.
Each broker is in charge of its own group of workers (each group is
enclosed in a box in the design).
The brokers use the PUSH/PULL pattern to delegate the message to ONE of the
workers in their group.
I can add as many workers as needed without changing the rest of the
system.
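
The flow described above (PUB publisher -> broker with a SUB front end and a
PUSH back end -> PULL workers) can be sketched roughly as follows. This is
only a minimal illustration, assuming Python with the pyzmq bindings. The
endpoint names ("inproc://events", "inproc://group-a") and the helpers
forward_one and demo are hypothetical, and everything runs in one process
over inproc purely to keep the sketch self-contained. A real deployment
would presumably use tcp endpoints and separate processes. It shows one
broker and one worker group only and does nothing about the broker being a
single point of failure.

```python
import zmq


def forward_one(frontend, backend, timeout_ms=1000):
    """Forward one message from the SUB side to the PUSH side.

    Returns False if nothing arrived within timeout_ms.
    """
    if frontend.poll(timeout_ms, zmq.POLLIN):
        backend.send_multipart(frontend.recv_multipart())
        return True
    return False


def demo(messages=(b"job-1", b"job-2")):
    """Wire up publisher -> broker -> two workers and pass messages through."""
    ctx = zmq.Context()

    # Publisher: broadcasts to every connected subscriber (every broker).
    publisher = ctx.socket(zmq.PUB)
    publisher.bind("inproc://events")

    # Broker front end: subscribes to everything the publisher sends.
    frontend = ctx.socket(zmq.SUB)
    frontend.setsockopt(zmq.SUBSCRIBE, b"")
    frontend.connect("inproc://events")

    # Broker back end: PUSH hands each message to exactly one PULL peer.
    backend = ctx.socket(zmq.PUSH)
    backend.bind("inproc://group-a")

    # Workers of this group; more can be added without touching the rest.
    workers = []
    for _ in range(2):
        worker = ctx.socket(zmq.PULL)
        worker.connect("inproc://group-a")
        workers.append(worker)

    for message in messages:
        publisher.send(message)
        forward_one(frontend, backend)

    # Collect whatever each worker received.
    received = []
    for worker in workers:
        got = []
        while worker.poll(200, zmq.POLLIN):
            got.append(worker.recv())
        received.append(got)

    for sock in [publisher, frontend, backend, *workers]:
        sock.close(linger=0)
    ctx.term()
    return received
```

Calling demo() returns one list per worker. Because a PUSH socket
round-robins across its connected PULL peers, each message should appear in
only one of those lists.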
[image: Screen Shot 2020-05-29 at 22.13.09.png]
The blue elements are the pieces of code that actually do the work and I
can bring as many of them as needed.
However, the introduction of a broker seems to bring a "single point of
failure" into the system.
If the application that runs the broker dies, the entire module in its box
is no longer working...

Is this a known issue, or am I missing something?
What I want to make sure is that there are no single points of failure in
the system.
I thought of having two brokers per group (both would act as SUB consumers
of the PUB messages).
However, I don't know whether it is possible to split the work between two
such brokers so that each message is still delegated to only one worker in
the group.

Any assistance would be greatly appreciated.
Thanks,
Nir




On Fri, May 29, 2020 at 12:57 PM Doron Somech <somdoron at gmail.com> wrote:

> Hey, I suggest starting with the guide: http://zguide.zeromq.org/
>
> If you then have any questions about specific pattern I would be able to
> help.
>
> On Fri, May 29, 2020, 11:26 Nir Sharony <nsharony at gmail.com> wrote:
>
>> Hello fellow developers.
>>
>> I am trying to develop a robust distributed system that will allow
>> communication between Docker microservices, each being a separate process.
>>
>> The system needs to either run on the cloud or on-premise on dedicated
>> hardware.
>>
>> In order to decouple the various microservices, I
>> thought of using the Pub/Sub communication pattern.
>>
>> I am fairly new to ZeroMQ, so I am wondering how it handles failures,
>> specifically in a
>>
>>
>>    - Does the queue itself constitute a single point of failure?
>>    - Could the queue crash and the unsent payloads be lost?
>>    - How should the system recover from a crash?
>>
>>
>> I would appreciate any references to documentation on this issue.
>>
>> Thanks in advance for your help.
>>
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2020-05-29 at 21.48.25.png
Type: image/png
Size: 132308 bytes
Desc: not available
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20200529/dee5d70b/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screen Shot 2020-05-29 at 22.13.09.png
Type: image/png
Size: 163233 bytes
Desc: not available
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20200529/dee5d70b/attachment-0001.png>

