[zeromq-dev] Single Point of Failure & Recovery

Bill Torpey wallstprog at gmail.com
Fri May 29 22:13:27 CEST 2020


A big question is: what do the “workers” do?

- It sounds like you want to process “transactions” — i.e., the workers need to be able to reliably update some shared state.  Is that true?
- If so, which of the ACID properties do you care about?
- Are your transactions “idempotent” — i.e., does it not matter if a transaction is executed multiple times?
- Do transactions need to be ordered with respect to each other?

Note that ZeroMQ is NOT a transaction processing system, nor is there any sort of “guaranteed delivery” option in ZeroMQ.  If you really are processing “transactions” then the tricky parts are going to be in your code — not ZeroMQ.  
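
To make the idempotency question concrete, here is a minimal sketch of application-level duplicate suppression. It is not a ZeroMQ feature and uses no ZeroMQ calls; the `Account` class, the `txn_id` field, and the in-memory set of applied IDs are all hypothetical names chosen for illustration. It assumes each transaction carries a unique ID assigned by the sender.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    """Hypothetical shared state that workers update."""
    balance: int = 0
    applied_ids: set = field(default_factory=set)

    def apply(self, txn_id: str, amount: int) -> bool:
        """Apply a transaction at most once; return True if state changed."""
        if txn_id in self.applied_ids:
            return False  # duplicate delivery (e.g. a retry): ignore it
        self.balance += amount
        self.applied_ids.add(txn_id)
        return True


account = Account()
# "t1" is delivered twice, as could happen after a sender-side retry.
results = [
    account.apply("t1", 100),
    account.apply("t1", 100),
    account.apply("t2", -30),
]
```

In a real system the set of applied IDs would need to be stored durably together with the state it protects, which is exactly the kind of design work that lives in the application rather than in the messaging layer.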

It may make sense to flesh out that design before getting into details of the network.

HTH

> On May 29, 2020, at 3:46 PM, Doron Somech <somdoron at gmail.com> wrote:
> 
> I suggest not using pub/sub. Instead, if the number of groups (two) is constant, you can have two DEALER sockets, one for each group, and connect multiple brokers to each of them; then the brokers are no longer a single point of failure. If you want to make it more scalable, you can use a ROUTER socket instead of the PUB socket and smartly distribute the messages to multiple brokers; you will have to manage the data structure and distribution algorithm yourself. Continue reading the guide; there are multiple examples of how to use ROUTER together with your own algorithm and data structure. Also read the pub/sub chapter (I think it is chapter 4) and the advanced request-response chapter (5?), and look for the Binary Star pattern.
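
One possible shape for the "data structure and distribution algorithm" mentioned above is sketched below. It is pure Python with no ZeroMQ calls, and it assumes brokers announce themselves with periodic heartbeats. The `BrokerPool` class, its method names, and the timeout value are hypothetical and are not taken from the guide. The sender would call `pick()` to choose a broker identity and then address the message to it over a ROUTER socket.

```python
class BrokerPool:
    """Tracks which brokers look alive and picks one per message (round-robin)."""

    def __init__(self, liveness_timeout: float):
        self.liveness_timeout = liveness_timeout
        self.last_seen = {}  # broker id -> time of the last heartbeat
        self._next = 0       # round-robin counter

    def heartbeat(self, broker_id: str, now: float) -> None:
        """Record that a broker was heard from at time `now`."""
        self.last_seen[broker_id] = now

    def live_brokers(self, now: float) -> list:
        """Brokers whose last heartbeat is within the timeout, in stable order."""
        return sorted(
            b for b, seen in self.last_seen.items()
            if now - seen <= self.liveness_timeout
        )

    def pick(self, now: float):
        """Return the next live broker, or None if none is available."""
        live = self.live_brokers(now)
        if not live:
            return None
        broker = live[self._next % len(live)]
        self._next += 1
        return broker


pool = BrokerPool(liveness_timeout=3.0)
pool.heartbeat("broker-a", now=0.0)
pool.heartbeat("broker-b", now=0.0)
before_failure = [pool.pick(now=1.0) for _ in range(4)]

# broker-b stops sending heartbeats; only broker-a is heard from again.
pool.heartbeat("broker-a", now=4.0)
after_failure = [pool.pick(now=5.0) for _ in range(2)]
```

Messages sent to a broker after it dies but before its heartbeat times out can still be lost, so this only narrows the failure window; it does not give guaranteed delivery.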
> 
> On Fri, May 29, 2020, 22:24 Nir Sharony <nsharony at gmail.com> wrote:
> Hi,
> 
> While going over the guide, I ran across the section <http://zguide.zeromq.org/page:all#Shared-Queue-DEALER-and-ROUTER-sockets> that talks about managing a broker between clients and servers.
> What I need is something similar to the pattern below, except that instead of REQ clients, I am using PUB publishers and instead of REP services, I would like to use PULL workers.
> <Screen Shot 2020-05-29 at 21.48.25.png>
> This design attempts to explain what I want to achieve.
> The publisher broadcasts messages that are handled by two brokers. 
> Each broker is in charge of its own group of workers (each group is enclosed in a box in the design).
> The brokers use the PUSH/PULL pattern to delegate the message to ONE of the workers in their group.
> I can bring as many workers as needed without changing the rest of the system.
> <Screen Shot 2020-05-29 at 22.13.09.png>
> The blue elements are the pieces of code that actually do the work and I can bring as many of them as needed.
> However, the introduction of a broker seems to bring a "single point of failure" into the system.
> If the application that runs the broker dies, the entire module in its box is no longer working...
> 
> Is this a known issue or am I missing something?
> What I want to make sure is that there are no single points of failure in the system.
> I thought of having two brokers per group (both will act as SUB consumers for the PUB message).
> However, I don't know if it is possible to split the work between such two brokers so that they will delegate the message to only one worker in the group.
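
One way two SUB brokers could split the work, sketched under stated assumptions: both brokers receive every published message, and each one forwards only the messages it "owns" according to a deterministic hash of a message ID. The functions `owner` and `should_handle`, the broker names, and the existence of a per-message ID are hypothetical, and the sketch assumes each broker has some way (for example heartbeats) to learn whether its peer is alive.

```python
import zlib


def owner(message_id: str, brokers: list) -> str:
    """Deterministically map a message ID to one broker in the list."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    return brokers[zlib.crc32(message_id.encode()) % len(brokers)]


def should_handle(me: str, message_id: str, all_brokers: list, live: set) -> bool:
    """True if broker `me` should forward this message to its workers."""
    candidates = [b for b in all_brokers if b in live]
    return bool(candidates) and owner(message_id, candidates) == me


brokers = ["broker-1", "broker-2"]
message_ids = [f"msg-{i}" for i in range(20)]

# Both brokers alive: every message is claimed by exactly one of them.
claims_both_up = [
    sum(should_handle(b, m, brokers, {"broker-1", "broker-2"}) for b in brokers)
    for m in message_ids
]

# broker-2 is down: broker-1 takes over every message.
claims_failover = [
    should_handle("broker-1", m, brokers, {"broker-1"}) for m in message_ids
]
```

While the two brokers disagree about who is alive, a message can be handled twice or not at all, so this works best when the workers' operations are idempotent.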
> 
> Any assistance would be greatly appreciated.
> Thanks,
> Nir
> 
> 
> 
> 
> On Fri, May 29, 2020 at 12:57 PM Doron Somech <somdoron at gmail.com> wrote:
> Hey, I suggest starting with the guide: http://zguide.zeromq.org/
> 
> If you then have any questions about a specific pattern, I will be able to help.
> 
> On Fri, May 29, 2020, 11:26 Nir Sharony <nsharony at gmail.com> wrote:
> Hello fellow developers.
> 
> I am trying to develop a robust distributed system that will allow communication between Docker microservices, each being a separate process.
> The system needs to either run on the cloud or on-premise on dedicated hardware.
> 
> To decouple the various microservices, I thought of using the Pub/Sub communication pattern.
> 
> I am fairly new to ZeroMQ, so I am wondering how it handles failures, specifically in a 
> 
> Does the queue itself constitute a single point of failure?
> Could the queue crash and the unsent payloads be lost?
> How should the system recover from a crash?
> 
> I would appreciate any references to documentation on this issue.
> 
> Thanks in advance for your help.
> 
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> https://lists.zeromq.org/mailman/listinfo/zeromq-dev
