[zeromq-dev] Java API is not notified of C++ assert failures.

Martin Sustrik sustrik at fastmq.com
Thu Mar 12 18:57:54 CET 2009


Pieter Hintjens wrote:
> On Thu, Mar 12, 2009 at 6:12 PM, Martin Sustrik <sustrik at fastmq.com> wrote:
> 
>> In short, automatic selection of primary is a complex problem studied
>> extensively by distributed algorithm scientists and as far as I am aware
>> it has no generic solution. My advice would be to select primary by hand
>> if at all possible.
> 
> For what it's worth, we spent several years refining and simplifying
> the failover in OpenAMQ and part of this was determining which process
> was primary, and the rules for failover.
> 
> What we concluded was that there are two viable architectures.  First,
> with N nodes where none is primary, and any can be removed or added.
> This requires many interconnections, and some manner of discovery, but
> is the most robust and scalable design.  Second, one primary node
> which automatically fails over to a secondary, both being defined
> explicitly so that applications know the difference.  Vital in this
> scheme is that it is not symmetric, and that recovery is done by
> external decision (by stopping the secondary when the primary is alive
> again).
> 
> If you need an algorithm for the second architecture, Martin can
> provide it, he wrote the high-availability engine in OpenAMQ.

Sure, just let me know if you need it. However, there are some hidden 
assumptions behind this model, the most important being that the primary 
and backup have to be able to communicate. Once the physical connection 
between the two is broken, you'll get a split-brain.
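
To illustrate the split-brain case, here is a minimal sketch in Python. It
is not the OpenAMQ algorithm; it assumes a hypothetical backup that promotes
itself after a fixed period without heartbeats from the primary. The names
Backup, simulate and the 3-second timeout are made up for the example.

```python
# Hypothetical sketch: a backup that promotes itself when heartbeats stop.
# Class names and the timeout value are illustrative, not taken from OpenAMQ.

class Backup:
    def __init__(self, timeout):
        self.timeout = timeout        # seconds of silence tolerated
        self.last_heartbeat = 0.0     # time the last heartbeat arrived
        self.active = False           # True once the backup has taken over

    def on_heartbeat(self, now):
        self.last_heartbeat = now

    def tick(self, now):
        # The backup cannot tell a dead primary from a dead link:
        # both look like silence on the wire.
        if now - self.last_heartbeat > self.timeout:
            self.active = True
        return self.active


def simulate(link_up_until, duration, timeout=3.0):
    """Run one tick per second. The primary never dies; only the link fails.

    Returns (primary_active, backup_active) at the end of the run.
    """
    backup = Backup(timeout)
    primary_active = True             # the primary keeps serving throughout
    for t in range(duration):
        now = float(t)
        if now < link_up_until:       # heartbeats arrive only while the link is up
            backup.on_heartbeat(now)
        backup.tick(now)
    return primary_active, backup.active
```

With the link cut at second 5 of a 20-second run, simulate(5, 20) ends with
both nodes active, which is the split-brain. With the link up for the whole
run, simulate(20, 20) leaves the backup passive.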

One more comment: For the low-latency use cases 0MQ is intended for, the 
preferred way of getting high availability is hot-hot failover, an 
architecture where the primary and the backup _both_ process all the 
messages. The results are then checked and duplicates are discarded. With 
that kind of architecture, the failure of one of the servers has zero 
impact on latency.
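
The duplicate-discarding step on the receiving side could look roughly like
the Python sketch below. It assumes that both servers tag each result with
the same sequence number, which is an assumption of the example rather than
something 0MQ provides; DuplicateFilter and merge are hypothetical names.

```python
# Hypothetical sketch of the receiving side of hot-hot failover.
# Assumes both servers assign identical sequence numbers to their results.

class DuplicateFilter:
    def __init__(self):
        self.seen = set()             # sequence numbers already delivered

    def is_new(self, seq):
        """Return True for the first copy of seq, False for any later copy."""
        if seq in self.seen:
            return False
        self.seen.add(seq)
        return True


def merge(arrivals):
    """Keep the first copy of each result, in arrival order.

    arrivals is an iterable of (seq, result) pairs from both servers,
    interleaved in whatever order they reached the receiver.
    """
    dedup = DuplicateFilter()
    return [(seq, result) for seq, result in arrivals if dedup.is_new(seq)]
```

Whichever server answers first wins, so if one server dies the receiver
keeps getting results from the other without any switch-over step. Note
that the set of seen sequence numbers grows without bound in this sketch;
a real receiver would have to prune it.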

Martin




