[zeromq-dev] Java API is not notified of C++ assert failures.

Vladimir & Mihaela puiuvlad at optonline.net
Fri Mar 13 03:16:53 CET 2009


Thanks Martin/Pieter,

The problem that I am trying to solve is how to distribute a large amount of
computation that is inherently parallelizable among a (possibly large) set
of machines. Scalability is the main goal; fail-over would be a side effect.
The application would not be geographically distributed and therefore would
not become desynchronized.

The primary machine would be responsible for decision making, and as a
result, there should always be exactly one primary. If there is none, or
more than one, there is trouble. Short intervals with no primary are
acceptable.

Once a set of machines is up, it should not require any manual
intervention. Any new machine that joins the collective would lighten the
load on the other machines. Any machine that leaves the collective would
increase it. The first machine that comes up is the primary. When the
primary goes down, another one assumes the role of primary. I believe this
mechanism is implementable if, instead of killing the process that tries to
create a global object that already exists, the 0mq code would throw an
exception and let the application decide how to handle the error
condition. Is the decision to assert rather than throw exceptions final?
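
A minimal sketch of the election mechanism described above, assuming a
"create" call that throws when the name is already taken. GlobalRegistry,
GlobalObjectExistsException and Node are hypothetical names used only for
illustration; they are an in-memory stand-in and not part of the 0mq API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical exception raised instead of aborting the process.
class GlobalObjectExistsException extends Exception {
    GlobalObjectExistsException(String name) {
        super("global object already exists: " + name);
    }
}

// Hypothetical in-memory stand-in for the shared namespace of global
// objects. This is NOT the 0mq API.
class GlobalRegistry {
    private final Set<String> names = ConcurrentHashMap.newKeySet();

    // Throws when the name is taken, so the caller can decide what to do.
    void create(String name) throws GlobalObjectExistsException {
        if (!names.add(name)) {
            throw new GlobalObjectExistsException(name);
        }
    }

    // Frees the name, e.g. when its owner goes down.
    void release(String name) {
        names.remove(name);
    }
}

class Node {
    enum Role { PRIMARY, SECONDARY }

    private static final String PRIMARY_NAME = "primary"; // assumed name
    private final GlobalRegistry registry;
    private Role role = Role.SECONDARY;

    Node(GlobalRegistry registry) {
        this.registry = registry;
    }

    // Try to claim the primary role; stay secondary if it is already held.
    Role tryBecomePrimary() {
        if (role == Role.PRIMARY) {
            return role; // already own the global object
        }
        try {
            registry.create(PRIMARY_NAME);
            role = Role.PRIMARY;
        } catch (GlobalObjectExistsException e) {
            role = Role.SECONDARY; // someone else is primary; carry on
        }
        return role;
    }

    // Simulates this node going down.
    void shutdown() {
        if (role == Role.PRIMARY) {
            registry.release(PRIMARY_NAME);
        }
        role = Role.SECONDARY;
    }
}
```

With this shape, a secondary would simply call tryBecomePrimary() again
after it notices that the primary is gone, and the first one to succeed
takes over.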

Another option to implement such a mechanism would be to use the reliable
multicast features of 0mq. Is there any documentation or an ETA for that?

Vladimir

P.S. Here's a high-level description of how scalability would be achieved.

All machines start with an exact copy of the data, and the work is carved
up among them. Each machine processes its chunk and then multicasts the
deltas to the other machines. When a machine finishes its work, it applies
the received deltas to its data, so that at the end of this step the data
is again in sync. The primary machine is thus responsible for housekeeping:
when to move to the next step, when to rebalance the work load, etc.
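
One step of the scheme above could be sketched roughly as follows. This is
a single-process simulation under stated assumptions: the data is an array
of longs, a delta is "set element i to value v", squaring stands in for
the real computation, and multicast is replaced by a plain loop. Delta,
Machine and Step are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// One change to the replicated data: "set data[index] to value".
final class Delta {
    final int index;
    final long value;

    Delta(int index, long value) {
        this.index = index;
        this.value = value;
    }
}

class Machine {
    final long[] data; // this machine's local replica
    private final List<Delta> inbox = new ArrayList<>();

    Machine(long[] initial) {
        this.data = initial.clone();
    }

    // Process the chunk [from, to) locally and return the deltas produced.
    // Squaring is only a placeholder for the real work.
    List<Delta> processChunk(int from, int to) {
        List<Delta> out = new ArrayList<>();
        for (int i = from; i < to; i++) {
            data[i] = data[i] * data[i];
            out.add(new Delta(i, data[i]));
        }
        return out;
    }

    // Queue deltas received from another machine.
    void receive(List<Delta> deltas) {
        inbox.addAll(deltas);
    }

    // Apply all queued deltas once the local chunk is finished.
    void applyReceived() {
        for (Delta d : inbox) {
            data[d.index] = d.value;
        }
        inbox.clear();
    }
}

class Step {
    // Run one step: split the work evenly, process, exchange, apply.
    static void run(List<Machine> machines, int size) {
        int n = machines.size();
        List<List<Delta>> produced = new ArrayList<>();
        for (int m = 0; m < n; m++) {
            int from = m * size / n;
            int to = (m + 1) * size / n;
            produced.add(machines.get(m).processChunk(from, to));
        }
        // Stand-in for multicast: deliver each machine's deltas to the others.
        for (int m = 0; m < n; m++) {
            for (int other = 0; other < n; other++) {
                if (other != m) {
                    machines.get(other).receive(produced.get(m));
                }
            }
        }
        for (Machine machine : machines) {
            machine.applyReceived();
        }
    }

    // True when every replica holds the same data as the first one.
    static boolean inSync(List<Machine> machines) {
        for (Machine machine : machines) {
            if (!Arrays.equals(machine.data, machines.get(0).data)) {
                return false;
            }
        }
        return true;
    }
}
```

Because the chunks do not overlap, the order in which deltas are applied
does not matter in this sketch. Rebalancing would amount to the primary
choosing different chunk boundaries for the next step.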



