[zeromq-dev] Java API is not notified of C++ assert failures.

Martin Sustrik sustrik at fastmq.com
Wed Mar 18 10:50:28 CET 2009


Vladimir,

> I am many steps away from needing an HA solution for my application. I just
> want to make sure that the basic architecture would meet the requirements of
> a highly scalable and fault tolerant application from the get go, and would
> not come as an afterthought.
> 
> I am definitely not a specialist in decision making in decentralized
> peer-to-peer systems, although I know something about Amdahl's law. And my
> goal is not to reinvent the wheel, so you got me convinced. Since your
> solution of splitting the HA and LB functionality is tested and it works, I
> am just going to try to fit my requirements into this paradigm.
> 
> That being said, in a load balancing situation, how would the decision
> maker know that a worker process has left the collective? It's easy in case
> the worker process terminates gracefully, but what if it crashes? Does the
> decision maker need to periodically ping every worker process? Do all worker
> processes need to periodically ping the decision maker saying, hey, I'm
> alive? Is there another solution?

Detection of a broken connection or a failed application is done at the 
TCP level. If the TCP layer reports a socket error, the connection or 
application is considered inaccessible or failed.
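
As a rough sketch of what this looks like from the application side: when a 
worker process crashes, its operating system closes the worker's sockets, and 
the peer sees either end-of-stream or a socket error on its next read. The 
code below uses plain Python sockets, not 0MQ code; the names peer_is_gone 
and demo are made up for illustration.

```python
import socket


def peer_is_gone(conn):
    """Return True if the TCP layer reports that the peer closed or failed.

    Hypothetical helper: a read that yields end-of-stream (b"") or raises a
    socket error is treated as "peer failed". A plain timeout only means the
    peer was silent, so it is not counted as a failure.
    """
    try:
        data = conn.recv(4096)
    except socket.timeout:
        return False  # no data within the timeout; peer may still be alive
    except OSError:
        return True   # e.g. connection reset reported by the TCP layer
    return data == b""  # empty read means the peer closed the connection


def demo():
    """Connect a 'worker' over loopback TCP, drop it, and check detection."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    worker = socket.create_connection(("127.0.0.1", port))
    conn, _ = listener.accept()
    conn.settimeout(5.0)

    worker.close()  # stand-in for the worker process going away
    gone = peer_is_gone(conn)

    conn.close()
    listener.close()
    return gone
```

Calling demo() is expected to return True, because closing the worker's 
socket makes the next read on the other side return end-of-stream.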


> Second, I read the (early?) documentation about version 0.5 where it was
> mentioned that PGM protocol was not yet available on Windows (I am
> prototyping on a Windows laptop). Any ETA on when that would be available?
> Would it expose the same API as for Linux when it is available? Are there
> any tutorials on how to use it?

Yes, the API will be the same. There's some code using the native Win32 
implementation of PGM which kind of works but hasn't really been tested 
yet. If you want to play with it, we can make it accessible. No 
guarantee that it will work flawlessly though.

> Lastly, how many instances of a zmq_server would one need? Does the number
> depend on the size of the network? Do I need to have one instance running on
> each machine? Is it a bottleneck if I have only one instance? How does a
> component know to connect to a particular instance for lookup purposes?

zmq_server is not a real message broker; rather, it is a directory 
service that applications use to let others know that they exist and how 
they can be accessed. Thus the load on zmq_server is extremely light: 
it's used only when new connections are established.
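
To illustrate the difference between a directory and a broker, here is a toy 
sketch. It is not how zmq_server is implemented; the Directory class, the 
send_many function, and the name and address used with them are all 
hypothetical. The point is only that the directory is consulted once, when 
the connection is set up, and the messages themselves never pass through it.

```python
class Directory:
    """Toy in-memory directory: maps a service name to a network location."""

    def __init__(self):
        self._locations = {}
        self.lookups = 0  # counts how often the directory is consulted

    def register(self, name, location):
        """An application announces where it can be reached."""
        self._locations[name] = location

    def lookup(self, name):
        """Another application asks where a named peer lives."""
        self.lookups += 1
        return self._locations[name]


def send_many(directory, name, messages):
    """Resolve the peer once, then 'send' every message straight to it."""
    location = directory.lookup(name)  # the only contact with the directory
    # Stand-in for direct peer-to-peer sends: pair each message with the
    # resolved location instead of doing real network I/O.
    return [(location, message) for message in messages]
```

For example, after directory.register("worker-1", "10.0.0.5:6000"), sending 
a thousand messages with send_many leaves directory.lookups at 1.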

Several instances of zmq_server would actually be a problem if they are 
not properly synchronised. I would strongly recommend sticking with a 
single zmq_server instance.

Applications have to get the zmq_server location at startup. You can, for 
instance, specify the location of zmq_server on the command line.
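
A minimal sketch of the command-line approach, assuming the location is 
given as HOST:PORT. The --directory option name, the parse_location helper, 
and the address in the usage note are assumptions made for this example, not 
part of any 0MQ tool.

```python
import argparse


def parse_location(text):
    """Split 'HOST:PORT' into (host, port), rejecting malformed input."""
    host, sep, port = text.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise argparse.ArgumentTypeError(
            "expected HOST:PORT, got %r" % text)
    port_number = int(port)
    if not 0 < port_number < 65536:
        raise argparse.ArgumentTypeError(
            "port out of range: %d" % port_number)
    return host, port_number


def parse_args(argv=None):
    """Parse the application's command line (argv=None uses sys.argv)."""
    parser = argparse.ArgumentParser(
        description="Example application that needs a directory location.")
    parser.add_argument(
        "--directory", type=parse_location, required=True,
        metavar="HOST:PORT",
        help="where the directory service can be reached (hypothetical flag)")
    return parser.parse_args(argv)
```

With this, parse_args(["--directory", "192.168.0.1:5682"]).directory gives 
("192.168.0.1", 5682), which the application can then use to connect.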

Martin


