[zeromq-dev] Java API is not notifed of C++ assert failures.

Vladimir & Mihaela puiuvlad at optonline.net
Fri Mar 20 03:12:58 CET 2009


Martin,

I have two processes - a dispatcher and a worker. The dispatcher creates a
global queue and a global exchange. The worker creates a local queue and a
local exchange, and binds each of them to the opposite object in the
dispatcher. The dispatcher listens to the worker request and replies right
away (in the same thread). The worker sends 10 messages to the dispatcher
and goes on doing work. On a different thread, the worker listens to the
dispatcher responses.

I tried to run this test using the Java API. Here is the worker log:

17:33:20,551  INFO Process:87 - Sending: sim 169.254.25.129:default 0
17:33:20,561  INFO Process:106 - Received: dispatcher echo: sim 169.254.25.129:default 0
17:33:22,564  INFO Process:87 - Sending: sim 169.254.25.129:default 1
17:33:24,567  INFO Process:87 - Sending: sim 169.254.25.129:default 2
17:33:26,570  INFO Process:87 - Sending: sim 169.254.25.129:default 3
17:33:28,572  INFO Process:87 - Sending: sim 169.254.25.129:default 4

As you can see, the worker thread receives only the first reply; the rest
are lost.

I then re-read the 0MQ documentation on the Java API: "Java extension
creates single I/O thread that can be accessed from a single application
thread. ... it is our intent to expose full 0MQ API via Java in the future."
Any ETA on this?

Also, your explanation of how the dispatcher would detect when a worker
thread disconnects is still not clear to me.

From an application perspective, the dispatcher (HA pair, when available)
responds to the "join the collective" message sent by an unknown number of
worker processes. If a worker disconnects gracefully, it can send a "leave
the collective" message to the dispatcher. I would like to be able to detect
when a worker has crashed (for example) and is no longer available. My tests
show that when that happens the dispatcher does not crash (as I feared your
reply implied).

Any suggestions?
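
To make the question concrete, here is a minimal sketch of the application-level approach, where every worker message (or a periodic "I'm alive" ping) refreshes a timestamp on the dispatcher side. It does not use the 0MQ API at all; the class name WorkerRegistry, its methods, and the timeout value are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of the 0MQ API: tracks when each worker was
// last heard from and reports workers that have been silent for too long.
class WorkerRegistry {
    private final Map<String, Long> lastSeenMillis = new ConcurrentHashMap<>();
    private final long timeoutMillis;

    WorkerRegistry(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    // Call on "join the collective" and on every later message or ping.
    void heartbeat(String workerId, long nowMillis) {
        lastSeenMillis.put(workerId, nowMillis);
    }

    // Call on a graceful "leave the collective" message.
    void leave(String workerId) {
        lastSeenMillis.remove(workerId);
    }

    // Removes and returns the workers whose last message is older than the timeout.
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeenMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                expired.add(entry.getKey());
            }
        }
        for (String workerId : expired) {
            lastSeenMillis.remove(workerId);
        }
        return expired;
    }

    boolean isAlive(String workerId) {
        return lastSeenMillis.containsKey(workerId);
    }
}
```

The dispatcher would call expire(System.currentTimeMillis()) from a timer or between requests. The timeout has to be comfortably larger than the interval at which workers send messages, otherwise healthy but slow workers get dropped.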

Finally, yes I am interested in playing with the Win32 native implementation
of PGM. Do I need any special setup in Windows? Can you point me to the
appropriate documentation?

Thanks,
Vladimir

-----Original Message-----
From: Martin Sustrik [mailto:sustrik at fastmq.com]
Sent: Wednesday, March 18, 2009 4:50 AM
To: Vladimir & Mihaela
Cc: zeromq-dev at lists.zeromq.org
Subject: Re: [zeromq-dev] Java API is not notifed of C++ assert
failures.


Vladimir,

> I am many steps away from needing an HA solution for my application. I just
> want to make sure that the basic architecture would meet the requirements of
> a highly scalable and fault tolerant application from the get go, and would
> not come as an afterthought.
>
> I am definitely not a specialist in decision making in decentralized
> peer-to-peer systems, although I know something about Amdahl's law. And my
> goal is not to reinvent the wheel, so you got me convinced. Since your
> solution of splitting the HA and LB functionality is tested and it works, I
> am just going to try to fit my requirements into this paradigm.
>
> That being said, in a load balancing situation, how would the decision
> maker know that a worker process has left the collective? It's easy in case
> the worker process terminates gracefully, but what if it crashes? Does the
> decision maker need to periodically ping every worker process? Do all worker
> processes need to periodically ping the decision maker saying, hey, I'm
> alive? Is there another solution?

Detection of a broken connection or failed application is done at the TCP
level. If the TCP layer reports a socket error, the connection/application
is considered inaccessible/failed.
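
As a rough illustration of what "the TCP layer reports a socket error" looks like from application code, the following sketch uses plain java.net sockets rather than 0MQ. The class and method names (PeerCloseDemo, peerGone, demo) are made up for this example, and the loopback connection merely stands in for a dispatcher/worker pair.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Plain java.net sketch, not 0MQ code: shows how a closed TCP connection
// surfaces to the side that is still reading.
class PeerCloseDemo {
    // Blocks until the peer sends a byte or goes away. Returns true if the
    // peer is gone: read() returns -1 on an orderly close and throws
    // IOException on an error such as a connection reset.
    static boolean peerGone(Socket socket) {
        try {
            InputStream in = socket.getInputStream();
            return in.read() == -1;
        } catch (IOException e) {
            return true;
        }
    }

    // Connects a "worker" to a "dispatcher" over loopback, closes the worker
    // side, and reports what the dispatcher side observes.
    static boolean demo() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket listener = new ServerSocket(0, 1, loopback)) {
            Socket worker = new Socket(loopback, listener.getLocalPort());
            try (Socket dispatcherSide = listener.accept()) {
                worker.close(); // stands in for the worker process exiting
                return peerGone(dispatcherSide);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("peer gone: " + demo());
    }
}
```

When a process crashes, the operating system normally closes its sockets, so the remote side sees the close or reset promptly. A peer that disappears without closing the connection (power loss, unplugged cable) produces no such signal until the TCP stack itself times out.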


> Second, I read the (early?) documentation about version 0.5 where it was
> mentioned that PGM protocol was not yet available on Windows (I am
> prototyping on a Windows laptop). Any ETA on when that would be available?
> Would it expose the same API as for Linux when it is available? Are there
> any tutorials on how to use it?

Yes, the API will be the same. There's some code using the Win32 native
implementation of PGM which kind of works but hasn't been really tested
yet. If you want to play with it, we can make it accessible. No
guarantee that it will work flawlessly though.

> Lastly, how many instances of a zmq_server would one need? Does the number
> depend on the size of the network? Do I need to have one instance running on
> each machine? Is it a bottleneck if I have only one instance? How does a
> component know to connect to a particular instance for lookup purposes?

zmq_server is not a real message broker; rather, it is a directory
service that applications use to let others know that they exist and how
they can be accessed. Thus the load on zmq_server is extremely light -
it's used only when new connections are established.

Several instances of zmq_server would actually be a problem if they are
not properly synchronised. I would strongly recommend sticking with a
single zmq_server instance.

Applications have to get the zmq_server location at startup. You can, for
instance, specify the location of zmq_server on the command line.
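
A minimal sketch of the command-line approach, assuming the location is passed as a single "host:port" argument. That format, the ServerLocation class, and the port number in the usage note are assumptions for illustration only; nothing here calls the 0MQ API.

```java
// Hypothetical startup helper: reads a "host:port" location for zmq_server
// from the first command-line argument. The argument format and the class
// name are assumptions made for this sketch.
class ServerLocation {
    final String host;
    final int port;

    ServerLocation(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // Splits at the last colon and validates the port range.
    // Throws IllegalArgumentException (or its subclass NumberFormatException)
    // when the argument is malformed.
    static ServerLocation parse(String arg) {
        int colon = arg.lastIndexOf(':');
        if (colon <= 0 || colon == arg.length() - 1) {
            throw new IllegalArgumentException("expected host:port, got: " + arg);
        }
        int port = Integer.parseInt(arg.substring(colon + 1));
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        return new ServerLocation(arg.substring(0, colon), port);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java ServerLocation <host:port>");
            System.exit(1);
        }
        ServerLocation location = parse(args[0]);
        System.out.println("zmq_server at " + location.host + " port " + location.port);
    }
}
```

An application would run this once at startup, for example with an argument such as localhost:5555 (an arbitrary port chosen for the example), and hand the parsed host and port to whatever call establishes the connection to zmq_server.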

Martin



