[zeromq-dev] Dealing with client-side crash for REQ/REP

Michi Henning michi at triodia.com
Wed Jan 8 00:10:25 CET 2014


I'm looking for some advice as to what happens or should happen in a particular failure scenario.
My searches using Google have come up empty. Forgive me if this has been canvassed before.
(There is lots of stuff about recovering from a server crash, but I couldn't find anything about
a client crash.)

I have a server that listens for incoming requests from various clients. The server uses
router/dealer to accept incoming requests from multiple clients and farm them out to
backend worker threads. For clients, requests are two-way and synchronous, that is,
the client blocks waiting for the reply with a timeout. It's basically the design straight
from the guide:

REQ-ROUTER-queue-DEALER-REP

Here the REQ socket is in the client and everything else is in the server. (The
transport between REQ and ROUTER is ipc, but might end up being tcp at some
point in the future.)

In the server, each worker uses poll() to wait for a request to arrive from the dealer,
then works on it for some time and, when the task is complete, calls send(),
which passes the result of the task back to the dealer socket from which the request
was read.

So, in essence, each worker does something like

- wait until request is ready to read using poll()
- recv() (from dealer)
- process request (might take a while)
- send(reply) (to dealer)
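
For concreteness, the four steps above could be sketched roughly as follows in
Python. This is only an illustration under assumptions, not the actual server code:
it assumes the worker talks to the dealer through a REP socket as exposed by pyzmq
(poll(), recv_multipart(), send_multipart()), and the names serve_one, run_worker
and slow_echo are made up for this example.

```python
import time


def serve_one(sock, handle, timeout_ms=1000):
    """Run one iteration of the worker loop; return True if a request was served.

    `sock` is assumed to behave like a pyzmq REP socket connected to the
    dealer: it must offer poll(), recv_multipart() and send_multipart().
    """
    # 1. Wait until a request is ready to read (poll() returns 0 on timeout).
    if not sock.poll(timeout_ms):
        return False
    # 2. Read the request that the dealer handed to this worker.
    request = sock.recv_multipart()
    # 3. Process it; this may take a while, and the client may die meanwhile.
    reply = handle(request)
    # 4. Send the reply back to the dealer, which passes it on to the router.
    sock.send_multipart(reply)
    return True


def run_worker(endpoint, handle):
    """Hypothetical wiring: connect a REP socket to `endpoint` and serve forever."""
    import zmq  # pyzmq; imported here so serve_one() can be tried without it

    sock = zmq.Context.instance().socket(zmq.REP)
    sock.connect(endpoint)
    while True:
        serve_one(sock, handle)


def slow_echo(frames):
    """Stand-in for a long-running task (hypothetical)."""
    time.sleep(0.01)
    return [b"done: " + frames[0]]
```

Note that nothing in this loop knows whether the client that sent the request is
still alive by the time step 4 runs.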

My question is what happens if the *client* crashes after it has sent the request, while
the worker is processing the request. Eventually, the task completes in the server,
and the worker calls send() to pass the reply to the dealer. In turn, the router/dealer
message pump calls send() on its frontend router socket to send the reply back to the now
non-existent client.

What happens in this case? Basically, if the client crashes while the request is
being processed in the server, there is no point in ever returning a reply. Moreover,
the client may *never* come back to life again, so there is absolutely no point
in waiting for the client to re-appear and getting rid of the reply later.

So, what happens to the reply message that is sent by the router/dealer message pump
towards the client? Will it be discarded, or do I need to do that myself? It's not clear to me
whether calling send() with the non-blocking flag (ZMQ_DONTWAIT) is sufficient. Basically,
I want to avoid the router/dealer getting itself into a non-working state if a client
crashes at the wrong moment, and I want to make sure that replies sent to the router
(front-end) socket don't pile up somewhere in the server's address space.

In essence, I want to make sure that my router/dealer doesn't block, and that any replies
that cannot be returned to clients because of a client crash are discarded on the server
side.

Thanks,

Michi.



