[zeromq-dev] [jeroMQ] Losing messages on Request Reply communication
Douglas Lopes Pereira
douglaslopespereira at gmail.com
Wed Oct 12 00:31:29 CEST 2016
Hi everyone,
we are struggling with a problem in our implementation that uses jeroMQ
0.3.5 and would like to ask you for advice. We basically have several
producers ("Requesters") running on multiple instances pointing to a single
instance listening on different ports (each port representing a different
"Replyer").
The message exchange is simple. We send a milestone report as a REQ and
expect something we call "Async answer" as a REP back. I see that message
flow working for several requests but one of them eventually fails to get
the reply back. The failing request is usually generated by the same
producer. I can reproduce this behavior from time to time.
I'm sharing with you a TCP dump captured on the "Replyer" instance. Please
notice that in frame 75 there is a Request arriving on port 15002 which never
gets an "Async Reply" back. Why would the OS (Ubuntu Linux 12.04.5 LTS) not
deliver that message to 0MQ?
I'm saying it was never delivered to 0MQ because I added a log statement
right after the 0MQ socket recvStr() method call, and this particular
message is never printed.
This is the Replyer code:
try (ZContext context = new ZContext()) {
    ZMQ.Socket responder = context.createSocket(ZMQ.REP);
    try {
        responder.bind(String.format("tcp://*:%d", port));
        responder.setLinger(0);
        responder.setReceiveTimeOut(1000);
        while (isStarted()) {
            // Wait for next request from the client and do the work
            String request = responder.recvStr();
            LOG.info("Got request: {}", request);
            if (request == null) {
                continue;
            }
            ...
    } finally {
        terminated = true;
        context.destroySocket(responder);
        LOG.warn("Reply socket is going down");
    }
} catch (Exception e) {
    LOG.error(e.getMessage(), e);
}
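A note on reading the log produced by the loop above: because the receive
timeout is 1000 ms, recvStr() returns null whenever no request arrives in
time, so "Got request: null" is printed on every idle second and the real
requests are harder to spot. The following is only a minimal sketch of the
"skip the timeout, then log" pattern. It uses nothing but the JDK, with
BlockingQueue.poll(timeout) as a stand-in for the socket, and the class and
method names (ReceiveLoopSketch, receiveRequests) are hypothetical. It is not
JeroMQ code and does not explain the missing reply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Stand-in for the Replyer loop: BlockingQueue.poll(timeout) plays the role of
// recvStr() with a receive timeout. Both return null when nothing arrives in time.
final class ReceiveLoopSketch {

    // Polls up to maxPolls times and returns only the real requests, in arrival order.
    static List<String> receiveRequests(BlockingQueue<String> inbox, int maxPolls, long timeoutMs) {
        List<String> requests = new ArrayList<>();
        for (int i = 0; i < maxPolls; i++) {
            String request;
            try {
                request = inbox.poll(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop polling
                break;
            }
            if (request == null) {
                continue; // timeout, not a message: nothing to log
            }
            requests.add(request); // a real request: this is where the log line would go
        }
        return requests;
    }

    public static void main(String[] args) {
        BlockingQueue<String> inbox = new LinkedBlockingQueue<>();
        inbox.add("milestone-1");
        inbox.add("milestone-2");
        // Four polls: two deliver messages, two time out and are skipped silently.
        System.out.println(receiveRequests(inbox, 4, 10));
    }
}
```

With the null check placed before the log call, every logged line corresponds
to a request that was actually received, which makes it easier to compare the
log against the frames in the TCP dump.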
I tried to reproduce it by simulating the Request messages using two Python
scripts pointing to the same "Replyer" but the error didn't happen.
I don't see any exception in the log. I don't see any
Any ideas on how to debug that?
Thanks in advance,
Douglas