[zeromq-dev] ZeroMQ how to troubleshoot the REQ-REP hang

Jithendra Reddy jithendra.reddy at gmail.com
Mon Mar 9 11:25:54 CET 2015


Hi,

Thanks very much for your reply.

I checked the top output; everything (processor and memory) is under control.

It doesn't look like a CPU/memory issue.
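
Since CPU and memory look fine, another resource worth watching is the
number of open file descriptors, because the earlier message quoted below
mentions the lsof count growing during the test. The following is only a
rough sketch, assuming a Linux or similar system that exposes
/proc/self/fd or /dev/fd; the helper names open_fd_count and fd_delta are
made up for illustration. It measures how many descriptors a piece of code
leaves open, which can help narrow down where sockets are not being closed.

```python
import os


def open_fd_count():
    """Return the number of descriptors open in the current process."""
    # Assumption: the platform exposes one directory entry per descriptor,
    # under /proc/self/fd (Linux) or /dev/fd (some other Unix systems).
    fd_dir = "/proc/self/fd" if os.path.isdir("/proc/self/fd") else "/dev/fd"
    return len(os.listdir(fd_dir))


def fd_delta(action):
    """Run action() and return how many descriptors it left open."""
    before = open_fd_count()
    action()
    return open_fd_count() - before
```

Calling fd_delta around one full client session (connect, request, reply,
close) and logging the result should show 0 if everything is released; a
steadily positive value would point at a descriptor leak.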

During the stress test, the server's REP socket sometimes receives the
message from the REQ socket only after a long delay, and some messages
appear never to reach the server at all.
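
One possible cause, given the design quoted below, is a race around the
"free port": if the parent only probes for a free port and releases it
before the child binds, another process can take that port in between, and
the client then connects to the wrong listener or to nothing. Whether our
application actually does this is an assumption. The sketch below uses
plain TCP sockets rather than ZeroMQ (bind_ephemeral is a hypothetical
helper) to show the safer idea: let the OS pick the port and keep the
socket bound while the port number is reported.

```python
import socket


def bind_ephemeral(host="127.0.0.1"):
    """Bind a listening TCP socket to an OS-chosen port.

    Returns (sock, port). The socket stays bound, so nobody else can
    grab the port between choosing it and using it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))  # port 0 asks the kernel for any free port
    sock.listen()
    return sock, sock.getsockname()[1]
```

With ZeroMQ the same effect would come from binding the child's REP socket
first and only then sending the port back to the client; pyzmq, for
example, has a bind_to_random_port method on its sockets for this purpose.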

How can we troubleshoot these kinds of issues? How can we ensure 100%
reliability in sending messages with good performance?
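
For context on the reliability part: a REQ socket strictly alternates send
and receive, so a lost request or reply leaves it waiting indefinitely.
The ZeroMQ Guide describes a client-side "Lazy Pirate" pattern for this
(wait for the reply with a timeout, and on timeout close the socket, open
a new one, and resend). Below is a minimal sketch of just the retry loop,
with the ZeroMQ part abstracted away: attempt is a hypothetical callable
that would open a fresh REQ socket, send, wait up to timeout_ms, close the
socket, and return the reply or None. The default values are arbitrary.

```python
import time


def request_with_retry(attempt, retries=3, timeout_ms=2500, backoff_s=0.0):
    """Call attempt(timeout_ms) until it returns a reply or retries run out.

    Returns (reply, attempts_used); raises TimeoutError if every attempt
    returned None.
    """
    for n in range(1, retries + 1):
        reply = attempt(timeout_ms)
        if reply is not None:
            return reply, n
        if n < retries:
            time.sleep(backoff_s)  # optional pause before resending
    raise TimeoutError(f"no reply after {retries} attempts")
```

A retry can deliver the same request twice, so this only makes sense if
the server can tolerate duplicates; it improves the odds of getting an
answer but does not by itself guarantee 100% delivery.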

Your suggestions and ideas will be very helpful.

Regards

On Mon, Mar 9, 2015 at 12:33 PM, Michael Cuggy <mcuggy at gmail.com> wrote:

> lsof is a list of open files.  Having open files would certainly add
> to the utilization of processor and/or memory.  I would look to see if
> the processor or memory is constrained.  If 100% of the processor or
> memory are being utilized, then new network connections may not be
> able to be established.  The result would be consistent with the
> "hanging" behavior you observe.  The resources cannot establish
> additional network connections in this scenario.
>
> In theory they could accumulate to a critical amount after 15-20
> minutes.  It seems like you have almost solved the problem.
>
> On 3/8/15, Jithendra Reddy <jithendra.reddy at gmail.com> wrote:
> > Hi,
> >
> > We have implemented a REQ-REP socket communication. In brief the
> > application does the following:
> > 1. Client asks for a free tcp port through REQ socket
> > 2. Server listens at REP socket, looks for a free tcp port. Forks a child
> > process and runs ZeroMQ REP socket listening at the free port. Parent
> > process sends back this free port detail to client
> > 3. Client then starts communicating to child process at the received port
> > using REQ-REP ZeroMQ socket
> >
> > The above application has issues under a stress test. We are running
> > nearly 30 clients per minute. The stress test works fine for a while
> > (15-20 minutes) and then hangs.
> >
> > We see that the message is sent from the REQ socket but not received at
> > the REP socket.
> > How can we troubleshoot this issue?
> >
> > We see that the number of open files reported by lsof increases as the
> > stress test progresses. We do close sockets in the application and also
> > set linger to 0. Could the increased lsof count cause the hang?
> >
> > Your inputs to resolve the hang and to troubleshoot the issue will be
> > helpful.
> >
> > Regards
> >
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>