[zeromq-dev] Can't get ZMQ_REQ_RELAXED to run

Björn Kuhlbrodt bjoern.kuhlbrodt at gpsolar.com
Wed Nov 27 15:57:10 CET 2013


Hi Christian

A minimal test case was not feasible: the problem is timing dependent and possibly a threading issue.

Instead, I debugged into the zmq sources, which was painful because of the timing issue. What I saw was that the first zmq_msg_send called zmq::req_t::xsend, which called zmq::lb_t::sendpipe, which fails at "if (pipes [current]->write (msg_))" because out_active was false. No idea why, though. If I step into it, everything works. Only traces or DebugOutput show me the bad behavior.
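The failure path described above can be pictured with a toy model. This is only a sketch in plain C, not the libzmq source: the names toy_pipe, toy_pipe_write and toy_send are hypothetical, and only the out_active flag from the debugging session is modelled. It assumes that when no pipe accepts the message the send reports EAGAIN, which would match the res = 11 seen once a timeout is set.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an outbound pipe (hypothetical, not libzmq's pipe_t). */
struct toy_pipe {
    bool out_active;   /* false: the pipe currently refuses writes */
    int  written;      /* number of messages this pipe has accepted */
};

/* Accept one message only if the pipe is active for output. */
bool toy_pipe_write(struct toy_pipe *p)
{
    if (!p->out_active)
        return false;
    p->written++;
    return true;
}

/* Offer the message to each pipe in turn.
   Returns 0 if some pipe accepted it, or -1 with errno = EAGAIN if none did. */
int toy_send(struct toy_pipe *pipes, size_t count)
{
    assert(pipes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (toy_pipe_write(&pipes[i]))
            return 0;
    }
    errno = EAGAIN;
    return -1;
}
```

In this model a socket whose only pipe has out_active == false cannot send at all, so a blocking send would wait and a send with a timeout would come back with EAGAIN.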

A fix was to give zmq all the time it wants to send a packet. This shouldn't change the behavior for the first packet, but it did, which makes me think that some of the other threads (with different sockets but the same context) might have something to do with it.
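The message does not say how the extra time was given. One possible shape, shown here purely as a sketch, is to retry the send while it reports EAGAIN. The code is plain C with no libzmq dependency: try_send_fn, pause_fn, send_with_retries and ready_after_n are hypothetical names, and the try_send callback stands in for a single non-blocking send attempt in the real application.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: one non-blocking send attempt.
   Returns 0 on success, or -1 with errno set (EAGAIN means "not ready yet"). */
typedef int (*try_send_fn)(void *ctx);

/* Hypothetical callback: wait a little between attempts (may be NULL). */
typedef void (*pause_fn)(void *ctx);

/* Retry until the attempt succeeds, fails with an error other than EAGAIN,
   or max_attempts is used up.
   Returns the number of attempts used on success, or -1 with errno set. */
int send_with_retries(try_send_fn try_send, pause_fn pause_between,
                      void *ctx, int max_attempts)
{
    assert(try_send != NULL);
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_send(ctx) == 0)
            return attempt;              /* sent */
        if (errno != EAGAIN)
            return -1;                   /* real error: give up immediately */
        if (pause_between != NULL && attempt < max_attempts)
            pause_between(ctx);
    }
    errno = EAGAIN;                      /* still not ready after all attempts */
    return -1;
}

/* Demo attempt: reports EAGAIN until the counter in ctx reaches zero. */
int ready_after_n(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

For example, with a counter of 2 and a limit of 5 attempts, send_with_retries(ready_after_n, NULL, &counter, 5) succeeds on the third attempt. A retry loop like this only hides the symptom; it does not explain why the first send was refused.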

Anyway, it works for me now. Still, it's too bad that I couldn't nail it down.

Regards
Björn

-----Original Message-----
From: zeromq-dev-bounces at lists.zeromq.org [mailto:zeromq-dev-bounces at lists.zeromq.org] On Behalf Of Christian Kamm
Sent: Tuesday, 19 November 2013 16:43
To: ZeroMQ development list
Subject: Re: [zeromq-dev] Can't get ZMQ_REQ_RELAXED to run

Hi Björn,

It'd be great if you could try reducing the issue to a test case. Right now I can't determine what's going on.

On 11/19/2013 03:54 PM, Björn Kuhlbrodt wrote:
> * It is the very first send on that socket that fails. But I'm gaining 
> the impression that the 10% of cases where the send succeeds are those 
> where I haven't started the application in a while (>~10 minutes, not 
> confirmed). I'm closing the socket and context properly 
> afaik, but maybe something is hanging for ~minutes before it kills 
> itself. Procexp shows no processes left from my app though.

The new process should succeed with connect() and send() even if the old one were still running.

> * Right, timeout is infinite. If I set a timeout, I get res = 11, 
> which neither I nor zmq_strerror can make any sense of.

11 is EAGAIN. But you shouldn't be seeing it directly after a connect()!
A test case would be appreciated.
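As a sketch of how a caller might tell the "not ready" case apart from other failures, the plain C helper below classifies a send result. The name describe_send is hypothetical. The rc and err parameters are assumed to carry the send call's return value and the error number reported afterwards (with libzmq that would be what zmq_errno() returns).

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Turn a send result into a short description.
   rc:  return value of the send call (negative on failure).
   err: error number reported after a failed call. */
const char *describe_send(int rc, int err)
{
    /* A failed call is expected to carry a non-zero error number. */
    assert(rc >= 0 || err != 0);

    if (rc >= 0)
        return "sent";
    if (err == EAGAIN)
        return "not ready (EAGAIN): the send timed out or would block";
    return strerror(err);    /* any other error: use the system description */
}
```

With a send timeout configured, EAGAIN means the message could not be queued before the timeout expired, so it can be retried rather than treated as a fatal error.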

Regards,
Christian

