[zeromq-dev] ETIMEDOUT when doing a non-blocking recv
Chuck Remes
cremes.devlist at mac.com
Tue Jan 24 21:26:48 CET 2012
On Jan 6, 2012, at 3:23 PM, Martin Sustrik wrote:
> Hi Chuck,
>
>> I am periodically receiving ETIMEDOUT (errno 60) when doing a
>> non-blocking read from either a SUB socket or a DEALER/XREQ socket.
>> What can I assume from this error?
>>
>> My guess is that the socket has recently tried to connect to another
>> socket (in this particular case, *everything* is using 'inproc'
>> transport and they only bind once at startup) and it timed out.
>> Because zmq_connect() is async, we don't actually see the error until
>> we try to zmq_send()/zmq_recv() with that socket. At that point the
>> error is delivered.
>>
>> Is that assumption correct? If so, what can I do about it?
>>
>> OS => OSX
>> libzmq => 2.1.11
>> ulimit -n => 400000
>>
>> At the time of the error, there has usually been about 2-3k xreq
>> sockets opened & closed with around 200 being open at any given
>> moment.
>
> 0MQ itself doesn't seem to produce this error. I.e. it must be received
> from the OS and forwarded via 0MQ to the user.
>
> Given that the only transport you are using is inproc there's not much OS
> functionality involved so it shouldn't be that hard to track the source
> of the error down.
>
> My guess would be that it is generated by signaler_t class which
> contains a system socketpair on OSX platform. One of the OS functions
> called there is probably returning ETIMEDOUT for some reason.
>
> Unfortunately, I don't have a Mac so it's up to you to investigate.
Martin,
I see this ETIMEDOUT error quite a bit when my machine is under even a little load, so I agree that it's probably some OSX kernel resource running low or running out. (OSX is *not* a good choice for server workloads.)
Do you have any specific suggestions on which components of libzmq I should instrument? I can add some printfs...
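As a starting point, here is a rough sketch of the kind of instrumentation I have in mind. It is not libzmq code: logged_recv() and set_nonblocking() are names made up for illustration, and the assumption (following Martin's guess) is that the failing call is a recv() on the signaler's socketpair descriptor. The wrapper stays quiet for the expected "no data yet" case and prints any other errno, such as ETIMEDOUT, together with a label for the call site.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of libzmq: put a descriptor into
   non-blocking mode. Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper, not part of libzmq: recv() wrapper that retries
   on EINTR and reports any errno other than EAGAIN/EWOULDBLOCK to
   stderr. 'where' is a free-form label identifying the call site. */
ssize_t logged_recv(int fd, void *buf, size_t len, const char *where)
{
    ssize_t rc;

    assert(buf != NULL);
    assert(where != NULL);

    do {
        rc = recv(fd, buf, len, 0);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1 && errno != EAGAIN && errno != EWOULDBLOCK) {
        int saved = errno; /* fprintf may overwrite errno */
        fprintf(stderr, "%s: recv(fd=%d) failed: errno=%d (%s)\n",
                where, fd, saved, strerror(saved));
        errno = saved;     /* hand the original error back to the caller */
    }
    return rc;
}
```

Swapping a call like this in for the raw recv() should at least show whether the ETIMEDOUT originates on that descriptor or somewhere else.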
cr