[zeromq-dev] Resource temporarily unavailable
john skaller
skaller at users.sourceforge.net
Thu Jan 19 01:58:11 CET 2012
On 19/01/2012, at 5:35 AM, Chuck Remes wrote:
>>
>> Yes, I did that because I got EAGAIN. If I take out the loop on EAGAIN,
>> I get .. well I get EAGAIN (code 35 on OSX).
>
> It doesn't matter if you are using a REQ socket for blocking or non-blocking writes. For that socket type you must adhere to a strict send/recv/send/recv pattern. Don't do that.
Ok..
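
To make the send/recv/send/recv rule concrete, here is a minimal sketch in
plain C of the lockstep a REQ socket enforces. It is only a model, not libzmq
code: req_model, req_model_send, req_model_recv and the two return codes are
names made up for illustration. As far as the libzmq documentation goes, a real
REQ socket reports an out-of-order call with EFSM rather than EAGAIN, which
would suggest the EAGAIN seen here has some other cause.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the REQ lockstep rule; this is NOT libzmq. */
typedef struct {
    bool awaiting_reply;   /* true after a send, until the matching recv */
} req_model;

/* Illustrative return codes (assumed names, not libzmq constants). */
enum { REQ_OK = 0, REQ_BAD_STATE = -1 };

/* A request may only be sent when no reply is outstanding. */
int req_model_send(req_model *s)
{
    assert(s != NULL);
    if (s->awaiting_reply)
        return REQ_BAD_STATE;      /* send, send: pattern violated */
    s->awaiting_reply = true;
    return REQ_OK;
}

/* A reply may only be received after a request has been sent. */
int req_model_recv(req_model *s)
{
    assert(s != NULL);
    if (!s->awaiting_reply)
        return REQ_BAD_STATE;      /* recv without a prior send */
    s->awaiting_reply = false;
    return REQ_OK;
}
```

Starting from `req_model s = { false };`, a send followed by a recv succeeds,
while two sends in a row make the second one return REQ_BAD_STATE.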
>> Note: I only get this problem when the client sends the message
>> to the server, so the server IS reading the message .. well,
>> it's doing something in response to the message from the client.
>
> May I assume the server has connected via a REP socket?
The code is written in Felix, and it is intended to be the Felix
version of the Hello World example:
hwclient/hwserver
documented in the zguide. The Felix compiler generates C++, so I can
inspect the generated C++ code (to check that my binding is doing the "right thing").
It looks good to me: i.e. the zmq binding is right, and so is the use of it.
I'm hoping I'm wrong about that, because the alternative is a bug in the Felix
compiler or Felix run time system causing a corruption, and that will be
extremely hard to track down!
This happened once before, when integrating Google's RE2 regex library: the
problem turned out to be a missing "hint" to the garbage collector in the
library binding. That one took almost a year to find, because the problem only
occurred when enough allocations had happened to trigger the GC, and none of my
regression tests both do that AND use RE2.
> This is a fairly common error. You might want to scan the guide again... don't worry, we've all had to read it 3 or 4 times before it sank in. :)
As above: the problem is that I'm actually *implementing* the guide examples :)
The loop on EAGAIN was only added after I got the "Resource temporarily unavailable"
message (EAGAIN) and the correct behaviour for that is to retry AFAIK...
If I should not retry, ZMQ should not issue that error code.
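
For reference, a bounded version of such a retry loop can be sketched in plain
C as below. This is an illustration under assumptions, not the Felix or zguide
code: retry_on_eagain, try_op and flaky_send are hypothetical names, and
flaky_send merely stands in for a non-blocking send that fails with EAGAIN a
few times. The symbolic constant is used because the numeric value is platform
specific (35 on OS X, 11 on Linux).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation type: returns 0 on success,
   or -1 with errno set on failure. */
typedef int (*try_op)(void *ctx);

/* Call op until it succeeds, fails with something other than EAGAIN,
   or max_attempts calls have been made. Returns the last result of op;
   errno is left as set by the last failing call. */
int retry_on_eagain(try_op op, void *ctx, int max_attempts, int *attempts_used)
{
    assert(op != NULL && max_attempts > 0);
    int rc = -1;
    int n = 0;
    while (n < max_attempts) {
        n++;
        rc = op(ctx);
        if (rc == 0 || errno != EAGAIN)
            break;                 /* success, or an error not worth retrying */
    }
    if (attempts_used != NULL)
        *attempts_used = n;
    return rc;
}

/* Stand-in for a non-blocking send: fails with EAGAIN until the
   counter pointed to by ctx reaches zero, then succeeds. */
int flaky_send(void *ctx)
{
    int *failures_left = (int *)ctx;
    if (*failures_left > 0) {
        (*failures_left)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}
```

With `int failures = 2;`, calling `retry_on_eagain(flaky_send, &failures, 5, &used)`
succeeds on the third attempt. A real loop would normally wait or poll between
attempts instead of spinning, and would give up after the bound rather than
retry forever.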
The C version of this code (from the zguide) works fine, so there is a problem
somewhere in the Felix-generated code. It is not impossible that there is a
memory corruption and the error code is a spurious (and lucky) side effect of it.
--
john skaller
skaller at users.sourceforge.net