[zeromq-dev] Interrupted System Calls, EINTR, and more about POSIX signals than I ever wanted to know

Michael Compton michael.compton at littleedge.co.uk
Thu Mar 10 20:28:09 CET 2011


Hi all,

Sorry for the long subject name, I'm going to try and keep this short
because it gives me a headache, hehe.

This is somewhat connected to the following from the mailing list:
http://lists.zeromq.org/pipermail/zeromq-dev/2010-September/005822.html


While developing clrzmq2 and using it on POSIX platforms with Mono,
I have come across some bothersome behaviour with blocking syscalls and
signals. Quite frequently the blocking calls, such as Recv and Poll,
would return -1 with errno 4 (EINTR, "Interrupted system call").

The likely culprit for these signals is the Mono GC, though it installs
handlers to catch them. According to what I have learnt, having a
handler does not help: a blocking syscall that is interrupted by a
signal is aborted, the handler runs, and the call then returns -1 with
errno set to EINTR.

"When a system call is slow and a signal arrives while it was blocked,
waiting for something, the call is aborted and returns -EINTR, so that
the library function will return -1 and set errno to EINTR. Just before
the system call returns, the user program's signal handler is called."

Source: section 4.5 of http://www.win.tue.nl/~aeb/linux/lk/lk-4.html

But the story gets more interesting. There is an SA_RESTART flag which
a handler can be installed with so that the syscall is restarted after
the signal is handled, and incidentally the Mono GC sets it. The
problem is that SA_RESTART does not restart every syscall and is at
best a hint. Which syscalls it restarts is platform-specific and not
easy to find out; on Linux, signal(7) lists them, and poll and select,
for example, are never restarted.

So now I am left with the dilemma of what to do with these interrupted
syscalls: do I handle them in the clrzmq2 binding by attempting the
call again, or do I let the error bubble up to the user to deal with?

While Mono is likely to cause this more often than would be expected
in other language environments, I imagine the situation is not unique
to that runtime.

All in all, I had to learn way more about syscalls and signals than I
had ever expected to.

I am open to correction on any of this and to suggestions on how best
to deal with the situation.

Cheers,
Michael



