[zeromq-dev] possible ZeroMQ bugs
Martin Sustrik
sustrik at 250bpm.com
Tue Jan 11 08:32:47 CET 2011
Hi Shannon,
I'm going to answer at the libzmq level; the pyzmq devs may elaborate
on interactions with Python.
> I think I may have found a few bugs, although I have a suspicion they
> may have been covered before. Hopefully I can point out at least a
> couple of new ones, though ;)
>
> My sample code is here: http://pastebin.com/4FC89sNc
>
> 1. In the code, if you replace tcp://127.0.0.1:7676 with
> http://127.0.0.1:7676 (i.e. if your fingers accidentally type "http"
> instead of "tcp"), you don't get an error.
I'm getting a "protocol not supported" error.
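
For anyone who wants to check this from Python, here is a minimal sketch. It assumes pyzmq is installed; the helper name try_bind is made up for illustration, and the exact behaviour may depend on the libzmq and pyzmq versions in use.

```python
import zmq


def try_bind(endpoint):
    """Bind a throwaway REP socket; return the 0MQ errno, or None on success.

    try_bind is a hypothetical helper, not part of pyzmq.
    """
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REP)
    try:
        sock.bind(endpoint)
        return None
    except zmq.ZMQError as exc:
        return exc.errno
    finally:
        # Discard anything pending so that term() cannot block.
        sock.close(linger=0)
        ctx.term()


if __name__ == "__main__":
    err = try_bind("http://127.0.0.1:7676")
    if err == zmq.EPROTONOSUPPORT:
        print("bind rejected: protocol not supported")
    else:
        print("unexpected result:", err)
```

If the bind really does succeed silently on some setup, the helper returns None, which would be worth reporting together with the library versions.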
> 2. In the code, if you replace tcp://127.0.0.1:7676 with
> tcp://localhost:7676, it says "No such device" which is kind of a
> mysterious error. "Hostnames are not allowed" might be better when
> you're dealing with tcp://.
The meaning of the error is: with bind, you bind to a network
interface, not a host.
The error is ENODEV. Have a look at the POSIX error codes. Feel free to
propose one that describes the problem better.
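
To browse the options, the standard errno module can print the platform's message for a few codes. This is only a sketch: apart from ENODEV, the candidates listed are assumptions chosen for illustration, not codes proposed in this thread, and the message text varies between operating systems.

```python
import errno
import os

# ENODEV is what libzmq returns today; the other names are illustrative
# assumptions, not suggestions made in this thread.
CANDIDATES = ["ENODEV", "EADDRNOTAVAIL", "EINVAL", "ENXIO"]


def describe(names):
    """Map each errno name to the platform's human-readable message."""
    return {name: os.strerror(getattr(errno, name)) for name in names}


if __name__ == "__main__":
    for name, text in describe(CANDIDATES).items():
        print(f"{name:15s} {text}")
```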
> 3. If you run the code as is, it'll hang forever in context.term(). I
> think it has to do with the fact that I have two threads. However,
> the code is very simple, and by the time context.term() is run, one
> thread has joined with the other. I don't see any reason why the code
> should hang.
To avoid the hang you have to:
1. Close all the sockets prior to calling zmq_term().
2. Set the ZMQ_LINGER socket option to 0 on all sockets to prevent 0MQ
from trying to send the pending data on zmq_term().
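
In pyzmq terms, the two steps above might look like the following sketch. It assumes pyzmq is installed; the function name send_and_shutdown, the PUSH socket type and the endpoint are illustrative choices, not taken from the pastebin code.

```python
import zmq


def send_and_shutdown(endpoint="tcp://127.0.0.1:7676"):
    """Queue one message, then shut down without waiting for delivery.

    send_and_shutdown is a hypothetical name used for this example.
    """
    ctx = zmq.Context()
    sock = ctx.socket(zmq.PUSH)
    # Step 2: with LINGER set to 0, pending messages are dropped on close.
    sock.setsockopt(zmq.LINGER, 0)
    sock.connect(endpoint)
    try:
        # Nobody may be listening; do not block if the message cannot be queued.
        sock.send(b"hello", zmq.DONTWAIT)
    except zmq.Again:
        pass
    # Step 1: close every socket before terminating the context.
    sock.close()
    ctx.term()  # should return promptly, since nothing lingers
    return True


if __name__ == "__main__":
    send_and_shutdown()
    print("context terminated")
```

Note that setting LINGER to 0 discards undelivered messages, so it is only appropriate when losing them at shutdown is acceptable.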
> By the way, it hangs hard--you have to kill -9 it.
That's interesting. Possibly one more strange interaction with the
Python VM.
> 4. In general, the code doesn't respond well to keyboard interrupts.
> For instance, even if you comment out all the lines that start with
> thread in my code sample, hitting Ctrl-C doesn't really do the right
> thing. I've seen it fail in 3 different ways, but I've never seen it
> raise a KeyboardInterrupt and exit the program. I'm just going to
> take a guess and say that it may be necessary to mark ZeroMQ's thread
> as a daemon thread, but I haven't looked at the code.
There has been extensive discussion about that one. The conclusion is
that the POSIX signal system is fundamentally broken and 100% correct
handling of SIGINT is *impossible*.
Still, individual cases can be addressed so that things behave
correctly most of the time. Please do report the problems you have seen.
Martin