[zeromq-dev] PUB-SUB works, server restarts, PUB-SUB fails for existing workers.

Robert G. Jakabosky bobby at sharedrealm.com
Tue Aug 30 14:23:22 CEST 2011


On Tuesday 30, Pieter Hintjens wrote:
> On Tue, Aug 30, 2011 at 5:49 AM, alotsof <alotsof at gmx.net> wrote:
> > What surprises me though is that I can reduce the problem to one of
> > order of execution:
> > 
> > - spawn worker first, setup 0MQ second: reconnection works.
> > - setup 0MQ first, spawn worker second: reconnection fails.

On POSIX, subprocess.Popen starts the worker with fork() followed by exec(), 
so unless the descriptors are closed in between (close_fds=True, which is not 
the default in Python 2), the child inherits the parent's set of open file 
descriptors.  This most likely means that the child process is keeping the 
server socket alive even when the parent closes the socket or dies.
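
To see the inheritance for yourself, here is a minimal sketch, assuming a 
POSIX system and Python 3.  A plain TCP socket stands in for the listening 
socket underneath the PUB socket, and the names spawn_worker and 
child_inherits_socket are made up for this example.  The set_inheritable(True) 
call in the usage note below is only there to mimic the older behaviour, since 
Python 3.4+ marks new descriptors non-inheritable by default.

```python
import subprocess
import sys

# Source of a tiny child program; exits 0 only if descriptor %d is an
# open socket in the child.
_PROBE = """
import os, stat, sys
try:
    mode = os.fstat(%d).st_mode
except OSError:
    sys.exit(1)
sys.exit(0 if stat.S_ISSOCK(mode) else 1)
"""

def child_inherits_socket(fd, close_fds):
    """Return True if a freshly spawned child still has socket `fd` open."""
    cmd = [sys.executable, "-c", _PROBE % fd]
    return subprocess.call(cmd, close_fds=close_fds) == 0

def spawn_worker(argv):
    """Start a worker without leaking the parent's descriptors into it."""
    # close_fds=True closes everything except stdin/stdout/stderr in the
    # child before exec.  It is the default on Python 3.2+, but has to be
    # requested explicitly on Python 2.
    return subprocess.Popen(argv, close_fds=True)
```

With a listening socket `server` on which server.set_inheritable(True) has 
been called, child_inherits_socket(server.fileno(), close_fds=False) should 
report the leak and the same call with close_fds=True should not.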

One way to test this is to check if the server socket is still in the LISTEN 
state and owned by the worker process.
Run: netstat -pln | grep tcp

Look for the socket bound to your server's tcp port.  You can also grep for 
your worker's pid.
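
If you would rather check from a test script, a rough equivalent is to try 
connecting to the port after the server has exited.  This is only a sketch: 
port_is_listening is a made-up helper, and unlike netstat -p it cannot tell 
you which process owns the socket, only that something still accepts 
connections on it.

```python
import socket

def port_is_listening(host, port, timeout=1.0):
    """Return True if a TCP connect to (host, port) is accepted."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    probe.settimeout(timeout)
    try:
        # connect_ex returns 0 on success and an errno value otherwise.
        return probe.connect_ex((host, port)) == 0
    finally:
        probe.close()
```

Call it with your server's address once the server process is gone, e.g. 
port_is_listening("127.0.0.1", 5556) where 5556 is a placeholder for your 
port.  A True result at that point suggests that another process, probably 
the worker, still holds the listening socket.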

> 
> To be honest, your test set-up is not clean. You should remove the
> "start worker" code from the publisher and start the two processes
> explicitly from a single shell script.

Do what Pieter said and start the worker from a shell script as a separate 
process (i.e. not a child of the server).

Or if you really need to start the workers from the server process, then use a 
wrapper script to daemonize (fork + execve) the worker process.
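
A rough sketch of what such a wrapper could do, written as a Python function 
rather than a separate script (POSIX only).  The name spawn_detached and the 
MAX_FD bound are assumptions for illustration.  Closing the inherited 
descriptors before the exec is the step that actually releases the server 
socket; the double fork just detaches the worker from the server.

```python
import os

MAX_FD = 1024  # assumed upper bound on inherited descriptors; adjust as needed

def spawn_detached(argv):
    """Launch argv as a detached process using double fork + exec."""
    pid = os.fork()
    if pid > 0:
        # Original process: reap the short-lived intermediate child.
        os.waitpid(pid, 0)
        return
    try:
        # Intermediate child: start a new session, then fork again so the
        # worker is not a direct child of the server.
        os.setsid()
        if os.fork() > 0:
            os._exit(0)
        # Grandchild: drop every inherited descriptor above stderr
        # (including the server's listening socket), then exec the worker.
        os.closerange(3, MAX_FD)
        os.execvp(argv[0], argv)
    finally:
        # Only reached if something above failed, e.g. exec raised.
        os._exit(127)
```

The server would then call something like spawn_detached(["python", 
"worker.py"]) instead of subprocess.Popen, where worker.py stands for your 
worker script.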

-- 
Robert G. Jakabosky


