[zeromq-dev] duped sockets and fork

Selim Ciraci ciraci at gmail.com
Fri Sep 13 22:43:00 CEST 2013


Hi Matt,

Thanks for your reply. I actually found out about your patch after sending
the email. I have updated zmq to head from github and tried it with my
program. The parent sockets seem to have been closed. The problem is that
every now and then I get "no route to host" errors in zmq_send. This usually
happens in the following sequence:
1. The parent forks a child. The child calls zmq_term(parent_context), does
   its work and then terminates (closes its context).
2. The parent, in parallel, uses parent_context, does its work, learns that
   the child has terminated, and forks a new child, child2.
3. child2 calls zmq_term(parent_context), does its work and then terminates
   (closes its context).
4. After child2 terminates, the parent cannot receive messages. Even though
   the parent is active, zmq_send in the server fails with "no route to
   host".

I have no idea why this fails. Any ideas what might be causing this?

Best,
Selim Ciraci



On Fri, Sep 13, 2013 at 1:32 PM, Matt Connolly <matt.connolly at me.com> wrote:

> Hi Selim,
>
> I have a similar (although simpler) scenario in mind. I discovered that
> closing, in a forked child, a zmq context that was inherited from the
> parent would actually crash the parent (assert).
>
> What version of ZeroMQ are you using? If you’re using the head, I have
> added some code to allow this context to be terminated without causing an
> assert. There is a test use case in the file tests/test_fork.cpp.
>
> If not, can you try the development master from github and let me know if
> this helps?
>
> Regards,
> Matt
>
>
> On 12 Sep 2013, at 7:29 pm, Selim Ciraci <ciraci at gmail.com> wrote:
>
> > Hi,
> >
> > We have a client/server program. The server is a separate process that is
> > started independently of the client. The client, on the other hand, forks
> > workers from time to time (we have to use fork because we are
> > parallelizing a huge piece of code that is not thread safe at all). After
> > forking, the initial client also does some work and then checks on its
> > children.
> > The child processes, immediately after forking, create a context and open
> > a connection to the server.
> > The children themselves can fork new processes. These 'child-child'
> > processes also create a context and open connections to the server.
> > Now the problem is that fork carries the parent's sockets over to the
> > children. This means that a child-child-child-child... process can end up
> > with so many open sockets that opening more is not possible. In fact, in
> > this case Zmq aborts with too many open sockets in signaler.cpp:230.
> > The question is: how can I close the parent's sockets safely from the
> > child? (I cannot use zmq_ctx_destroy with the parent context.)
> > I read about certain 'hacks' that read the fds from /proc and close them
> > one by one. The problem with this solution is that I don't know which fds
> > are sockets, so I might end up closing an fd used for file operations.
> > Zmq sockets are void*. Is there a way to get the low-level descriptor
> > from these void* sockets?
> > Any suggestions to fix this problem? I'm stuck with fork and I cannot
> > change the architecture.
> >
> > Thanks!
> > Selim Ciraci
> >
> > _______________________________________________
> > zeromq-dev mailing list
> > zeromq-dev at lists.zeromq.org
> > http://lists.zeromq.org/mailman/listinfo/zeromq-dev
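
On the question in the original post of telling sockets apart from other
descriptors found under /proc: fstat() reports the file type of a
descriptor, so S_ISSOCK can separate sockets from regular files and pipes.
The following is a minimal sketch for Linux, assuming /proc is mounted; the
function names fd_is_socket, fd_socket_type and list_socket_fds are
hypothetical. It only classifies descriptors and does not close anything.
Two caveats: it cannot tell which sockets belong to libzmq, and libzmq may
also hold descriptors that are not sockets at all (for example eventfd or
epoll descriptors on Linux), which this check would not match. The
descriptor returned by zmq_getsockopt with ZMQ_FD is the one used to signal
socket events, not the underlying transport socket.

```c
#define _POSIX_C_SOURCE 200809L   /* for S_ISSOCK and dirfd under strict -std modes */

#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Returns 1 if fd refers to a socket, 0 if it refers to something else
 * (regular file, directory, pipe, tty, ...), and -1 if fstat() fails,
 * for example because fd is not open. */
int fd_is_socket(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    return S_ISSOCK(st.st_mode) ? 1 : 0;
}

/* Returns the socket type (SOCK_STREAM, SOCK_DGRAM, ...) of fd,
 * or -1 if fd is not a socket or is not open. */
int fd_socket_type(int fd)
{
    int type = 0;
    socklen_t len = sizeof type;
    if (getsockopt(fd, SOL_SOCKET, SO_TYPE, &type, &len) != 0)
        return -1;
    return type;
}

/* Stores up to `max` open descriptors of the calling process that are
 * sockets into `out`. Linux-specific: walks /proc/self/fd.
 * Returns the number stored, or -1 if /proc/self/fd cannot be opened. */
int list_socket_fds(int *out, int max)
{
    assert(out != NULL && max >= 0);
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL)
        return -1;

    int n = 0;
    struct dirent *entry;
    while (n < max && (entry = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(entry->d_name, &end, 10);
        if (end == entry->d_name || *end != '\0')
            continue;                     /* skips "." and ".." */
        if (fd_is_socket((int)fd) == 1)   /* the scan's own fd is a directory */
            out[n++] = (int)fd;
    }
    closedir(dir);
    return n;
}
```

A child could call list_socket_fds right after fork() to see which inherited
descriptors are sockets before deciding what to do with them; descriptors
used for file operations are never reported.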