[zeromq-dev] (py)zmq, fork and cloexec
Min RK
benjaminrk at gmail.com
Tue Jun 19 20:17:41 CEST 2012
On Jun 19, 2012, at 7:06 AM, Christian Heimes <lists at cheimes.de> wrote:
> Hello all,
>
> I've done some tests with pyzmq to check if ZMQ sets the CLOEXEC flag on
> its file descriptors. During the tests I ran into several issues that
> might be bugs in ZMQ or pyzmq.
>
> My setup
> OS: Ubuntu 12.04 X86_64
> zmq: 2.2.1
> Python: 2.7.3 64bit
> pyzmq: 2.1.11
>
> Python, zmq and pyzmq are self-compiled with the usual options.
>
> =========================
> #1 pyzmq + fork can crash
> =========================
>
> At first I tested how ZMQ handles fork(). Even the simple case with just
> a context object raises an assertion in ZMQ. When I don't call
> context.term() explicitly and pyzmq's Context.__del__() method runs
> zmq_term() for me, the script aborts with a core dump.
This is actually due to pyzmq calling close/term on gc, and was fixed in pyzmq master just this week.
-MinRK
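
One way to keep a forked child from tearing down an object it merely
inherited is to remember which process created it and to skip teardown
everywhere else. The sketch below shows that idea in Python 3 with the
standard library only. ForkGuardedResource and child_ran_teardown are
hypothetical names used for illustration, and this is not claimed to be
the change that went into pyzmq master.

```python
import os


class ForkGuardedResource(object):
    """Hypothetical wrapper: runs teardown only in the creating process."""

    def __init__(self, teardown):
        self._teardown = teardown
        self._owner_pid = os.getpid()  # process that created the resource
        self.closed = False

    def close(self):
        """Return True if teardown actually ran in this process."""
        if self.closed:
            return False
        if os.getpid() != self._owner_pid:
            # Forked child: leave the parent's resource alone.
            return False
        self._teardown()
        self.closed = True
        return True


def child_ran_teardown(resource):
    """Fork, call close() in the child, report whether teardown ran there."""
    pid = os.fork()
    if pid == 0:
        ran = resource.close()
        os._exit(1 if ran else 0)  # skip interpreter cleanup in the child
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 1


calls = []
res = ForkGuardedResource(lambda: calls.append("term"))
print(child_ran_teardown(res))  # the child skips teardown
print(res.close())              # the owning process runs it
```

The child leaves through os._exit() so that no garbage-collection or
atexit cleanup runs there at all; only the owning process reaches the
teardown callback.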
>
>
> --- Script 1 ---
> import zmq
> import os
>
> context = zmq.Context()
> print os.fork(), os.getpid()
> context.term()
> ---
>
>
> Output with context.term():
> $ python testzmq.py
> 622 619
> 0 622
> Assertion failed: ok (mailbox.cpp:84)
>
>
> Output without context.term():
> $ python testzmq.py
> 777 773 15
> 0 777 15
> Assertion failed: ok (mailbox.cpp:84)
> Abgebrochen (Speicherabzug geschrieben)
>
> The last message is German for "Aborted (core dumped)".
>
>
> GDB output:
> (gdb) run testzmq.py
> ...
> (gdb) bt
> #0 0x00007ffff745e445 in __GI_raise (sig=<optimized out>) at
> ../nptl/sysdeps/unix/sysv/linux/raise.c:64
> #1 0x00007ffff7461bab in __GI_abort () at abort.c:91
> #2 0x00007ffff608ce4a in zmq::zmq_abort(char const*) () from
> /opt/vls/lib/libzmq.so.1
> #3 0x00007ffff609061d in zmq::mailbox_t::recv(zmq::command_t*, int) ()
> from /opt/vls/lib/libzmq.so.1
> #4 0x00007ffff609c1fc in zmq::reaper_t::in_event() () from
> /opt/vls/lib/libzmq.so.1
> #5 0x00007ffff608c036 in zmq::epoll_t::loop() () from
> /opt/vls/lib/libzmq.so.1
> #6 0x00007ffff608c104 in zmq::epoll_t::worker_routine(void*) () from
> /opt/vls/lib/libzmq.so.1
> #7 0x00007ffff60a702a in thread_routine () from /opt/vls/lib/libzmq.so.1
> #8 0x00007ffff77ece9a in start_thread (arg=0x7ffff2a88700) at
> pthread_create.c:308
> #9 0x00007ffff751a4bd in clone () at
> ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #10 0x0000000000000000 in ?? ()
>
>
> ==========================
> #2 ZMQ doesn't set CLOEXEC
> ==========================
>
> It seems to me that ZMQ doesn't set CLOEXEC on most of its file
> descriptors. I did a quick grep over the C++ code and could just find
> SOCK_CLOEXEC in src/ip.cpp.
>
> ZMQ already makes extensive use of file descriptors. Some may even say
> it creates FDs like crazy. Without CLOEXEC all child processes
> inherit the FDs from their parent process, therefore increasing the
> total number of FDs in the system and decreasing the number of free FD
> slots in the child process. This is an issue for processes that
> fork()+execv() children and use ZMQ to communicate with them.
>
> It may even be a security issue when the parent process is privileged,
> because a child process may be able to tamper directly with the epoll
> fds and pipes of the parent's ZMQ context.
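
For a descriptor that was not opened with a *_CLOEXEC flag, the flag can
still be set afterwards with fcntl(). A minimal sketch in Python 3,
assuming a plain pipe stands in for one of the ZMQ descriptors;
set_cloexec and has_cloexec are hypothetical helpers, not part of ZMQ
or pyzmq.

```python
import fcntl
import os


def set_cloexec(fd):
    """Set FD_CLOEXEC on fd, preserving any other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def has_cloexec(fd):
    """Return True if fd will be closed automatically on exec."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


r, w = os.pipe()
# Python 3.4+ creates descriptors non-inheritable by default. Undo that
# here to mimic a descriptor that was created without CLOEXEC.
os.set_inheritable(r, True)
print(has_cloexec(r))

set_cloexec(r)
print(has_cloexec(r))

os.close(r)
os.close(w)
```

Setting the flag in a second step leaves a short window in which another
thread can fork and exec with the descriptor still inheritable, which is
why atomic creation flags such as SOCK_CLOEXEC exist.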
>
> This script shows that the parent's and the child's FD counts are both
> 15 (3 for std streams, 1 proc auxv, 2 eventpoll for ZMQ, 8 unix sockets
> for ZMQ and 1 unknown).
>
> --- script 2 ---
> import zmq
> import sys
> import os
> import subprocess
>
> context = zmq.Context()
>
> print "parent", len(os.listdir("/proc/%i/fd" % os.getpid()))
>
> print subprocess.check_output(
>     [sys.executable, "-c",
>      """import os; print "child", len(os.listdir("/proc/%i/fd" % os.getpid()))"""])
> ---
>
> output:
> parent 15
> child 15
>
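
The effect of the flag across exec can be checked without ZMQ. The
sketch below, written for Python 3, hands one pipe descriptor to a child
interpreter twice: once inheritable and once with CLOEXEC set. CHILD_CODE
and pipe_state_in_child are hypothetical names for illustration. Python
3's subprocess closes inherited descriptors by default, so the sketch
passes close_fds=False to reproduce the Python 2.7 behaviour seen in
script 2.

```python
import os
import subprocess
import sys

# Runs in the child: report whether the given fd number is an open pipe.
CHILD_CODE = """
import os, stat, sys
try:
    mode = os.fstat(int(sys.argv[1])).st_mode
    print("open" if stat.S_ISFIFO(mode) else "closed")
except OSError:
    print("closed")
"""


def pipe_state_in_child(fd):
    """Spawn a child via fork+exec and ask it whether fd survived."""
    out = subprocess.check_output(
        [sys.executable, "-c", CHILD_CODE, str(fd)],
        close_fds=False,  # let the kernel's CLOEXEC flag decide
    )
    return out.decode().strip()


r, w = os.pipe()

os.set_inheritable(r, True)    # clears FD_CLOEXEC
print("without CLOEXEC:", pipe_state_in_child(r))

os.set_inheritable(r, False)   # sets FD_CLOEXEC
print("with CLOEXEC:", pipe_state_in_child(r))

os.close(r)
os.close(w)
```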
> Christian