[zeromq-dev] (py)zmq, fork and cloexec

Christian Heimes lists at cheimes.de
Tue Jun 19 16:06:36 CEST 2012


Hello all,

I've done some tests with pyzmq to check whether ZMQ sets the CLOEXEC flag on
its file descriptors. During the tests I ran into several issues that
might be bugs in ZMQ or pyzmq.

My setup
  OS: Ubuntu 12.04 X86_64
  zmq: 2.2.1
  Python: 2.7.3 64bit
  pyzmq: 2.1.11

Python, zmq and pyzmq are self-compiled with the usual options.

=========================
#1 pyzmq + fork can crash
=========================

At first I tested how ZMQ handles fork(). Even the simple case with just
a context object raises an assertion in ZMQ. When I don't call
context.term() explicitly and pyzmq's Context.__del__() method runs
zmq_term() for me, the script also crashes with a core dump.


--- Script 1 ---
import zmq
import os

context = zmq.Context()
print os.fork(), os.getpid()
context.term()
---


Output with context.term():
$ python testzmq.py
622 619
0 622
Assertion failed: ok (mailbox.cpp:84)


Output without context.term():
$ python testzmq.py
777 773 15
0 777 15
Assertion failed: ok (mailbox.cpp:84)
Abgebrochen (Speicherabzug geschrieben)

The last message is German for "Aborted (core dumped)".


GDB output:
(gdb) run testzmq.py
...
(gdb) bt
#0  0x00007ffff745e445 in __GI_raise (sig=<optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#1  0x00007ffff7461bab in __GI_abort () at abort.c:91
#2  0x00007ffff608ce4a in zmq::zmq_abort(char const*) () from /opt/vls/lib/libzmq.so.1
#3  0x00007ffff609061d in zmq::mailbox_t::recv(zmq::command_t*, int) () from /opt/vls/lib/libzmq.so.1
#4  0x00007ffff609c1fc in zmq::reaper_t::in_event() () from /opt/vls/lib/libzmq.so.1
#5  0x00007ffff608c036 in zmq::epoll_t::loop() () from /opt/vls/lib/libzmq.so.1
#6  0x00007ffff608c104 in zmq::epoll_t::worker_routine(void*) () from /opt/vls/lib/libzmq.so.1
#7  0x00007ffff60a702a in thread_routine () from /opt/vls/lib/libzmq.so.1
#8  0x00007ffff77ece9a in start_thread (arg=0x7ffff2a88700) at pthread_create.c:308
#9  0x00007ffff751a4bd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#10 0x0000000000000000 in ?? ()
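
I have not identified the cause of the assertion. One general fact that
is relevant to any library with background threads: fork() copies only
the thread that calls it, while the file descriptors stay shared between
parent and child. The sketch below shows just that fact with the standard
library. The worker thread merely stands in for ZMQ's I/O and reaper
threads, and thread_counts_across_fork is a made-up name, not part of
pyzmq.

```python
import os
import threading


def thread_counts_across_fork():
    """Return (threads in parent, threads in child) around a fork()."""
    stop = threading.Event()
    worker = threading.Thread(target=stop.wait)  # stand-in for an I/O thread
    worker.start()

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report how many threads exist here, then leave at once.
        try:
            os.close(read_fd)
            os.write(write_fd, str(threading.active_count()).encode())
        finally:
            os._exit(0)  # skip interpreter cleanup in the child

    # Parent: read the child's report and reap it.
    os.close(write_fd)
    child_count = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)

    parent_count = threading.active_count()
    stop.set()
    worker.join()
    return parent_count, child_count
```

The child sees only its main thread, although it inherited every
descriptor the parent's threads were using.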


==========================
#2 ZMQ doesn't set CLOEXEC
==========================

It seems to me that ZMQ doesn't set CLOEXEC on most of its file
descriptors. I did a quick grep over the C++ code and could just find
SOCK_CLOEXEC in src/ip.cpp.
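
The flag can also be checked at runtime instead of by grep. A minimal
sketch, assuming Linux (it reads /proc/self/fd); has_cloexec and
fds_without_cloexec are hypothetical helper names, while fcntl.F_GETFD
and fcntl.FD_CLOEXEC are the standard library constants:

```python
import fcntl
import os


def has_cloexec(fd):
    """Return True if FD_CLOEXEC is set on the descriptor."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    return bool(flags & fcntl.FD_CLOEXEC)


def fds_without_cloexec():
    """List open descriptors of this process that would survive exec()."""
    leaked = []
    for name in os.listdir("/proc/self/fd"):
        fd = int(name)
        try:
            if not has_cloexec(fd):
                leaked.append(fd)
        except (IOError, OSError):
            # The descriptor listdir() used for the directory is closed by now.
            continue
    return sorted(leaked)
```

Calling fds_without_cloexec() before and after zmq.Context() would show
which of the new descriptors lack the flag.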

ZMQ already makes extensive use of file descriptors. Some may even say
it creates FDs like crazy. Without CLOEXEC all child processes inherit
the FDs from their parent process, which increases the total number of
FDs in the system and decreases the number of free FD slots in the child
process. This is an issue for processes that fork()+execv() children and
use ZMQ to communicate with them.

It may even be a security issue when the parent process is privileged,
because a child process may be able to tamper directly with the epoll
fds and pipes of the parent's ZMQ context.

This script shows that the FD count is 15 in both the parent and the
child (3 for std streams, 1 proc auxv, 2 eventpoll for ZMQ, 8 unix
sockets for ZMQ and 1 unknown).

--- Script 2 ---
import zmq
import sys
import os
import subprocess

context = zmq.Context()

print "parent", len(os.listdir("/proc/%i/fd" % os.getpid()))

# Code run by the child: count its own open file descriptors.
child_code = 'import os; print "child", len(os.listdir("/proc/%i/fd" % os.getpid()))'

print subprocess.check_output([sys.executable, "-c", child_code])
---

output:
parent 15
child 15
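
For the fork()+exec case a caller can work around the inherited
descriptors, without any change to ZMQ, by asking subprocess to close
them. A minimal sketch, assuming Linux (/proc/self/fd) and using a
hypothetical helper name child_fd_count. close_fds is a real subprocess
parameter; it defaults to False on Python 2.7 and to True on Python 3.2
and later.

```python
import os
import subprocess
import sys

# Code run by the child: print how many descriptors it has open.
CHILD_CODE = 'import os; print(len(os.listdir("/proc/self/fd")))'


def child_fd_count(close_fds):
    """Spawn a Python child and return the number of FDs it reports."""
    out = subprocess.check_output([sys.executable, "-c", CHILD_CODE],
                                  close_fds=close_fds)
    return int(out.decode().strip())
```

Comparing child_fd_count(False) with child_fd_count(True) after creating
the context should show whether the ZMQ descriptors reach the child.
This only helps for children started through subprocess; it does not
change what a plain os.fork() child sees.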

Christian


