[zeromq-dev] ZeroMq 2.1.7 "memory leak"
jess.morecroft at gmail.com
Tue Feb 21 14:49:28 CET 2012
We've been happily using ZeroMQ for a good few months now and are
impressed with its robustness and performance. We are, however, having
a problem with one of our servers "leaking" significant amounts of
memory over time and after a good few hours trawling the 0mq code I
think I might know why.
The "leaking" server acts as a single TCP port to ZeroMQ universe
proxy for (Internet) clients hooking in to our server applications. It
is the only server in our environment that creates and binds to ZeroMQ
(sub) sockets, but does not necessarily zmq_poll on them. If there are
no client subscriptions mapping to a particular sub socket, we do not
add the 0mq socket to the zmq_pollitem_t array when next calling
zmq_poll. Once at least one client subscription maps to the
sub socket, we rebuild the zmq_pollitem_t array to include the socket.
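
To make the poll-set rebuild described above concrete, here is a minimal sketch in plain C. It is not our actual code: proxy_socket and build_poll_set are hypothetical names, and the output is an array of bare socket handles rather than real zmq_pollitem_t entries, so that the example compiles without zmq.h. In the real proxy each selected handle would go into the socket field of a zmq_pollitem_t with events set to ZMQ_POLLIN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping record for the proxy; not a 0mq type. */
typedef struct {
    void  *socket;         /* would hold the 0mq sub socket handle */
    size_t subscriptions;  /* client subscriptions currently mapped to it */
} proxy_socket;

/*
 * Copy into `out` the handles of the sockets that have at least one
 * client subscription. `out` must have room for `n` entries.
 * Returns the number of entries written.
 */
size_t build_poll_set(const proxy_socket *all, size_t n, void **out)
{
    size_t used = 0;

    assert(all != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Sockets with no subscriptions are skipped, so they are never polled. */
        if (all[i].subscriptions > 0)
            out[used++] = all[i].socket;
    }
    return used;
}
```

Called with three sockets of which only one has subscriptions, build_poll_set writes a single handle and returns 1; the other two sockets are left out of the next poll entirely.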
The first case is where the problem occurs - even though we're not
subscribed to a single topic on a socket, 0mq appears to buffer
incoming messages indefinitely. It is only when the socket is next
included in a call to zmq_poll that 0mq will actually purge all
buffered non-matching (i.e. all) messages from the buffer.
In other words, 0mq appears to implement client-side filtering and
discarding of unwanted messages within the call to zmq_poll or
zmq_recv in the application thread, not in the 0mq I/O thread(s) as we
expected. The ironic side-effect of this, at least with our use of
"optimised" calls to zmq_poll, is that our proxy server leaks most
when no clients are connected!
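
The behaviour described above can be illustrated with a small toy model in plain C. This is only a sketch of how I read the 2.1.7 behaviour, not 0mq code: toy_sub_socket, toy_io_deliver and toy_app_drain are made-up names, and the fixed-size array is an assumption of the toy (in our server the backlog appeared to grow without bound). The point it shows is that the delivering side never looks at the subscription, and that filtering only happens when the application side drains the queue.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these names come from the 0mq API. */
#define TOY_MAX_PENDING 1024

typedef struct {
    const char *pending[TOY_MAX_PENDING]; /* messages queued by the "I/O thread" */
    size_t      count;                    /* current queue depth */
    const char *prefix;                   /* subscription prefix, NULL = none */
} toy_sub_socket;

/* "I/O thread" side: queue every incoming message, with no filtering. */
int toy_io_deliver(toy_sub_socket *s, const char *msg)
{
    assert(s != NULL && msg != NULL);
    if (s->count == TOY_MAX_PENDING)
        return -1; /* capacity limit exists only in this toy */
    s->pending[s->count++] = msg;
    return 0;
}

/*
 * "Application thread" side, standing in for zmq_poll/zmq_recv:
 * the subscription filter is applied here, and the backlog is purged.
 * Returns the number of messages that matched the subscription.
 */
size_t toy_app_drain(toy_sub_socket *s)
{
    size_t matched = 0;

    assert(s != NULL);
    for (size_t i = 0; i < s->count; i++) {
        if (s->prefix != NULL &&
            strncmp(s->pending[i], s->prefix, strlen(s->prefix)) == 0)
            matched++;
    }
    s->count = 0; /* everything, matching or not, leaves the queue only now */
    return matched;
}
```

With prefix left as NULL (no subscriptions), every call to toy_io_deliver still increases count, and nothing is released until toy_app_drain runs. That is the pattern we see when a socket is left out of zmq_poll.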
My problem now is how to correct this. I see a few options:
1. Always pass all sockets to zmq_poll
2. Set high water-marks, which appear to be enforced in the I/O thread(s)
3. Use ZeroMQ 3.1, which filters publisher-side and presumably
eliminates the scenario completely
None of these solutions is ideal. Option 1 involves a delicate rewrite
of some core messaging code and, if I'm being really picky, introduces
extra work (matching/purging) on our main application thread; option 2
opens up the potential for lost/discarded messages; and option 3 is not
possible at present given 3.1's beta status. I'm interested as to
whether anyone out there has experienced this kind of problem before
and has any alternative solutions for tackling things? My suspicion is
that a temporary solution of high water-marks, replaced by a proper
solution of using 0mq 3.1 once stable, is probably the way forward.
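
The trade-off behind the high water-mark option can be sketched with another toy model in plain C. Again this is not 0mq code: toy_queue and toy_enqueue are hypothetical names, and the toy simply discards at the limit. Where and how real 0mq blocks or discards depends on the socket type and version, which this does not attempt to model. In 0mq 2.x the limit itself is set through zmq_setsockopt with the ZMQ_HWM option.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: counters standing in for one socket's receive queue. */
typedef struct {
    size_t depth;   /* messages currently buffered */
    size_t hwm;     /* high water-mark; 0 means "no limit" in this toy */
    size_t dropped; /* messages discarded because the queue was full */
} toy_queue;

/* Account for one incoming message on a queue that nobody is draining. */
void toy_enqueue(toy_queue *q)
{
    assert(q != NULL);
    if (q->hwm != 0 && q->depth >= q->hwm)
        q->dropped++; /* memory stays bounded, but the message is lost */
    else
        q->depth++;   /* memory grows with every unread message */
}
```

Feeding 5000 messages into a queue with hwm set to 1000 leaves depth at 1000 and dropped at 4000, whereas the same traffic with hwm set to 0 leaves all 5000 buffered. The first bounds the leak at the cost of lost messages, which is the compromise I would be accepting as a stop-gap.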