That’s a very informative thread. This whole business with process_commands and the way ZeroMQ handles resources seems to be a classic case of a “leaky abstraction”: i.e., it all “just works” — except when it doesn’t.
In my particular case, this problem turns out to be a bit of a “red herring”: it’s an edge case that was exposed by a specific test program under an unusual set of conditions (e.g., peer processes connecting and disconnecting repeatedly, combined with code that only receives, but never sends, messages). I wouldn’t expect this situation to come up in production. On the other hand, we need to understand what the behavior is under unusual conditions, and do something about it if that behavior can have negative consequences. In any event, I’ve implemented a (presumably superfluous) workaround in my library code that avoids the problem by calling zmq_getsockopt, and it appears to work. (Of course, for others this is not necessarily a red herring, but a real problem.)
But it took a while, and a fair amount of effort, to understand what was going on here. It would be nice if there were some middle ground between just treating ZeroMQ as a “black box” and stepping through the code line-by-line to figure out how it works. So far, I haven’t seen anything like that, but if anyone in the community knows of any resources that might help peel back the cover a little bit, I would be very grateful for any recommendations. (FWIW, the best I’ve found so far is http://zeromq.org/whitepapers:architecture, but that doesn’t address the whole process_commands business.)
Last but not least, ZeroMQ is amazing stuff, and I don’t mean to sound ungrateful to or critical of the smart people who built and maintain it, but it’s part of my job to beat the stuffing out of any software that the business is going to depend on and expose any problems.
As a side note, having some method to call process_commands while idle would also fix the memory usage issues encountered when using ZMQ_CONFLATE and not reading from the socket.
I added documentation recommending a periodic call to getsockopt with ZMQ_EVENTS, but that still requires work on the user’s side.
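For anyone wondering what that periodic call might look like in application code, here is a minimal sketch. It assumes a pyzmq-style socket object whose getsockopt(option) method mirrors zmq_getsockopt. The IdleSocketPoker class, its maybe_poke method, and the one-second default interval are hypothetical names and values chosen for illustration; none of them are part of libzmq or pyzmq. The only ZeroMQ-specific detail is the ZMQ_EVENTS option itself.

```python
import time

# Numeric value of ZMQ_EVENTS in libzmq's zmq.h; pyzmq exposes it as zmq.EVENTS.
ZMQ_EVENTS = 15


class IdleSocketPoker:
    """Periodically query ZMQ_EVENTS on sockets that would otherwise sit idle.

    Hypothetical helper, not part of libzmq or pyzmq. Each socket only needs
    a getsockopt(option) method, as pyzmq sockets provide.
    """

    def __init__(self, sockets, interval_s=1.0, clock=time.monotonic):
        self._sockets = list(sockets)
        self._interval_s = interval_s
        self._clock = clock
        self._last_poke = None  # None means "never poked yet"

    def maybe_poke(self):
        """Poke every socket if the interval has elapsed; return how many were poked."""
        now = self._clock()
        if self._last_poke is not None and now - self._last_poke < self._interval_s:
            return 0
        self._last_poke = now
        for sock in self._sockets:
            # The returned event mask is discarded; the call is made only so
            # that the socket gets a chance to process its pending commands.
            sock.getsockopt(ZMQ_EVENTS)
        return len(self._sockets)
```

The idea would be to call maybe_poke() from the application's existing loop, in the same thread that owns the sockets, since ZeroMQ sockets are not thread-safe. Whether one second is a sensible interval depends on how quickly peers connect and disconnect, so treat it as a placeholder.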
I’m posting this here since not everyone on the list will necessarily see the GitHub issue, and I’m interested in getting as much feedback as possible.
The issue in question ( https://github.com/zeromq/libzmq/issues/3186 ) has to do with finding a good way to trigger process_commands on inactive sockets. In our tests, we see real-time memory utilization steadily increase for processes that only subscribe to data when other processes connect and disconnect from their publisher sockets. The root cause of the problem seems to be that the publisher sockets never get a chance to clean up if we never call zmq_send etc.
The GitHub issue goes into some detail on potential workarounds, along with their drawbacks. I would very much appreciate any suggestions that the group may have on how to deal with this problem. I can’t believe that we’re the first to run into it.
Thanks in advance for any suggestions!