Hi Bill,  

Could you use the socket monitor and poll the output socket of that for connection events?
http://api.zeromq.org/4-2:zmq-socket-monitor

Also, I love the idea of a middle-ground guide to zeromq guts. I have read the whitepaper you mentioned in the past, but it's limited. Also, things like this go out of date so quickly if someone is not dedicated (paid) to keep them updated.
I would actually quite enjoy getting back to basics and documenting the operation from an understanding of the code, but I fear the company I work for would not be too happy as the weeks roll into months :)

James 



On Thu, Aug 23, 2018 at 9:55 PM Bill Torpey <wallstprog@gmail.com> wrote:
One more thing…

I wonder if it is at all feasible to expand the definition of the events that zmq_poll returns to include connection-related events, in addition to just I/O events.

It could be very useful to get notified not only if there is data to be received on a socket, but also if a connect or disconnect event occurred on the socket — i.e., something like ZMQ_POLLCONN in flags.

For example, one of the problems with the workaround for #2267 (https://gist.github.com/hintjens/7344533) is that the code needs to know that it needs to call zmq_poll(sub…, but in the real world those two sockets are likely in different processes and there is no way to know. However, if zmq_poll returned on connection events, that could trigger calling zmq_poll on the affected socket(s).  (It looks like process_commands is only run on entry to zmq_poll, so it would presumably be sufficient simply to return from the zmq_poll and then call it again, as is typical usage).

Any thoughts from the community on this?  Is it a good idea?  Is it even possible?

Thanks in advance for any comments.


On Aug 23, 2018, at 4:33 PM, Bill Torpey <wallstprog@gmail.com> wrote:

Thanks, James!

That’s a very informative thread.  This whole business with process_commands and the way ZeroMQ handles resources seems to be a classic case of a “leaky abstraction”: i.e., it all “just works” — except when it doesn’t.

In my particular case, this problem turns out to be a bit of a “red herring”: it’s an edge case that was exposed in a specific test program, under an unusual set of conditions (e.g., peer processes connecting and disconnecting repeatedly, combined with code that only receives, but never sends, messages).  I wouldn’t expect this situation to come up in production; on the other hand, we need to understand what the behavior is under unusual conditions, and do something about it if that behavior can have negative consequences.  In any event, I’ve implemented a (presumably superfluous) workaround in my library code that avoids this problem by calling zmq_getsockopt, and it appears to work.  (Of course, for others this is not necessarily a red herring, but a real problem).

But it took a while, and a fair amount of effort, to understand what was going on here.  It would be nice if there was some middle ground between just treating ZeroMQ as a “black box” and stepping through the code line-by-line to figure out how it works.  So far, I haven’t seen anything like that, but if anyone in the community knows of any resources that might help peel back the cover a little bit, I would be very grateful for any recommendations.  (FWIW, the best I’ve found so far is http://zeromq.org/whitepapers:architecture, but that doesn’t address the whole process_commands business). 

Last but not least, ZeroMQ is amazing stuff, and I don’t mean to sound ungrateful to or critical of the smart people who built and maintain it, but it’s part of my job to beat the stuffing out of any software that the business is going to depend on and expose any problems. 

On Aug 23, 2018, at 10:58 AM, James Harvey <jamesdillonharvey@gmail.com> wrote:

As a side note, having some method to call process_commands while idle would also fix the memory usage issues encountered when using ZMQ_CONFLATE and not reading from the socket.


I added documentation recommending periodic calls to getsockopt with ZMQ_EVENTS, but that still requires work on the user's side.

On Thu, Aug 23, 2018 at 3:29 PM Bill Torpey <wallstprog@gmail.com> wrote:
I’m posting this here since not everyone on the list will necessarily see the Github issue, and I’m interested in getting as much feedback as possible.

The issue in question ( https://github.com/zeromq/libzmq/issues/3186 ) has to do with finding a good way to trigger process_commands on inactive sockets.  In our tests, we see real-time memory utilization steadily increase for processes that only subscribe to data when other processes connect and disconnect from their publisher sockets.  The root cause of the problem seems to be that the publisher sockets never get a chance to clean up if we never call zmq_send etc.

The Github issue goes into some detail on potential workarounds, along with their drawbacks.  I would very much appreciate any suggestions that the group may have on how to deal with this problem — I can’t believe that we’re the first to run into it.

Thanks in advance for any suggestions!

_______________________________________________
zeromq-dev mailing list
zeromq-dev@lists.zeromq.org
https://lists.zeromq.org/mailman/listinfo/zeromq-dev