[zeromq-dev] Visibility into pipes of a socket

Stuart Brandt stu at compuserve.com
Mon Sep 24 21:04:03 CEST 2012


Thanks for the response.  Comments inline....

On 9/21/12 12:05 AM, Pieter Hintjens wrote:
> On Thu, Sep 20, 2012 at 9:40 PM, Stuart Brandt <stu at compuserve.com> wrote:
>
>> All thoughts welcome!
> Very roughly, trying to do this kind of thing is why it takes people
> months to write even basic protocols over TCP. No layering.
I'm a little puzzled by this statement. The API for 3.x has peer address 
transiting both into and out of the 0MQ library layer. Exposing the 
transport origin of a received message using the same philosophy as 
zmq_ctx_set_monitor (an after-the-fact thing intended for monitoring and 
operational concerns) doesn't strike me as a layering violation given 
0MQ's existing APIs.

> Logging IP
> addresses in the server?
Absolutely! Operational auditing, monitoring, and troubleshooting are a 
real PITA without it. Peer address info, or at least peer IP, is pretty 
much standard monitoring info in most distributed apps I've seen.

> Measuring latency of servers?
Yes. Again this is one of those operational kinds of things. Measuring 
server latency is a valuable tool for identifying issues that need to be 
raised to operations staff.
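
A minimal sketch of the kind of latency check meant here, written in plain Python with no 0MQ calls. The `LatencyMonitor` class, its method names, and the threshold value are all hypothetical, and the injectable clock is only there so the behaviour can be checked without real waiting.

```python
import time
from collections import deque


class LatencyMonitor:
    """Hypothetical helper: time request/reply round trips and flag slow ones."""

    def __init__(self, threshold_s, window=100, clock=time.monotonic):
        self.threshold_s = threshold_s        # latency above this is "slow"
        self.samples = deque(maxlen=window)   # keep only the most recent samples
        self._clock = clock                   # monotonic, so wall-clock jumps do not matter
        self._started = {}                    # request id -> start timestamp

    def start(self, request_id):
        """Call just before the request is sent."""
        self._started[request_id] = self._clock()

    def finish(self, request_id):
        """Call when the reply arrives; returns (latency_seconds, is_slow)."""
        elapsed = self._clock() - self._started.pop(request_id)
        self.samples.append(elapsed)
        return elapsed, elapsed > self.threshold_s

    def average(self):
        """Mean latency over the current window, or 0.0 if nothing was recorded."""
        return sum(self.samples) / len(self.samples) if self.samples else 0.0
```

A caller would wrap each request in `start()` / `finish()` and raise an alarm to operations when `finish()` reports a slow reply or `average()` drifts upward.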
>
> Most of what you want to achieve can be done easily above 0MQ by
> ignoring the physical network and talking application to application.
> Heartbeats from servers to clients. Application-level identifiers.
Agreed...but only if you assume a well-behaved app level that never runs 
into version mismatch problems, or questionable/rogue peers, or word 
boundary alignment problems, or any of a variety of application problems 
that could pollute an app-level identifier. There are a lot of great 
lessons in The Guide, and in this case the part on reliability that 
cites application code as the worst offender when it comes to 
reliability seems to apply. I'm not looking to guard against a hacker 
working at the TCP or even ZMTP layer here. Protection against that is 
well beyond the scope of 0MQ.  I am, however, looking to make sure that 
when some new client code makes it into production with a buffer 
overwrite bug, my server can log/alarm with enough specifics to be 
useful.  Given the choice between
     2012-09-24 14:25:51,644 ERROR - Unrecognized request from peer 192.0.0.1:12234
and
     2012-09-24 14:25:51,644 ERROR - Unrecognized request from peer #^2ffwi23r098vnasdf0
which would you prefer to use for quickly finding the client instance 
that went bad?
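
To make the comparison concrete, here is a rough sketch of the server-side logging involved, using only Python's standard `logging` module. It assumes the transport-level `(host, port)` pair has somehow been made available to the application, which is exactly the piece under discussion; the function names `peer_label` and `log_unrecognized` are hypothetical and not part of any 0MQ API.

```python
import logging

# Format chosen to match the sample log lines above.
LOG_FORMAT = "%(asctime)s %(levelname)s - %(message)s"


def peer_label(transport_addr, app_id):
    """Describe a peer for the log, preferring the transport-level address.

    transport_addr: (host, port) tuple, or None if the transport origin is unknown
                    (assumption: something below the app supplies this).
    app_id:         application-level identifier as received, possibly corrupted.
    """
    if transport_addr is not None:
        host, port = transport_addr
        return f"{host}:{port}"
    # Fall back to the app-level id; repr() keeps odd bytes visible and harmless.
    return f"app-id {app_id!r}"


def log_unrecognized(logger, transport_addr, app_id):
    """Emit the error line for a request the server could not interpret."""
    logger.error("Unrecognized request from peer %s",
                 peer_label(transport_addr, app_id))


if __name__ == "__main__":
    logging.basicConfig(format=LOG_FORMAT, level=logging.INFO)
    log = logging.getLogger("server")
    log_unrecognized(log, ("192.0.0.1", 12234), b"#^2ffwi23r098vnasdf0")
    log_unrecognized(log, None, b"#^2ffwi23r098vnasdf0")
```

The fallback branch is what the server is left with when only an application-level identifier is available: the line is still logged, but its value depends entirely on the client having filled that identifier in correctly.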



