[zeromq-dev] Visibility into pipes of a socket
Stuart Brandt
stu at compuserve.com
Mon Sep 24 21:04:03 CEST 2012
Thanks for the response. Comments inline....
On 9/21/12 12:05 AM, Pieter Hintjens wrote:
> On Thu, Sep 20, 2012 at 9:40 PM, Stuart Brandt <stu at compuserve.com> wrote:
>
>> All thoughts welcome!
> Very roughly, trying to do this kind of thing is why it takes people
> months to write even basic protocols over TCP. No layering.
I'm a little puzzled by this statement. The 3.x API already has peer
addresses passing both into and out of the 0MQ library layer. Exposing
the transport origin of a received message, in the same spirit as
zmq_ctx_set_monitor (an after-the-fact facility intended for monitoring
and operational concerns), doesn't strike me as a layering violation
given 0MQ's existing APIs.
> Logging IP
> addresses in the server?
Absolutely! Operational auditing, monitoring, and troubleshooting are a
real PITA without it. Peer address info, or at least peer IP, is pretty
much standard monitoring info in most distributed apps I've seen.
> Measuring latency of servers?
Yes. Again, this is one of those operational kinds of things. Measuring
server latency is a valuable tool for identifying issues that need to be
raised to operations staff.
>
> Most of what you want to achieve can be done easily above 0MQ by
> ignoring the physical network and talking application to application.
> Heartbeats from servers to clients. Application-level identifiers.
Agreed, but only if you assume a well-behaved application layer that
never runs into version mismatch problems, or questionable/rogue peers,
or word boundary alignment problems, or any of a variety of application
problems that could pollute an app-level identifier. There are a lot of
great lessons in The Guide, and the part on reliability that cites
application code as the worst offender seems to apply here. I'm not
looking to guard against a hacker working at the TCP or even ZMTP layer.
Protection against that is well beyond the scope of 0MQ. I am, however,
looking to make sure that when some new client code makes it into
production with a buffer overwrite bug, my server can log or raise an
alarm with enough specifics to be useful. Given the choice between
2012-09-24 14:25:51,644 ERROR - Unrecognized request from peer 192.0.0.1:12234
and
2012-09-24 14:25:51,644 ERROR - Unrecognized request from peer #^2ffwi23r098vnasdf0
which would you prefer to use for quickly finding the client instance
that went bad?
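
To make the distinction concrete, here is a minimal sketch. It deliberately
uses plain TCP sockets from the Python standard library rather than 0MQ, and
the names describe_bad_request and demo are made up for illustration. The
client sends a corrupted identifier, yet the server can still log a usable
peer address, because that address comes from the transport (accept) and
not from anything the client wrote into the message.

```python
import socket


def describe_bad_request(peer):
    """Format a log line from the transport-level peer address.

    `peer` is the (host, port) tuple returned by accept(); nothing the
    client put in its payload is used here.
    """
    host, port = peer
    return "ERROR - Unrecognized request from peer %s:%d" % (host, port)


def demo():
    # Listen on an ephemeral loopback port (illustration only).
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)

    # A "buggy" client whose app-level identifier has been overwritten.
    client = socket.create_connection(server.getsockname())
    client.sendall(b"#^2ffwi23r098vnasdf0")

    # The peer address is supplied by the OS, independent of the payload.
    conn, peer = server.accept()
    payload = conn.recv(64)
    line = describe_bad_request(peer)

    client_addr = client.getsockname()
    conn.close()
    client.close()
    server.close()
    return line, client_addr, payload


if __name__ == "__main__":
    print(demo()[0])
```

The log line produced this way names the client's actual address and port
no matter how badly the payload is mangled, which is the property I'd like
to be able to get for messages received through 0MQ.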