[zeromq-dev] Too much ZeroMQ overhead versus plain TCP Java NIO Epoll (with measurements)

Bennie Kloosteman bklooste at gmail.com
Thu Aug 30 04:52:29 CEST 2012

> 2) Do people agree that 11 microseconds are just too much?
Nope. Once you go cross-machine, those 11 microseconds become irrelevant.
The fastest exchange I'm aware of for high-frequency trading takes 80
microseconds (plus transport costs) in the best case, so who are you
talking to? And if you're not doing high-frequency trading, then
milliseconds are fine. The rest of your system and your algorithms are far
more crucial, so IMHO you're wasting time in the wrong place. For example,
you can use ZeroMQ to build an async pub-sub solution that does market
scanning in parallel from different machines a lot faster than if you did
all the TCP/IP yourself.

ZeroMQ uses a different system for messages of less than 30 bytes, e.g.
they are copied. I'm also unaware of any messages that small in the
financial industry. Going cross-machine will add the TCP/IP header, which
some transports optimize out on the same machine. Unless you're looking
only at the IPC case, I would re-run your tests with 100M 64-byte and
256-byte messages cross-machine.

As far as interprocess communication goes, there are better ways (e.g.
writing directly to the destination's semi-polled lockless buffer using
256/512-bit SIMD non-temporal writes would blow away anything Java can
do), but they are all dedicated solutions and don't play nicely with other
messages coming from the IP stack, and that is the challenge for
communication frameworks. If you keep reinventing the wheel with custom
solutions, sure, you can get better results, but at what cost, and will
you finish? And obviously tuning your higher-level algorithms gets better
results than the low-level stuff. Once your whole system, business logic
included, is sub-millisecond and that is still not enough, then I would
revisit the lower-level transport.
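
To make the "semi-polled lockless buffer" idea concrete, here is a minimal
sketch of a single-producer / single-consumer ring of fixed-size slots in
plain Java. It is only an analogue: Java has no SIMD non-temporal writes, so
this uses an ordinary array copy, and the class name SpscRing, the slot size
and the capacity are all made up for illustration. It assumes exactly one
producer thread and one consumer thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Single-producer / single-consumer ring of fixed-size slots (illustrative sketch). */
final class SpscRing {
    private final byte[] data;     // one preallocated block, no per-message objects
    private final int slotSize;
    private final int mask;        // capacity - 1; capacity must be a power of two
    private final AtomicLong head = new AtomicLong(); // next slot to read, written by consumer only
    private final AtomicLong tail = new AtomicLong(); // next slot to write, written by producer only

    SpscRing(int capacityPow2, int slotSize) {
        if (Integer.bitCount(capacityPow2) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.data = new byte[capacityPow2 * slotSize];
        this.slotSize = slotSize;
        this.mask = capacityPow2 - 1;
    }

    /** Producer thread only. Returns false when the ring is full. */
    boolean offer(byte[] msg) {
        if (msg.length != slotSize) throw new IllegalArgumentException("wrong message size");
        long t = tail.get();
        if (t - head.get() > mask) return false;          // all slots in use
        System.arraycopy(msg, 0, data, (int) (t & mask) * slotSize, slotSize);
        tail.lazySet(t + 1);                              // ordered store publishes the slot
        return true;
    }

    /** Consumer thread only. Returns false when the ring is empty. */
    boolean poll(byte[] out) {
        if (out.length < slotSize) throw new IllegalArgumentException("out too small");
        long h = head.get();
        if (h == tail.get()) return false;                // nothing published yet
        System.arraycopy(data, (int) (h & mask) * slotSize, out, 0, slotSize);
        head.lazySet(h + 1);                              // hands the slot back to the producer
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        final int messages = 1_000_000;
        final SpscRing ring = new SpscRing(1024, 64);     // 1024 slots of 64 bytes
        Thread producer = new Thread(() -> {
            byte[] msg = new byte[64];
            for (int i = 0; i < messages; i++) {
                msg[0] = (byte) i;
                while (!ring.offer(msg)) { /* ring full: spin until the consumer catches up */ }
            }
        });
        producer.start();
        byte[] out = new byte[64];
        int received = 0;
        while (received < messages) {
            if (ring.poll(out)) received++;               // consumer polls instead of blocking
        }
        producer.join();
        System.out.println("received " + received + " messages");
    }
}
```

This is exactly the kind of dedicated solution that is fast in isolation but
does nothing for messages arriving from the IP stack, which is the part a
framework has to solve.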

Lastly, building a low-latency message system on Java is dangerous. Java
creates messages very quickly, but if they are not disposed of quickly,
e.g. under peak load or when some receivers are slow, you get a big
permanent memory pool and then you are in trouble; you won't see this in
micro-benchmarks. I had one complete system that worked great and fast and
then had huge GC pauses, and we're talking almost seconds here, pretty much
defeating any gains. So unless you manage the memory yourself (e.g. a byte
array that you serialise into, so the GC is not aware of the individual
messages), you are better off using a system to store the messages outside
of Java's knowledge, and C++ / ZeroMQ is a good  for
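
As a rough illustration of managing the memory yourself, the sketch below
keeps length-prefixed messages in one preallocated direct ByteBuffer, so the
GC sees a single long-lived buffer rather than one object per message. The
class name OffHeapMessageLog and its methods are hypothetical, it is
single-threaded, and it is not how ZeroMQ stores messages.

```java
import java.nio.ByteBuffer;

/** Append-only store of length-prefixed messages in one off-heap buffer (illustrative sketch). */
final class OffHeapMessageLog {
    private final ByteBuffer buf;

    OffHeapMessageLog(int capacityBytes) {
        // Allocated once, outside the Java heap; the GC only tracks this one wrapper object.
        this.buf = ByteBuffer.allocateDirect(capacityBytes);
    }

    /** Copies the payload into the buffer; returns its offset, or -1 if there is no room. */
    int append(byte[] payload) {
        if (buf.remaining() < Integer.BYTES + payload.length) return -1;
        int offset = buf.position();
        buf.putInt(payload.length).put(payload);   // 4-byte length prefix, then the bytes
        return offset;
    }

    /** Copies the message stored at offset into dst; returns the message length. */
    int read(int offset, byte[] dst) {
        int len = buf.getInt(offset);              // absolute read, does not move the position
        if (len > dst.length) throw new IllegalArgumentException("dst too small");
        for (int i = 0; i < len; i++) {
            dst[i] = buf.get(offset + Integer.BYTES + i);
        }
        return len;
    }

    /** Discards all messages so the same memory can be reused. */
    void reset() {
        buf.clear();
    }
}
```

A caller would reuse one scratch byte[] for reads and call reset() once the
receivers have caught up, so steady-state operation allocates nothing per
message.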
