[zeromq-dev] Non-contiguous message thoughts

Martin Sustrik sustrik at 250bpm.com
Sun Mar 7 17:41:27 CET 2010


Guys,

I've just realised that the non-contiguous message design would solve a
problem I've been struggling with for several months already!

Specifically, it's the problem of 0MQ itself needing to add some info
to the message. For example, in a multi-hop request/reply scenario each
message should hold info about the route to the original requester so
that the reply can be delivered to the right place. In other words, each
node on the route should stick its own identity to the message.

It's obvious that sticking additional info to the message doesn't play
well with zero-copy. That's why I implemented a complex design which
extends the message exactly at the point when it is read from the socket.

Such a design creates a complete layering mess: the lowest layer, the
decoder of bytes from the socket, has to know about such a high-level
concept as multi-hop request/reply. Also, it ceases to work when no
socket is involved and messages are passed by a simple pointer (inproc
transport).

Ugh!

However, if we are able to send non-contiguous messages (or message
groups), sticking one more chunk to the message becomes a trivial matter.
And it's still zero-copy!
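
To make the idea concrete, here is a minimal sketch that models a message
as a list of chunks, so that each hop adds its identity as a new chunk
instead of rewriting the body. The names add_hop, payload and the node
identities are hypothetical and chosen for illustration; this is not the
0MQ API.

```python
# Hypothetical sketch: a message modelled as a list of chunks (not the 0MQ API).
def add_hop(message, identity):
    """Return a new chunk list with this node's identity in front.

    Only references to the existing chunks are copied, never their bytes.
    """
    return [memoryview(identity)] + message


payload = memoryview(b"reply body")      # body chunk; its bytes are never copied
message = [payload]

message = add_hop(message, b"node-A")    # first hop adds its identity
message = add_hop(message, b"node-B")    # second hop adds its identity

# The last chunk is still the very same object, so the body stayed zero-copy.
assert message[-1] is payload
route = [bytes(chunk) for chunk in message[:-1]]   # most recent hop first
```

Each hop costs one small allocation for its own identity, regardless of
how large the body is.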

Anyway, the above makes non-contiguous messages priority No.1. I am 
going to have a look at it ASAP.

Unfortunately we cannot avoid tweaking the current wire format. For a
description of the current format have a look here:

http://api.zeromq.org/zmq_tcp.7.html

My proposal for the tweaked format is: insert a single byte between
frame-length and frame-data. The byte is a bitmap called frame-flags.
At the moment we'll define a single flag called TO_BE_CONTINUED (propose
a better name!) that will be set for all the messages in the group
except the last one. A single ungrouped message will be considered to be
a single-message group. Frame-length is the length of the remaining part
of the message, i.e. frame-flags length (1) + frame-data length.
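
As a rough illustration of the proposed layout, the sketch below encodes
and decodes a message group. Two things in it are assumptions rather than
part of the proposal: frame-length is taken to be a single octet (the
real length encoding is the one described at the URL above), and
TO_BE_CONTINUED is placed in the lowest bit of frame-flags. The function
names are hypothetical.

```python
TO_BE_CONTINUED = 0x01  # assumed bit position; the proposal only names the flag


def encode_group(parts):
    """Encode a non-empty list of byte strings as one message group."""
    if not parts:
        raise ValueError("a group holds at least one message")
    out = bytearray()
    for i, data in enumerate(parts):
        # Every message except the last one carries TO_BE_CONTINUED.
        flags = TO_BE_CONTINUED if i < len(parts) - 1 else 0
        length = 1 + len(data)           # frame-flags (1) + frame-data
        if length > 254:
            raise ValueError("sketch handles single-octet lengths only")
        out += bytes([length, flags]) + data
    return bytes(out)


def decode_group(buf, pos=0):
    """Decode one group starting at pos; return (parts, next_pos)."""
    parts = []
    while True:
        length = buf[pos]                # covers flags + data
        flags = buf[pos + 1]
        parts.append(buf[pos + 2 : pos + 1 + length])
        pos += 1 + length
        if not flags & TO_BE_CONTINUED:  # last message of the group
            return parts, pos


wire = encode_group([b"node-A", b"hello"])
parts, end = decode_group(wire)
```

A single ungrouped message comes out as a one-frame group with
frame-flags set to 0, which matches the default described above.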

Rationale:

* Adding a single byte to the framing format won't hurt performance. All 
the tests so far indicate that for very small messages (shorter than 
memory bus width) performance is more or less constant.

* The remaining 7 unused flags guarantee future extensibility of the
protocol. Even if more than 7 bits are needed, special semantics can
be assigned to a particular flag meaning "there are more fields present
before frame-data". I hope this won't ever be needed though.

* Single-message group should be the default as it's the most natural 
use case. Default value of the flag should be 0 rather than 1.

* Frame-length should account for all the remaining data in the message
so that the size-data pair can be thought of as the lowermost level of
the 0MQ protocol. Having it this way allows for applications ignorant of
frame-flags. This is especially important for hardware devices that are
required to parse as little of the header as possible to be able to keep
the processing rate extremely high. 100Gb Ethernet is not that far away,
etc.
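
The last point can be sketched as follows: a reader that knows only about
the size-data pair can still split the stream into frames, because
frame-length already covers the flags byte. As before, the single-octet
length and the flag bit values in the sample stream are assumptions made
for illustration, and split_frames is a hypothetical name.

```python
def split_frames(buf):
    """Split a byte stream into frame bodies using frame-length alone.

    Each body is whatever follows the length octet. This code never looks
    at frame-flags; it treats the flags byte as part of the opaque body.
    """
    bodies = []
    pos = 0
    while pos < len(buf):
        length = buf[pos]                # assumed single-octet length
        bodies.append(buf[pos + 1 : pos + 1 + length])
        pos += 1 + length
    return bodies


# Two frames in the proposed format: flags 0x01, then 0x00 (bit value assumed).
stream = b"\x03\x01hi\x04\x00bye"
bodies = split_frames(stream)
frame_count = len(bodies)
```

The flags byte simply shows up as the first byte of each body, so a
device that only counts or forwards frames needs no knowledge of it.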

Thoughts?

Martin



