[zeromq-dev] PUB-SUB filtering question
Alexey Ermakov
zee at technocore.ru
Mon Aug 16 16:29:37 CEST 2010
On Mon, Aug 16, 2010 at 6:15 PM, gonzalo diethelm <gdiethelm at dcv.cl> wrote:
>> The original java bindings I've written used byte arrays. If Java
>> binding developers have changed it to string and if Java string is
>> incapable of holding a binary zero, then it should be reported as bug
>> IMO.
>
> The thing here is that a Java String CAN have a binary zero embedded
> into it (as far as I understand), although it might not be the best way
> to go about it.
Binary zero is representable in Java Strings; I was just wrong
about the C API.
The problem is that Java Strings are currently converted to UTF-8 byte
sequences, which means that there is a whole set of possible prefixes
that can be used with C/Python/whatever API which would be impossible
to use from Java (as they would be impossible to encode as a String).
For example, a prefix starting with a byte in the range 0x80 to 0xBF
(a UTF-8 continuation byte) or with 0xFF can never be the encoding of
a String, so it would be impossible to use from the Java side.
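To make this concrete, here is a minimal sketch (not jzmq code) that
decodes a raw prefix as UTF-8 and encodes it again, which is roughly
what a String-based API forces on the caller. The class and method
names are hypothetical, and it assumes the conversion uses standard
UTF-8:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixRoundTrip {
    // Decode raw bytes as UTF-8, then encode the resulting String again.
    // Malformed input is replaced with U+FFFD during decoding.
    static byte[] roundTrip(byte[] raw) {
        String s = new String(raw, StandardCharsets.UTF_8);
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ascii = {'a', 'b', 'c'};
        byte[] binary = {(byte) 0xFF, 0x01};

        // ASCII survives the round trip unchanged.
        System.out.println(Arrays.equals(ascii, roundTrip(ascii)));
        // 0xFF is not valid UTF-8, so the bytes that come back differ.
        System.out.println(Arrays.equals(binary, roundTrip(binary)));
    }
}
```

The second prefix cannot be expressed as any String, so a subscriber
using a String-only API could never subscribe to it.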
Second, there's a bug in the jzmq code. It determines the prefix length
as "strlen (value)" (git 31a216ed1e98153ad0b526efa0d9aded24545b20,
Socket.cpp #223), which will lead to errors with strings containing a
zero byte.
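The effect of a strlen-style length can be sketched in Java. This does
not reproduce the jzmq code; cStrlen is a hypothetical stand-in for
C's strlen, and the sketch assumes the String reaches native code as
standard UTF-8 with a real zero byte in it:

```java
import java.nio.charset.StandardCharsets;

public class StrlenSketch {
    // Hypothetical stand-in for C's strlen: count bytes before the first zero byte.
    static int cStrlen(byte[] data) {
        int n = 0;
        while (n < data.length && data[n] != 0) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // Five bytes: 'a', 'b', 0x00, 'c', 'd'.
        byte[] prefix = "ab\0cd".getBytes(StandardCharsets.UTF_8);

        System.out.println(prefix.length);   // actual prefix length
        System.out.println(cStrlen(prefix)); // length a strlen-style scan would report
    }
}
```

Under these assumptions the scan stops at the embedded zero, so only
the first two bytes would be used as the prefix.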
> I don't think I have dealt with binary strings in Java before, so I am
> not sure what is the right(est) solution here. I will be happy to
> accommodate the best suggestions that come up in the forum.
There are no such things as binary strings in Java. If 0MQ
topics/identities are supposed to be raw binary values (which,
according to Martin, they are), I think that using byte arrays is the
only sane option. The only downside is that developers will have to
convert the string to bytes manually, but that's a good thing: they
will have to be aware of character encodings and specify them in their
API docs, and the API will be much more interoperable.
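As a rough illustration of the byte-array approach, the sketch below
builds one prefix from text with an explicit charset and one from raw
binary values, then matches both byte by byte. The startsWith helper
and the topic values are hypothetical, not part of jzmq or 0MQ:

```java
import java.nio.charset.StandardCharsets;

public class BytePrefix {
    // Hypothetical helper: true if message begins with prefix, compared as raw bytes.
    static boolean startsWith(byte[] message, byte[] prefix) {
        if (prefix.length > message.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (message[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Text topic: the caller picks the encoding explicitly.
        byte[] textTopic = "weather.".getBytes(StandardCharsets.UTF_8);
        byte[] textMessage = "weather.london 17C".getBytes(StandardCharsets.UTF_8);

        // Binary topic: contains a zero byte and a byte that is not valid UTF-8 on its own.
        byte[] binaryTopic = {(byte) 0x80, 0x00, 0x2A};
        byte[] binaryMessage = {(byte) 0x80, 0x00, 0x2A, 0x01, 0x02};

        System.out.println(startsWith(textMessage, textTopic));
        System.out.println(startsWith(binaryMessage, binaryTopic));
        System.out.println(startsWith(binaryMessage, textTopic));
    }
}
```

With byte arrays, text and binary prefixes go through the same code
path, and the encoding choice is visible at the call site.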