[zeromq-dev] XRAP clarification: What character set are strings passed in?

Kevin Sapper kevinsapper88 at gmail.com
Tue Feb 2 10:48:25 CET 2016


The only reasonable choice for encoding is indeed UTF-8, so I agree that
the spec should state this explicitly.
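
As a rough illustration of what being explicit means on the Java side, here is a minimal sketch that always names the charset instead of relying on the platform default. The class and method names are hypothetical and are not part of XRAP or any ZeroMQ binding; only the standard java.nio.charset API is assumed.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any XRAP implementation.
final class XrapStrings {
    // Encode a string field (resource name, content type, ...) as UTF-8 bytes.
    public static byte[] encodeField(String value) {
        return value.getBytes(StandardCharsets.UTF_8);
    }

    // Decode a string field received from the wire, assuming it is UTF-8.
    public static String decodeField(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Made-up resource name with one non-ASCII character (e with acute accent).
        String resource = "/music/caf\u00e9";
        byte[] wire = encodeField(resource);
        // 11 characters, but 12 bytes, because the accented letter takes two bytes.
        System.out.println(resource.length() + " chars, " + wire.length + " bytes");
        System.out.println("round trip ok: " + decodeField(wire).equals(resource));
    }
}
```

Calling getBytes() without a charset argument would use the JVM default, which can differ between machines, so both ends could silently disagree.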

As far as parameter values are concerned, allowing binary content seems fine
to me. But keep in mind that they should only be used for sorting,
filtering or paging a collection of resources.
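
To make the trade-off concrete, here is a small Java sketch comparing a binary parameter value sent as raw bytes with the same value sent as Base64 text. The paging cursor and the class name are invented for the example; only java.util.Base64 from the standard library is assumed, and nothing here is taken from the XRAP spec.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not part of any XRAP implementation.
final class ParamValues {
    // Represent a binary parameter value as Base64 text.
    public static String toText(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Recover the original bytes from the Base64 text form.
    public static byte[] fromText(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Made-up 16-byte opaque paging cursor.
        byte[] cursor = new byte[16];
        for (int i = 0; i < cursor.length; i++) {
            cursor[i] = (byte) (i * 17);
        }
        String text = toText(cursor);
        byte[] textOnWire = text.getBytes(StandardCharsets.UTF_8);
        // Base64 expands every 3 input bytes to 4 characters (with padding).
        System.out.println("raw: " + cursor.length + " bytes, Base64: "
                + textOnWire.length + " bytes");
        System.out.println("round trip ok: " + Arrays.equals(cursor, fromText(text)));
    }
}
```

For the 16-byte value above the text form is 24 bytes, so allowing raw binary saves roughly a third of the size and skips the encode/decode step.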

If you'd like to edit the spec, check out the GitHub project
https://github.com/zeromq/rfc and send a PR.

//Kevin

2016-02-02 10:21 GMT+01:00 Tom Quarendon <tom.quarendon at teamwpc.co.uk>:

> When attempting to implement an XRAP client in Java, one of the first
> issues you come across is which character set the strings in XRAP
> messages should be encoded in.
>
> So the client needs to build a GET message, and it needs to put the
> resource name in it. The resource name is a string, so is naturally
> represented in Java as a String object, in Unicode.
>
> I’ve assumed that things that are naturally strings (resource names,
> content types, parameter names, metadata names, error strings) are actually
> passed in UTF8, but this isn’t specified.
>
> I think the specification needs to be explicit about what character set
> strings are passed in, and indeed which things are actually “strings” in
> that sense. I think it’s clear that the resource names, content types,
> parameter names, metadata names, error strings are actually intended as
> human-readable strings. However, it gets a bit greyer with
> parameter/metadata values, etag values and content bodies.
>
> For the body, you have to take into account the content type, but even for
> those content types that are textual (JSON, XML) currently you just have to
> assume that the encoding is UTF8 (I don’t **think** that is explicitly
> defined by application/json, but perhaps it is, in which case fine).
> However, the body won’t always be textual. In the music example in the spec,
> actually retrieving the music track would most likely return binary data (I
> don’t think it would return JSON with a BASE64/85-encoded piece of binary
> data in it, would it?). So I don’t think you can always assume that the
> content body is UTF8 text.
>
> Etags are supposed to be opaque to the user, but it’s not clear whether
> this has to be opaque textual data, or whether this can be binary data.
> Maybe this is clearer in HTTP, since it’s passed in an HTTP header value,
> and those are always ASCII anyway. I think this needs clarification in the
> case of XRAP, as there’s no particular reason it couldn’t be opaque binary data.
>
> Ditto parameter values. I can imagine sending binary-valued parameters.
> Indeed, I am. Clearly these could be expressed as BASE64- or BASE85-encoded
> values and hence text, at some slight cost, but I think this needs
> clarification too.
>
> Assuming there is some consensus, can the spec be edited to reflect this?
>
> Thanks.
>