[zeromq-dev] Python bindings strings and bytes

André Caron andre.l.caron at gmail.com
Thu Nov 26 05:30:19 CET 2015


> From the binding perspective this would be easiest. From a user perspective
> it isn't.

If you mean that this makes it harder for the application developer using
the binding because they have to figure out which encoding name to pass to
.decode, then I disagree.  Nothing is worse than a binding that does the
wrong thing, and it's almost impossible for the binding to do the right
thing under all circumstances (esp. without protocol support for built-in
message metadata telling you which encoding was used by the sender).

You should either pick a fixed encoding and stick to it or add message
metadata for tracking the encoding as part of your protocol -- but that
usually means picking a fixed encoding for the metadata anyway.  Many
modern protocols such as WebSockets[1] just pick UTF-8 w/o BOM as the only
supported encoding for text data.

[1]: See sections 5.6 and 8.1 from <https://tools.ietf.org/html/rfc6455>.
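To make the "metadata" option concrete, here is a minimal sketch in plain
Python, not tied to any binding.  The two-frame layout (encoding name first,
payload second) and the helper names pack_text and unpack_text are my own
assumptions for illustration, not part of any ZeroMQ protocol.  Note that the
header frame still needs a fixed encoding of its own (ASCII here):

```python
def pack_text(text, encoding="utf-8"):
    """Return [header, payload] frames; the header names the payload encoding.

    Hypothetical helper: the two-frame layout is an assumption for this sketch.
    """
    # The header itself must use a fixed, agreed-upon encoding (ASCII here).
    return [encoding.encode("ascii"), text.encode(encoding)]


def unpack_text(frames):
    """Decode a [header, payload] pair produced by pack_text."""
    header, payload = frames
    return payload.decode(header.decode("ascii"))


# Round trip with a non-default encoding chosen by the sender.
frames = pack_text("Andr\u00e9", encoding="utf-16-le")
roundtrip = unpack_text(frames)
```

Picking UTF-8 everywhere makes the header frame unnecessary, which is the
simpler of the two options.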

In the end, that's simplest for users.  If you are carrying binary data:
socket.read().  If you are reading text: socket.read().decode('utf-8').
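A rough sketch of what that looks like from the application side.  The
BytesOnlySocket class below is a hypothetical in-memory stand-in, not the API
of any real binding; read() just mirrors the shorthand used above.  It accepts
anything that supports the buffer protocol and refuses str with a clear
exception:

```python
from collections import deque


class BytesOnlySocket:
    """Hypothetical in-memory stand-in for a bytes-only binding."""

    def __init__(self):
        self._queue = deque()

    def send(self, data):
        # Refuse text outright instead of guessing an encoding.
        if isinstance(data, str):
            raise TypeError(
                "expected a bytes-like object, got str; "
                "call .encode('utf-8') first"
            )
        # memoryview() accepts bytes, bytearray and other buffer-protocol
        # objects, and raises TypeError for anything else.
        self._queue.append(bytes(memoryview(data)))

    def read(self):
        # Always returns bytes; the caller decides whether it is text.
        return self._queue.popleft()


sock = BytesOnlySocket()
sock.send(b"\x00\x01\x02")         # binary payload, sent as-is
sock.send("join".encode("utf-8"))  # text payload, encoded by the caller

binary = sock.read()               # stays bytes
text = sock.read().decode("utf-8") # caller applies the agreed encoding
```

The binding never has to guess: the only place an encoding name appears is in
application code that knows what the protocol carries.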

André

On Thu, Nov 12, 2015 at 12:46 PM, Joe McIlvain <joe.eli.mac at gmail.com>
wrote:

> As you say, providing a fully idiomatic user experience is not always
> possible in generated bindings - this was also the case with Ruby when I
> was writing those bindings.
>
> In cases where there is a conflict between idiomatic user experience and
> predictability, I would argue that we *must* make the generated bindings
> behave in a way that wraps all reasonable CLASS-style C APIs in a
> consistent, predictable way.
>
> As I see it, the purpose of the bindings is to wrap the entire API in such
> a way that the result is predictable and (to the extent possible) generally
> safe from SEGVs or ABRTs.  Anything beyond this to provide idiomatic access
> is icing on the cake - we should do it if it's possible, but it is always
> second to predictability and safety.  In cases where there is a conflict,
> it often requires a hand-written higher level wrapper or convenience
> helpers to bridge the gap.
>
> This is the best we can do, I think.
>
> On Wed, Nov 11, 2015 at 12:47 PM, Michel Pelletier <
> pelletier.michel at gmail.com> wrote:
>
>>
>> On Wed, Nov 11, 2015 at 12:29 PM, Arnaud Loonstra <arnaud at sphaero.org>
>> wrote:
>>
>>>  From the binding perspective this would be easiest. From a user
>>> perspective it isn't. However if we need to please the user, automatic
>>> generation of bindings gets very difficult I guess.
>>>
>>>
>> To be fair I don't have a lot of skin in the game so take my opinion with
>> sufficient salt.  The nice thing about bytes is that it's dirt simple.
>>
>>
>>> Could we embrace Python's Buffer Protocol?
>>> https://docs.python.org/3/c-api/buffer.html
>>> Also makes the exposed bytes mutable?
>>>
>>
>> Using buffer would be great; both bytes and bytearray implement it.  I
>> think it can be used with mmap objects as well. +1
>>
>> -Michel
>>
>>
>>>
>>> We could generate less cryptic exceptions indeed.
>>>
>>> Rg,
>>>
>>> Arnaud
>>>
>>> On 2015-11-11 17:42, Michel Pelletier wrote:
>>> > My personal opinion is that the API should use bytes, and only bytes,
>>> > and never return or accept any unicode objects.
>>> >
>>> > It's a bit brutal but then at least the rules are simple.  Pass
>>> > unicode, get a clear exception.
>>> >
>>> > On Wed, Nov 11, 2015 at 3:43 AM, Arnaud Loonstra <arnaud at sphaero.org
>>> > [3]> wrote:
>>> >
>>> >> This is a frequent issue when dealing with Python: how do we want to
>>> >> deal with strings? Python strings are a bit cumbersome when dealing
>>> >> with C.
>>> >>
>>> >> For example, to use the Zyre bindings in Python one needs to do one of
>>> >> the following:
>>> >>
>>> >> > from zyre import Zyre
>>> >> > zn = Zyre(b"MyZyreNode")
>>> >>
>>> >> or
>>> >>
>>> >> > zn = Zyre("MyZyreNode".encode("utf-8"))
>>> >>
>>> >> This will work in both major Python versions.
>>> >> The current unittest uses:
>>> >>
>>> >> > z1 = Zyre("t1")
>>> >>
>>> >> which only works in Python 2.
>>> >>
>>> >> In Python 3 this raises:
>>> >>
>>> >> Traceback (most recent call last):
>>> >>    File "test.py", line 6, in test_all
>>> >>      z1 = Zyre("t1")
>>> >>    File "/home/people/arnaud/src/zyre/bindings/python/zyre.py", line 129, in __init__
>>> >>      self._as_parameter_ = lib.zyre_new(args[0]) # Creation of new raw type
>>> >> ctypes.ArgumentError: argument 1: <class 'TypeError'>: wrong type
>>> >>
>>> >> We could just use bytes for everything but it has some consequences and
>>> >> makes it a bit un-pythonic. For example the Zyre unittest tests for the
>>> >> type of a Zyre event:
>>> >>
>>> >> > self.assertEquals(e.type(), "join")
>>> >>
>>> >> type() returns a Python string and not a bytes object.
>>> >>
>>> >> Actually I don't know of any other way, as converting to a Python
>>> >> string needs an encoding. Any thoughts about this?
>>> >>
>>> >> Rg,
>>> >>
>>> >> Arnaud
>>> >> _______________________________________________
>>> >> zeromq-dev mailing list
>>> >> zeromq-dev at lists.zeromq.org [1]
>>> >> http://lists.zeromq.org/mailman/listinfo/zeromq-dev [2]
>>> >
>>> >
>>> >
>>> > Links:
>>> > ------
>>> > [1] mailto:zeromq-dev at lists.zeromq.org
>>> > [2] http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>> > [3] mailto:arnaud at sphaero.org
>>>
>>>
>>
>>
>>
>>
>
>
>