[zeromq-dev] Proposal for a revised zlist API
Pieter Hintjens
ph at imatix.com
Sat Sep 6 14:46:35 CEST 2014
Also, sorted lists... it makes sense, however I think what we're
really wanting is a container provides both keyed and sequential
(sorted) access. Either a tree (there was a ztree class that I killed
a while back, as it had imported a lot of horrid code from elsewhere),
or a combination hash/ring.
On Sat, Sep 6, 2014 at 11:27 AM, Pieter Hintjens <ph at imatix.com> wrote:
> OK, after spending way too long on this, I'd like to revert all
> changes to zlist, deprecate that class, and make zring the new V3 list
> container, doing everything properly and cleanly. This makes it very
> easy for developers to spot old vs new constructs.
>
> I think we've gone too far down the rabbit hole. Anyone using lists in
> anger wants a doubly-linked list.
>
> Let's continue the careful and slow evolution of zring then, and use it in CZMQ.
>
> I'm still undecided on the void *user argument as it prevents us using
> standard destructors, which is such a fine thing. I'm going to look
> for other ways to do this.
>
> -Pieter
>
> On Fri, Sep 5, 2014 at 7:14 PM, Goswin von Brederlow <goswin-v-b at web.de> wrote:
>> On Fri, Sep 05, 2014 at 05:49:20PM +0200, Pieter Hintjens wrote:
>>> On Fri, Sep 5, 2014 at 3:56 PM, Goswin von Brederlow <goswin-v-b at web.de> wrote:
>>>
>>> > Just saw your commit.
>>>
>>> You only found 17 issues? :-) I was coding till 1am, surprising it
>>> worked at all.
>>>
>>> > 1) Small typos
>>>
>>> Fixed.
>>>
>>> > 2) zlist_purge does not destroy the cursor allowing zlist_next and
>>> > zlist_item access to freed memory.
>>>
>>> Nice catch, fixed.
>>>
>>> > 3) change cursor to "node_t **" but that would be internal.
>>>
>>> I'm not yet sure of that change, will try something else first
>>> (separate head/tail item as we have in zring, I think).
>>
>> The insert_before needs access to the previous node->next. You could
>> have the cursor always point to the previous node and handle the
>> special cases of an empty list or one with just one item. Using the
>> 2-stars trick avoids those special cases.
>>
>>> > 4) Now there is zlist_detach and zlist_delete. How about
>>> > zlist_detach_current and zlist_delete_current for the cursor based
>>> > flavour of the same? (needs 3)
>>>
>>> Not needed, you pass NULL as item and that does the job. I don't like
>>> long method names either... clumsy to use.
>>
>> Great, didn't think of that.
>>
>>> > 5) zlist_set_comparator for a container wide compare function.
>>>
>>> Tried that and decided against it, due to not wanting to break
>>> zlist_sort and not really seeing the benefit (in fact it's an extra
>>> line of code and extra risk of errors).
>>
>> I didn't add that for sort but for find. :) Find is probably done far
>> more often than sort and I use find in detach and delete too.
>> zlist_find can be used in zloop, zmsg, zpoller and zsys (some should
>> be rewritten to zlist_delete/detach) if that uses the comparator.
>>
>>> > 6) zlist_sort could allow NULL as second argument to use the container
>>> > compare (or compare by address, however useless that is :). (needs 5)
>>>
>>> Yes, that was my second idea, except it leaves the method messy and I
>>> still didn't see the benefit.
>>
>> It adds one line to the code if the comparator is initialized with an
>> physical equality. Didn't seem so bad.
>>
>>> > 7) zlist_find can be added without name collision. (needs 5)
>>>
>>> So searching on content rather than item value, yes, it could work.
>>
>> zloop searches by timer_id and socket. zsys searches on handler.
>> That's 3 uses already just in czmq itself.
>>
>>> > 8) zlist_detach and zlist_delete need to specify their affect on the
>>> > cursor.
>>>
>>> zlist_detach and zlist_delete are works in progress, I left this for
>>> today as I had other work. They should end up working and looking like
>>> their zring counterparts.
>>>
>>> > 9) I would suggest having zlist_detach and zlist_delete use container
>>> > compare function.
>>>
>>> OK, I'm partially convinced, however I don't have a use case, this is
>>> engineering because "it can work" which I'd rather avoid. So far
>>> comparison on the item void * has always worked fine.
>>
>> - zsys_close does detach sockets by their handle instead of item.
>> - zloop deletes reader by their socket, poller by socket or fd
>>
>>> > 10) I would suggest having zlist_detach and zlist_delete use zlist_find
>>> > + zlist_*_current. (needs 7, solves 9 implicitly)
>>>
>>> Again, problem statements first, solutions second.
>>
>> The problem is code duplication vs. reuse. Both delete and detach
>> currently implement their own find loop which would be identical to
>> zlist_find. 3 times the same code is 3 times the work to maintain.
>>
>>> > 11) zlist_t still has a "bool autofree;". The old behaviour could be
>>> > achieved by having zlist_autofree set a duplicator (strdup) and
>>> > destructor (something that calls free) instead.
>>>
>>> Indeed. I'd randomly thought that and didn't do it yet. My goal today
>>> was the general shape of the API and keeping existing code from not
>>> breaking.
>>>
>>> > 12) Can we add the "void *user" argument to czmq_destructor and
>>> > czmq_comparator? That would be realy usefull for keeping items in 2
>>> > container like zcertstore does among other things.
>>>
>>> Haven't understood the problem yet enough to evaluate the addition.
>>> I'll get there eventually, if you can wait.
>>
>> I came to this idea because I needed it for my PPPP protocol
>> implementation. The timeout handler for an item needs access to the
>> overall peer structure (to send heartbeats to the peers socket) and
>> protocol structures (to remove dead peers after too many timeouts). I
>> didn't want to store the same identical pointer in thousands if not
>> millions of items. Storing it once in the container seemed smarter and
>> uses far less code.
>>
>> Check out how I changed the zcertstore to use a container wide
>> destructor on the zlist for a simple use case. Certificates are kept
>> both in a list and a hashtable. The list is setup so that deleting a
>> cert from the list will automatically delete the cert from the
>> hashtable too and then destroy it. This can also be setup that
>> deleting from either container deletes from both.
>>
>>> > 13) zlist_first, zlist_next and zlist_last each duplicate zlist_item
>>> > at the end.
>>>
>>> Same in zring. I'll fix that. zlist_head too.
>>>
>>> > 14) zlist_insert_before could be added without name collision. (needs 3)
>>>
>>> I've added and removed that twice now, and still don't see a use case for it.
>>
>> Inserting an item into a sorted list. More importantly merging 2
>> sorted lists into one:
>>
>> item_t *item = zlist_first (list2);
>> while (item) {
>> zlist_find (list1, item);
>> zlist_insert_before (list1, item);
>> item = zlist_next (list2);
>> }
>>
>> zlist_insert_before can be avoided if there is a zlist_peek_next
>> (return the next item without advancing the cursor). But that is less
>> elegant. But we can leave that out. I don't have a use for that for
>> zlist.
>>
>>> > 15) zlist_insert(_after) could be added without name collision.
>>> > 16) zlist_insert_sorted could be added without name collision. (needs 3+5+7)
>>>
>>> OK, enough! :-) It is painful to wrangle APIs, and it gets worse
>>> whenever we add stuff we do not absolutely need. Please come with
>>> problems that we cannot reasonably solve using the current APIs, or
>>> places where we can make valuable improvements.
>>
>> Messages can get lost if a peer reconnects. In the PPPP protocol this
>> gets detected and the lost messages get retransmitted. What happens in
>> my implementation is that I have a zlist for incoming messages. When a
>> message is received that isn't the next message it is inserted into
>> the incoming list instead of passed to the application. Then when the
>> lost message is retransmitted that is passed to the application and
>> the incoming list needs to be processed in order of message sequence
>> number. By using zlist_insert_sorted that is made simple.
>>
>> I need zlist_insert_sorted or zlist_insert_before for this. Either one
>> will do. The reason I made zlist_insert_before was just so I could
>> implement zlist_insert_sorted trivially.
>>
>> Arguably this could use a new class for a sorted list/queue/tree that
>> uses a heap or binary tree for efficiency. But my use case should
>> happen rarely and only for a small number of messages in the list so
>> it didn't seem worth writing a whole new class for it just to be a bit
>> more efficient.
>>
>> If you are set against it though I can switch this over to use a
>> zhash. Processing would then try to find the next sequence number in
>> the zhash, if found send to app, delete and repeat. Actually thinking
>> about it that might actually be better all around. Would have O(1)
>> complexity instead of O(n) even if n is bound to be small in this case.
>>
>> Ok, you have me convinced. I will rewrite my use case for this.
>> Sometimes just writing about use cases and explaining them gives you
>> better ideas.
>>
>>> > 17) zlist_last is a problem for 4 since it updates the cursor.
>>> > Consider this stupid way to purge the list in reverse order:
>>> >
>>> > while (zlist_last (list)) zlist_delete_current (list);
>>>
>>> zlist_purge (). I don't really care about stupid use cases.
>>>
>>> However, zlist_last uses the list tail. Any delete/detach has to
>>> update the tail.
>>
>> Agreed. The list tail is also needed for append. But zlist_last
>> doesn't need to change the cursor (unless it points to the last item
>> and then it changes to NULL or head).
>>
>>> > If someone needs to process a zlist from the back like that then they
>>> > should be using a zring instead. So I think this would be safe.
>>>
>>> Indeed. There are some basic styles we use lists for, and that's all I
>>> really care about. Weird constructs don't count as valid use cases
>>> since they're usually wrong in other subtle ways anyhow.
>>
>> Is that an agreement to make zlist_last() leave the cursor alone then?
>> I can't think of any sane use of zlist_last() changing the cursor,
>> which would introduce a method do detach/delete the last element in a
>> list at the cost of O(n) complexity.
>>
>> As I see it leaving the cursor alone breaks nothing and keeps things sane.
>>
>>> > 18) You've changed the API, the cursor behaviour, slightly already. I
>>> > wonder if any code relies on the destructive cursor behaviour of
>>> > zlist_push, zlist_append, zlist_remove (and any others I forgot). It
>>> > is possible to have code like this:
>>>
>>> I'm going to deprecate that notion that zlist_next() makes sense
>>> without a zlist_first(). It's errorprone and silly to have magical
>>> assumptions about list state that you can't visually verify, and can't
>>> copy/paste as a block.
>>>
>>> -Pieter
>>
>> No complains from me. Just wanted to point out the change. Partly as
>> argument to allow the change in 17 too.
>>
>> MfG
>> Goswin
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
More information about the zeromq-dev
mailing list