[zeromq-dev] zloop_timer() does not fire if added after zloop_start()...

Andy Ballingall TF ballingall at thefoundry.co.uk
Fri Dec 28 11:31:37 CET 2012


Hi Pieter,

On 27 December 2012 17:25, Pieter Hintjens <ph at imatix.com> wrote:
> Do you want to try fixing that bug, or shall I take a look?

It might be best if you decide how to proceed on this occasion, the reason
being that the API and zombie representation would need modification to
allow the unique identification of an existing timer for removal...

Were I to have a go, I'd probably:

   - change the zombies list to store pointers to the allocated timers
   (rather than the timers' 'arg').
   - change zloop_timer_end() so that it adds to the zombies list any timer
   whose arg is passed in the function - i.e. it would potentially add a
   number of items to the zombies list rather than just one
   - at line 375, add a check for a removed timer being in the zombies list
   (e.g. if zombies length > 0, then look for this timer in the zombies list
   and remove if found)
   - Modify the loop at line 420 so that it compares the timer pointers
   with the zombie pointers (which now are pointers to timers too)
   - (This is where you might have other ideas!) - Possibly add a new timer
   creation function - e.g. zloop_timer_new() which returns a pointer to the
   new timer or NULL on failure, and a new timer deletion function - e.g.
   zloop_timer_destroy() which expects the pointer returned from
   zloop_timer_new(). I'd then move the implementation of zloop_timer() into
   zloop_timer_new() and make zloop_timer() call zloop_timer_new(), and in the
   implementation of zloop_destroy(), it'd make an explicit check for the
   timer existing and not already being in the zombies list.

...but I'm aware that it's probably not my place to start messing around
with the API. Anyway - how does that sound?

Andy



>
> -Pieter
>
> On Thu, Dec 27, 2012 at 5:41 PM, Andy Ballingall TF
> <ballingall at thefoundry.co.uk> wrote:
>> On 27 December 2012 16:22, Pieter Hintjens <ph at imatix.com> wrote:
>>> Andy,
>>>
>>> You can trace this quite simply; print the tickless timer value before
>>> each zmq_poll in zloop. It should change when you add your new timer.
>>
>> Indeed, but even if the value is initially correct, the problem is
>> that the zombie deletion loop at the bottom of the reactor (line 420,
>> zloop.c) deletes any timers whose 'arg' matches the entry in the
>> zombies list, and the newly added timer's 'arg' matches the 'arg' of
>> the one I wanted to delete, so both will be removed by the loop,
>> rather than just the old one.
>>
>> Once the reactor returns to the top of the loop, there are no timers
>> left, which explains why it reverts to showing a poll time equivalent
>> to 1 hour.
>>
>> It's a by-product of the code deferring deletion of a timer when
>> zloop_timer_end() is called. If, after callling zloop_timer_end(), you
>> then immediately add another timer with the same 'arg' value, then the
>> reactor's zombie cleanup loop will blow away both.
>>
>> By the time the reactor gets back to recalculating the next delay (by
>> calling s_tickless_timer() on line 357 ), there are no timers in the
>> self->timers list, so a timeout of 1 hour is used.
>>
>> Andy
>>
>>>
>>> -Pieter
>>>
>>> On Thu, Dec 27, 2012 at 5:01 PM, Andy Ballingall TF
>>> <ballingall at thefoundry.co.uk> wrote:
>>>> Hi Pieter,
>>>>
>>>> On 27 December 2012 13:55, Pieter Hintjens <ph at imatix.com> wrote:
>>>>> On Thu, Dec 27, 2012 at 2:53 PM, Andy Ballingall TF
>>>>> <ballingall at thefoundry.co.uk> wrote:
>>>>>
>>>>>> Note how my timer registration succeeds, but the polling seems to
>>>>>> ignore it?
>>>>>
>>>>> Then it's because the tickless timer is being miscalculated; it should
>>>>> be re-calculated each time you add or remove a timer handler. If not,
>>>>> there's a bug.
>>>>
>>>> I agree with Apostolis below that the zloop code seems correct, with
>>>> respect to the calculation of the next timer expiry. Replicating my
>>>> scenario with test code would be quite a bit of work, so I took
>>>> another look at the zloop implementation again, and I think I can see
>>>> what's happening.
>>>>
>>>> Firstly, my timer mutations were occurring within a message handler,
>>>> and the mutation consisted of:
>>>> a) Remove existing timer
>>>> b) Create a new one.
>>>>
>>>> The old removed timer and the new timer both have the same 'arg' (in
>>>> my case this is an instance of my communications class).
>>>>
>>>> The reactor loop (in function zloop_start() ) first processes any poll
>>>> handlers, then at the end, checks if the zombies list has any timers
>>>> to clear.
>>>>
>>>> By the time the zloop_start() function gets to line 420 (where it
>>>> checks for zombies), then the state is:
>>>>
>>>> Zombies: 1 (arg='this')
>>>>
>>>> Timers: 2
>>>> 1 = old timer (arg='this', expiry=<old time>)
>>>> 2 = new timer (arg = 'this', expiry = <new time>)
>>>>
>>>> The loop at line 423 will blow away all the timers - including the one
>>>> I just added - as it matches the zombie arg.
>>>>
>>>> That explains why my newly added timer wasn't firing, and why the poll
>>>> timeout was set to 1 hour: It's because my newly added timer had been
>>>> removed almost immediately!
>>>>
>>>> If the timer deletion used a reference to the timer object rather than
>>>> 'arg' to reference it, then it would ensure that I could call
>>>> zloop_timer_end() and subsequently zloop_timer() from within a
>>>> handler, without the zloop_start() function removing that newly added
>>>> timer...
>>>>
>>>> In the end, this isn't a problem for me because I've changed my
>>>> implementation to use a heartbeat timer which never gets removed and
>>>> replaced, but I can see scenarios in the future where it would be good
>>>> to reset an existing timer, or delete and add a number of timers
>>>> inside a handler...
>>>>
>>>> Anyway, thanks for your help.
>>>>
>>>> Andy
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>>
>>>>> -Pieter
>>>>> _______________________________________________
>>>>> zeromq-dev mailing list
>>>>> zeromq-dev at lists.zeromq.org
>>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>>>
>>>>
>>>>
>>>> --
>>>> Andy Ballingall
>>>> Senior Software Engineer
>>>>
>>>> The Foundry
>>>> 6th Floor, The Communications Building,
>>>> 48, Leicester Square,
>>>> London, WC2H 7LT, UK
>>>> Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906
>>>> Web: http://www.thefoundry.co.uk/
>>>>
>>>> The Foundry Visionmongers Ltd.
>>>> Registered in England and Wales No: 4642027
>>>> _______________________________________________
>>>> zeromq-dev mailing list
>>>> zeromq-dev at lists.zeromq.org
>>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>> _______________________________________________
>>> zeromq-dev mailing list
>>> zeromq-dev at lists.zeromq.org
>>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
>>
>>
>>
>> --
>> Andy Ballingall
>> Senior Software Engineer
>>
>> The Foundry
>> 6th Floor, The Communications Building,
>> 48, Leicester Square,
>> London, WC2H 7LT, UK
>> Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906
>> Web: http://www.thefoundry.co.uk/
>>
>> The Foundry Visionmongers Ltd.
>> Registered in England and Wales No: 4642027
>> _______________________________________________
>> zeromq-dev mailing list
>> zeromq-dev at lists.zeromq.org
>> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev



-- 
Andy Ballingall
Senior Software Engineer

The Foundry
6th Floor, The Communications Building,
48, Leicester Square,
London, WC2H 7LT, UK
Tel: +44 (0)20 7968 6828 - Fax: +44 (0)20 7930 8906
Web: http://www.thefoundry.co.uk/

The Foundry Visionmongers Ltd.
Registered in England and Wales No: 4642027
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.zeromq.org/pipermail/zeromq-dev/attachments/20121228/2db9d310/attachment.htm>


More information about the zeromq-dev mailing list