[zeromq-dev] Redundancy/Failover Capabilities?

Martin Sustrik sustrik at fastmq.com
Mon Oct 13 13:24:42 CEST 2008


Hi,

> I would posit that the correct behavior should be different:
> 
> 1. The application at initialization should be allowed to specify the 
> option for auto-reconnect which is true by default
> 
> 2. In case of disconnect, the application *should* be notified. 
> Clearly, you can further allow the application to turn off these 
> notifications, but I think that would tend towards baroqueness, and 
> notification by default, which the app can ignore, is not bad behavior.
> 
> In most trading applications, automatic and hidden queueing of messages 
> would lead to disasters - as in an order sending application or even a 
> canonical application receiving market data. In an order sending 
> application, if I cannot send order messages then I - as the operator - 
> need to be notified immediately so I can correct things.
> 
> For example in the receiving market data case, we delineate the 
> following cases of no-data:
> 
> 1. There is *no-data* by the generating process - the matching engine is 
> not generating prints and new prices - and I am getting no data - good.
> 2. There is *no-data* by the generating process - the matching engine is 
> not generating prints and new prices - and I am getting data - kinda 
> absurd and extremely rare scenario
> 3. There is data being generated by the generating process - but my 
> network is not receiving it and hence neither am I - bad
> 4. There is data being generated by the generating process - and my 
> network is receiving it, but the internal broadcasting process is not 
> sending it to me - bad
> 5. There is data being generated by the generating process - and the 
> network is fine and my internal broadcasting process is sending it but 
> the "last meter" network is down so I cannot receive it - bad
> 
> As you can imagine there can be a few other scenarios - but we would like 
> to know about #5 as soon as possible via disconnect notifications.

I didn't want to confuse people with too much detail... Anyway, the 
idea looks like this:

1. The primary goal is to separate management and business logic.
2. Management needs to know about disconnects. The old callback 
infrastructure stays in place. I am thinking of two separate 
notifications - 'disconnected' and 'reconnected'. You can do whatever 
you need in the handlers: raise an alarm, send SMS messages, kill or 
restart other applications etc.
3. However, business logic should be shielded from management issues 
(i.e. from connect/disconnect events).
4. Thus the sender should be allowed to publish messages even if it is 
disconnected. It will get no notification and the messages will be sent 
once it gets reconnected.
5. The receiver won't be notified about the disconnection. It will 
continue receiving messages from its local queue (if there are any).
6. If disconnection causes message loss, the receiver may opt to be 
notified about it. However, it will be notified only when it tries to 
receive the missing messages, not immediately when the disconnect occurs. 
Imagine 0MQ getting messages 1,2,3,etc. After getting message 5 the 
connection breaks. When the connection resumes, it'll get messages 
10,11,12,etc. What the client application will see is the following 
sequence of 'messages': 1,2,3,4,5,loss_notification,10,11,12,...
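
To make points 2, 5 and 6 concrete, here is a rough Python sketch of the 
receiving side. It is a toy model and not the 0MQ API: the class 
ReceiverSketch, its methods and the LOSS marker are all hypothetical names, 
and detecting loss through a gap in per-message sequence numbers is an 
assumption made only for the illustration.

```python
from collections import deque

# Hypothetical marker standing in for 'loss_notification'.
LOSS = "loss_notification"


class ReceiverSketch:
    """Toy model of the receiving side (illustration only, not 0MQ code)."""

    def __init__(self, on_disconnected=None, on_reconnected=None):
        self.queue = deque()      # local queue read by the business logic
        self.last_seq = None      # sequence number of the last queued message
        self.on_disconnected = on_disconnected
        self.on_reconnected = on_reconnected

    # Called by the (imaginary) network layer.
    def connection_lost(self):
        # Management is told right away; the queue is left untouched.
        if self.on_disconnected:
            self.on_disconnected()

    def connection_restored(self):
        if self.on_reconnected:
            self.on_reconnected()

    def deliver(self, seq, payload):
        # Assumption: a gap in sequence numbers means messages were lost.
        # The marker is queued in order, so the reader meets it only when
        # it reaches the point where the missing messages would have been.
        if self.last_seq is not None and seq != self.last_seq + 1:
            self.queue.append(LOSS)
        self.queue.append(payload)
        self.last_seq = seq

    # Called by the business logic; it never sees connect/disconnect events.
    def receive(self):
        return self.queue.popleft()


def demo():
    events, seen = [], []
    rx = ReceiverSketch(
        on_disconnected=lambda: events.append("disconnected"),
        on_reconnected=lambda: events.append("reconnected"),
    )
    for seq in range(1, 6):        # messages 1..5 arrive
        rx.deliver(seq, seq)
    rx.connection_lost()           # connection breaks after message 5
    for _ in range(3):             # reading from the local queue still works
        seen.append(rx.receive())
    rx.connection_restored()
    for seq in range(10, 13):      # messages 10..12 arrive after reconnect
        rx.deliver(seq, seq)
    while rx.queue:
        seen.append(rx.receive())
    return events, seen
```

Calling demo() returns the management events in one list and what the 
business logic read in another, so the two concerns stay separate as 
described in point 1.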

Makes sense?

Martin


