[zeromq-dev] Welcome to the "zeromq-dev" mailing list

Tim Crowder crowdert at yahoo-inc.com
Tue Jun 3 00:21:40 CEST 2014


Hi Jeremy-

You've set an SLA: 1M messages max.
If any client falls behind, you have to start saving messages to spinning rust.
So, you need to provide disk storage with sufficient bandwidth up-front,
just in case.

If you go the acknowledge route, then you have to make the server smart.
You still have to supply the persistent storage part, along with some logic
to track the clients. It has to know how many clients there are, and where
they are at, and make a number of arbitrary decisions.

If a client has a drive failure, or shuts down uncleanly, you'll still have to have
a way to recover. This may involve backing up in the queue for that client,
further than the last acknowledgement. So, now the server has to store more than
you expected, *and* you need a way for clients to request from a specific point.

Given that, you can make your code much simpler from the outset, and just
store up to your SLA limit on the server (which tests that you actually have
capacity), and do explicit pulls from your clients.
The server is relatively simple/dumb:
  On message, store to disk, throw away blocks >1M messages past.
  On request, seek to the appropriate spot, read, and shove it in a socket.
The client is relatively simple:
  Request messages at last offset.
  On reply, process messages, advance offset.
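
A minimal sketch of the dumb server / simple client split above, in Python.
It keeps everything in memory so it runs as-is; a real server would append to
files on disk and answer requests over a socket. The names StreamLog,
PullClient, read and poll are made up for illustration and are not part of
ZeroMQ or of anything in this thread.

```python
from collections import deque
from itertools import islice

SLA_LIMIT = 1_000_000  # from the thread: keep at most 1M messages


class StreamLog:
    """Append-only log that retains only the newest `limit` messages."""

    def __init__(self, limit=SLA_LIMIT):
        self.limit = limit
        self.messages = deque()
        self.base = 0  # offset of the oldest message still retained

    def append(self, msg):
        # "On message, store, throw away what is past the limit."
        self.messages.append(msg)
        if len(self.messages) > self.limit:
            self.messages.popleft()
            self.base += 1

    def read(self, offset, count):
        # "On request, seek to the appropriate spot and read."
        if offset < self.base:
            raise LookupError(
                "offset %d is older than retained base %d" % (offset, self.base)
            )
        start = offset - self.base
        batch = list(islice(self.messages, start, start + count))
        return batch, offset + len(batch)


class PullClient:
    """Client that remembers its own offset and pulls from there."""

    def __init__(self, log, offset=0):
        self.log = log
        self.offset = offset
        self.processed = []

    def poll(self, count=100):
        batch, next_offset = self.log.read(self.offset, count)
        for msg in batch:
            self.processed.append(msg)  # stand-in for real processing
        self.offset = next_offset  # advance only after processing
        return len(batch)
```

A client that has fallen further behind than the limit gets a LookupError
here, which is the point where the SLA has been exceeded and somebody has to
decide what happens next.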

You can optimize performance by keeping the last block in memory,
since most clients shouldn't be too far behind, most of the time.
Why add complexity?

Also, you don't have to have 10 connections for 10 streams.
Your request format could be {stream_name,offset, count} to fetch
a set of messages from the appropriate stream. Of course, you'll
need to identify which stream the replies are from, and the next offset.
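
To make the request format concrete, here is a rough sketch in Python. The
JSON encoding and the field names ("stream", "offset", "count",
"next_offset", "messages") are assumptions for illustration only, as are the
helper names make_request and handle_request; the resulting byte strings could
be sent as ordinary frames over whichever socket pair you pick.

```python
import json


def make_request(stream_name, offset, count):
    """Client side: ask for up to `count` messages starting at `offset`."""
    request = {"stream": stream_name, "offset": offset, "count": count}
    return json.dumps(request).encode("utf-8")


def handle_request(raw, streams):
    """Server side: `streams` maps a stream name to a list of messages,
    where the list index is the message offset (hypothetical storage)."""
    request = json.loads(raw.decode("utf-8"))
    name = request["stream"]
    offset = request["offset"]
    batch = streams[name][offset:offset + request["count"]]
    reply = {
        "stream": name,  # tells the client which stream this is
        "next_offset": offset + len(batch),  # where to ask from next time
        "messages": batch,
    }
    return json.dumps(reply).encode("utf-8")
```

With this shape one connection can serve all ten streams: the client keeps
one offset per stream name and updates it from "next_offset" in each reply.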

Cheers!
.timrc

________________________________
From: zeromq-dev-bounces at lists.zeromq.org [zeromq-dev-bounces at lists.zeromq.org] on behalf of Jeremy Richemont [jrichemont at gmail.com]
Sent: Monday, June 02, 2014 5:59 AM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] Welcome to the "zeromq-dev" mailing list

Hi Tim. I agree, a message can only be deemed to have been 'sent' if an acknowledgement for it is received. As soon as no ack is received the server must start backing up messages - but only for that client. It's perfectly possible to have one client up-to-date, one that's catching up from 200 messages ago and another catching up from 1000 messages ago. The server will need to know about each client - either because it's been configured to expect a certain set of clients or through some discovery mechanism whereby the client connects and tells the server about itself.

I kind of like the latter version as it's more flexible but it is more complex to implement.

The situation is also complicated by the fact that the server handles ten independent message streams. In principle a client may subscribe to all ten, but in practice I think I'd keep it one to one: a system requiring all ten streams would create ten clients, one subscribed to each.

Cheers;

Jeremy


On 30 May 2014 20:03, Tim Crowder <crowdert at yahoo-inc.com> wrote:
Hi Jeremy-

You might take a look at existing systems for how they handle it.
Apache Kafka, for instance, always maintains a backlog of messages,
via fast-append to logfiles. Clients keep track of the "offset" of the
last message they processed, and actively pull new messages from
the publisher.

Even if you know when the connection dropped, it's hard to know
which message was completely processed (vs delivered) last.
So, without client pulls, you need explicit client acknowledgement of
the last processed message.
This means that you have to keep saving/buffering messages until all
clients acknowledge them, then you can discard messages up to that point.
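
For comparison, the acknowledgement bookkeeping might look roughly like this
in Python. It is only a sketch under assumptions: the set of clients is known
up front, messages live in an in-memory list, and the names AckedBuffer,
publish, ack and pending are invented for illustration.

```python
class AckedBuffer:
    """Buffer messages until every known client has acknowledged them."""

    def __init__(self, client_ids):
        self.base = 0  # offset of the first message still buffered
        self.messages = []
        # For each client, the offset of the next message it still needs.
        self.acked = {cid: 0 for cid in client_ids}

    def publish(self, msg):
        self.messages.append(msg)

    def ack(self, client_id, offset):
        """Client reports it has fully processed everything up to and
        including `offset`."""
        end = self.base + len(self.messages)
        wanted = min(offset + 1, end)  # ignore acks past what was published
        self.acked[client_id] = max(self.acked[client_id], wanted)
        self._discard()

    def _discard(self):
        if not self.acked:
            return
        low = min(self.acked.values())  # slowest client decides
        drop = low - self.base
        if drop > 0:
            del self.messages[:drop]
            self.base = low

    def pending(self, client_id):
        """Messages this client has not acknowledged yet, in order."""
        start = self.acked[client_id] - self.base
        return self.messages[start:]
```

Note that nothing is discarded until the slowest client acknowledges, so one
dead client makes the buffer grow without bound unless you also cap it.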

Cheers!
.timrc

________________________________
From: zeromq-dev-bounces at lists.zeromq.org [zeromq-dev-bounces at lists.zeromq.org] on behalf of Jeremy Richemont [jrichemont at gmail.com]
Sent: Friday, May 30, 2014 7:21 AM
To: ZeroMQ development list
Subject: Re: [zeromq-dev] Welcome to the "zeromq-dev" mailing list

Thanks, Charles. I did, in fact, find that pattern. The problem is that it does not match what I am trying to do. That pattern is for when you have state + deltas. What I have is a continuous message stream which, once started to client x, must be preserved even if client x dies for a bit (not forever, of course: I put an SLA of 1 million messages/client). When the client reconnects, every message it missed is replayed, in order, and then the live stream resumes.

It needs to handle n clients, any of which may drop and reconnect so each one will need an independent message cache. PUB/SUB will not do for this because I may need to send messages 10 - 100 to client x on reconnect but 50 - 200 to client y.

Asking for state is a good idea - ask for missing updates in my case - but the question remains: how does the server know that the client is no longer available and that it must therefore start backing up messages from a PUB socket? The client can't tell it over OOB because it has already died.

If I could just query PUB and get a list of clients plus a notification when one drops that'd solve the problem I think. But how to do that?

Jeremy

On 30 May 2014 15:00, Charles Remes <lists at chuckremes.com> wrote:
Take a look at the Clone pattern in the zguide.

http://zguide.zeromq.org/page:all#Reliable-Pub-Sub-Clone-Pattern

This might be what you need.

cr

On May 29, 2014, at 11:20 AM, Jeremy Richemont <jrichemont at gmail.com> wrote:

>
> Hi. I am struggling to work out how to use zmq to implement the architecture I need. I have a classic publish/subscribe situation, except that once client x has subscribed to a topic, I need the topic data destined for it to be cached if the client dies and resent on reconnect. The data order is important and I can't miss messages should the client be offline for a while.
>
> The PUB/SUB pattern doesn't seem to know about individual clients and will just stop sending to client x if it dies. Plus I can't find out this has happened and cache the messages, or know when it reconnects.
>
> To try to get around this I used the REQ/REP pattern so the clients can announce themselves and have some persistence but this is not ideal for a couple of reasons:
>
> 1) The clients must constantly ask "got any data for me?" which offends my sensibilities
>
> 2) What happens if there's no data to send to client x but there is to client y? Without zmq I'd have had a thread per client and simply block the one with no data but I can't block client x without also blocking client y in a single thread.
>
> Am I trying to shove a round peg in a square hole, here? Is there some way I can get feedback from PUB saying 'failed to send to client x', so I can cache the messages instead? Or is there some other pattern I should be using?
>
> Otherwise it's back to low level tcp for me...
>
> Many thanks;
>
> Jeremy
>
> _______________________________________________
> zeromq-dev mailing list
> zeromq-dev at lists.zeromq.org
> http://lists.zeromq.org/mailman/listinfo/zeromq-dev
