[zeromq-dev] Scalable data stores?

John D. Mitchell jdmitchell at gmail.com
Wed Jun 1 08:56:47 CEST 2011

On May 31, 2011, at 17:12 , Pieter Hintjens wrote:
> Well, the overall semantics are file based, i.e. objects will be much
> larger than memory size.

Then that sounds like you do really want a DFS approach.

I've worked on a project that used MogileFS and, while it basically worked, it was annoying and brittle and didn't really deliver on its promises (e.g. it didn't save money).

I haven't tried Ceph but I'm curious about it.

I've also worked with Hadoop and I wouldn't recommend it for this use case. Similarly, I wouldn't use things like Cassandra for the actual storage of the big data blobs.

Are you in control of the server nodes, or are you trying to do this in a more decentralized way? That potentially changes quite a bit.

> A use case for this project would be video streaming on a large scale,
> where video is being produced by several hundred/thousand nodes at one
> side, and consumed by several thousand/million at the other, but not
> necessarily in realtime. I.e. if everything is persisted to disk, and
> indexed, one can ask for archival or recent video using the same
> semantics as asking for live video.

This also reminds me of the Spotify discussion that went around recently:

Have fun,
