On Mon, 2009-08-10 at 14:33 -0700, Seth Mattinen wrote:
Nothing forces you to switch from maildir, if you're happy with it :) But if you want to support millions of users, it's simpler to distribute the storage and disk I/O evenly across hundreds of servers using a database that was designed for it. And by databases I mean here some of those key/value-like databases, not SQL. (What's a good collective name for those dbs anyway? BASE and NoSQL are a couple names I've seen.)
Timo, I've been thinking the exact same thing as you lately. As mail moves away from traditional "pop3" users toward online storage in the form of webmail, the scalability of maildir for large multi-gigabyte mailboxes goes out the window; loading "cur" in that type of scenario takes WAY too long. Gmail on Maildir isn't possible. I can't speak for anyone else, but my users are moving to webmail, and POP users are becoming rare.
My current thinking is a key/value store as you've proposed. Something like Hadoop components or Project Voldemort. Voldemort might be a better fit from what I've read. The main issue here is that applications such as local delivery, as well as pop/imap access, would need to be rewritten to support this. Obviously, creating a Hadoop- or Voldemort-aware local delivery agent means being able to stay away from writing a complete MTA. Likewise, if one treats IMAP as the main way of accessing a mailbox (with proxies for POP3, for example), then a new local delivery agent and an IMAPd with key/value "smarts" would be all that is needed to create this system.
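To make that a bit more concrete, here is a rough Python sketch of the two pieces. It is only an illustration: DictStore is a hypothetical in-memory stand-in for the real store, and the key layout and function names are my own assumptions, not anything taken from Hadoop, Voldemort or Dovecot.

```python
class DictStore:
    """Hypothetical in-memory stand-in for a distributed key/value store."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def deliver(store, user, raw_message):
    """Local delivery: store the message, then append its id to the mailbox index.

    The read-modify-write on the index is not safe under concurrent
    delivery; a real store would need versioning or atomic appends.
    """
    index_key = f"{user}/INBOX/index"
    index = list(store.get(index_key, []))
    msg_id = len(index) + 1
    store.put(f"{user}/INBOX/{msg_id}", raw_message)
    index.append(msg_id)
    store.put(index_key, index)
    return msg_id


def list_mailbox(store, user):
    """IMAP side: listing is one index lookup, not a directory scan."""
    return list(store.get(f"{user}/INBOX/index", []))


def fetch(store, user, msg_id):
    """IMAP side: fetch a single message by id."""
    return store.get(f"{user}/INBOX/{msg_id}")
```

The point of the separate index key is that opening a mailbox costs one lookup regardless of how many messages it holds, which is where loading "cur" hurts today.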
My current thinking is having the local delivery break messages up into their component pieces (headers, from address, to address, spam scores, body, etc.) and store them as various key:value relationships. Combine this with the replication support of systems such as Hadoop or Voldemort and you end up with a massively scalable system based on commodity hardware. You get rid of RAID completely, remove NFS servers and replace them with a cluster of "beige boxes" with ~4 drives each. Redundancy is handled by the native replication in the key:value application itself (Voldemort, for example, can replicate up to 3 times) rather than on each machine, so yes, you would store a single message more than once, but if each of your "beige box" storage systems has 4*2TB drives, your cost of storage is far less than what traditional NFS server manufacturers charge.
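Roughly what I have in mind, as a Python sketch. The key names, the X-Spam-Score header and the placement rule in replica_nodes are all assumptions for illustration; a real store such as Voldemort would do its own partitioning and replication internally.

```python
import hashlib
from email import message_from_string


def split_message(user, msg_id, raw):
    """Break one raw message into key -> value pairs (hypothetical key layout)."""
    msg = message_from_string(raw)
    prefix = f"{user}/{msg_id}"
    if msg.is_multipart():
        # Multipart bodies are kept as their serialized parts in this sketch.
        body = "".join(part.as_string() for part in msg.get_payload())
    else:
        body = msg.get_payload()
    return {
        f"{prefix}/from": msg.get("From", ""),
        f"{prefix}/to": msg.get("To", ""),
        f"{prefix}/spam-score": msg.get("X-Spam-Score", ""),  # assumed header name
        f"{prefix}/headers": "\n".join(f"{k}: {v}" for k, v in msg.items()),
        f"{prefix}/body": body,
    }


def replica_nodes(key, nodes, copies=3):
    """Pick up to `copies` distinct nodes for a key by hashing it.

    This only illustrates storing each piece on several beige boxes;
    it is not how any particular store actually places data.
    """
    start = int(hashlib.sha1(key.encode()).hexdigest(), 16) % len(nodes)
    count = min(copies, len(nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(count)]
```

With this layout, something like a webmail message list only needs the small from/to/headers keys and never has to touch the bodies.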
Anyways, this is just something that's currently floating in my head...
Paul