[Dovecot] Scalability plans: Abstract out filesystem and make it someone else's problem

paulmon paulm at spider.org
Mon Sep 28 19:00:04 EEST 2009





On Mon, 2009-08-10 at 14:33 -0700, Seth Mattinen wrote:

> Nothing forces you to switch from maildir, if you're happy with it :)
> But if you want to support millions of users, it's simpler to distribute
> the storage and disk I/O evenly across hundreds of servers using a
> database that was designed for it. And by databases I mean here some of
> those key/value-like databases, not SQL. (What's a good collective name
> for those dbs anyway? BASE and NoSQL are a couple names I've seen.)


Timo, I've been thinking exactly the same thing lately.  As mail moves away
from traditional "pop3" users toward online storage in the form of webmail,
the scalability of maildir for large multi-gigabyte mailboxes goes out the
window; loading "cur" in that type of scenario takes WAY too long.  Gmail on
Maildir isn't possible.  I can't speak for anyone else, but my users are
moving to webmail, and POP users are becoming rare.

My current thinking is a key/value store as you've proposed.  Something like
Hadoop components or Project Voldemort.  Voldemort might be a better fit
from what I've read. The main issue here is that applications such as local
delivery, as well as pop/imap access, would need to be rewritten to support
this.  Obviously, creating a Hadoop- or Voldemort-aware local delivery agent
means being able to stay away from writing a complete MTA. Likewise, if one
treats IMAP as the main way of accessing a mailbox (with proxies for POP3,
for example), then a new local delivery agent and an IMAPd with key/value
"smarts" would be all that is needed to create this system.

My current thinking is to have the local delivery agent break messages up
into their component pieces (headers, from address, to address, spam scores,
body, etc.) and store them as various key:value relationships.  Combine this
with the replication support of systems such as Hadoop or Voldemort and you
end up with a massively scalable system based on commodity hardware.  You get
rid of RAID completely, remove NFS servers and replace them with a cluster of
"beige boxes" with ~4 drives each.  Redundancy is handled by the native
replication in the key:value application itself (Voldemort, for example, can
replicate up to 3 times) across machines, so yes, you would store a single
message more than once, but if each of your "beige box" storage systems has
4*2TB drives, your cost of storage is far less than what traditional NFS
server manufacturers charge.
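
A minimal sketch of the splitting step described above, in Python.  It is
only an illustration under stated assumptions: the "msg:<id>:<field>" key
layout, the split_message() name and the X-Spam-Score header are all
hypothetical, a plain dict stands in for the key/value store (no Hadoop or
Voldemort client API is used here), and only single-part text messages are
handled.

```python
from email import message_from_string
from email.utils import parseaddr


def split_message(msg_id, raw):
    """Break one raw message into key:value pairs.

    The key layout "msg:<id>:<field>" is an assumption made for this
    sketch, not a scheme used by Dovecot or by any key/value store.
    """
    msg = message_from_string(raw)
    prefix = "msg:%s:" % msg_id
    return {
        # Bare addresses, so they can be looked up without parsing headers.
        prefix + "from": parseaddr(msg.get("From", ""))[1],
        prefix + "to": parseaddr(msg.get("To", ""))[1],
        # X-Spam-Score is an assumed header name; filters differ.
        prefix + "spam_score": msg.get("X-Spam-Score", ""),
        # All headers as one value, for clients that only list mailboxes.
        prefix + "headers": "\n".join("%s: %s" % (k, v) for k, v in msg.items()),
        # Body stored separately so it is fetched only when a message is opened.
        prefix + "body": msg.get_payload(),
    }


if __name__ == "__main__":
    raw = (
        "From: Alice <alice@example.org>\n"
        "To: bob@example.org\n"
        "Subject: hi\n"
        "X-Spam-Score: 0.1\n"
        "\n"
        "Hello Bob\n"
    )
    store = {}  # stand-in for the replicated key/value cluster
    store.update(split_message("42", raw))
    for key in sorted(store):
        print(key, "=>", repr(store[key]))
```

Keeping the headers and the body under separate keys means a mailbox listing
only has to fetch the small header values, which is the case where loading
"cur" on maildir gets slow.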

Anyways, this is just something that's currently floating in my head...

Paul







