[Dovecot] Scalability plans: Abstract out filesystem and make it someone else's problem

Seth Mattinen sethm at rollernet.us
Tue Aug 11 11:15:14 EEST 2009


Timo Sirainen wrote:
> On Aug 11, 2009, at 2:16 AM, Seth Mattinen wrote:
> 
>>> Show me a clustered filesystem that can guarantee that each file is
>>> stored in at least 3 different data centers and can scale linearly by
>>> simply adding more servers (let's say at least up to thousands).
>>
>> Easy, AFS. It is known to support tens of thousands of clients [1] and
>> it's not exactly new. Like supporting the quirks of NFS, the quirks of a
>> clustered filesystem could be found and dealt with, too.
> 
> I was more thinking about thousands of servers, not clients. Each server
> should contribute to the amount of storage you have. Buying huge
> storages is more expensive. Also it would be nice if you could just keep
> plugging in more servers to get more storage space, disk I/O and CPU and
> the system would just automatically reconfigure itself to take advantage
> of those. I can't really see any of that happening easily with AFS.

While that would be fancy, I don't think that level of integration would
be compatible with abstracting the filesystem per the original plan, so
I didn't consider it. I only considered robust, site-independent,
scalable storage, as you asked for. ;)

The OpenAFS FAQ is worth a read, at least, to see what it offers and
which ideas you could incorporate:
http://www.dementia.org/twiki/bin/view/AFSLore/GeneralFAQ

It focuses on "users", but you can pretend a user is really a "server
running Dovecot". AFS also uses Kerberos, which alone would probably
disqualify it for the purposes of simple Dovecot replication. I
picked on AFS because it closely matches the scale you were looking for.


>> Key/value databases are hardly a magic bullet for redundancy. You don't
>> get 3 copies in different datacenters by simply switching to a
>> database-style storage.
> 
> Some (several?) of them can be somewhat easily configured to support
> that. (That's what their web pages say, anyway.)

Well, so can a global filesystem designed to do precisely that at the
block level. No advantage here.
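
To illustrate the point: "three copies in three data centers" is a
placement policy, and it looks the same whether the thing underneath
stores blocks, files, or key/value pairs. Below is a minimal sketch,
assuming a hypothetical list of servers tagged with a data center name.
None of these names come from Dovecot, AFS, or any real product; the
selection uses rendezvous (highest-random-weight) hashing purely as one
possible technique.

```python
import hashlib

# Hypothetical inventory: server name -> data center. Illustration only.
SERVERS = {
    "store1": "dc-a", "store2": "dc-a",
    "store3": "dc-b", "store4": "dc-b",
    "store5": "dc-c", "store6": "dc-d",
}

def score(key: str, server: str) -> int:
    """Deterministic pseudo-random weight for a (key, server) pair."""
    digest = hashlib.sha256(f"{key}|{server}".encode()).digest()
    return int.from_bytes(digest[:8], "big")

def place_replicas(key: str, servers: dict, copies: int = 3) -> list:
    """Pick `copies` servers for `key`, each in a different data center."""
    # Step 1: within each data center, keep the highest-scoring server.
    best_per_dc = {}
    for server, dc in servers.items():
        s = score(key, server)
        if dc not in best_per_dc or s > best_per_dc[dc][0]:
            best_per_dc[dc] = (s, server)
    if len(best_per_dc) < copies:
        raise ValueError("not enough data centers for the requested copies")
    # Step 2: keep the highest-scoring data centers.
    ranked = sorted(best_per_dc.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(server, dc) for dc, (_, server) in ranked[:copies]]

if __name__ == "__main__":
    for server, dc in place_replicas("user42/INBOX/msg-0001", SERVERS):
        print(server, dc)
```

Adding a server to SERVERS only moves the keys for which the new server
scores highest, which is the "just plug in more servers" property being
asked for. It says nothing about how the copies are kept consistent,
which is the hard part in either design.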


>>> Clustered filesystems are also complex. They're much more complex than
>>> what Dovecot really requires.
>>
>> I mention it because you stated wanting to outsource the storage
>> portion. The complexity of whatever database engine you choose or
>> supporting a clustered filesystem (like NFS) is a wash since you're not
>> maintaining either one personally.
> 
> I also want something that's cheap and easy to scale. Sure, people who
> already have NFS/AFS/etc. systems can keep using Dovecot with the
> filesystem backends, but I don't think it's the cheapest or easiest
> choice. There's a reason why e.g. Amazon S3 isn't running on top of them.

S3 isn't really a fair comparison. There's Google FS too, but they're
both purpose-built systems.

Now, keep in mind that I have not personally used AFS with Dovecot. My
point is to not dismiss building on a clustered filesystem just because
it's old and lacks sex appeal compared to the backend that Facebook uses.
UUCP is ancient too, but for disconnected endpoints it still blows away
the stupid SMTP tricks many people see as modern.

~Seth

