[Dovecot] Scalability plans: Abstract out filesystem and make it someone else's problem

Steffen Kaiser skdovecot at smail.inf.fh-brs.de
Tue Aug 11 17:32:06 EEST 2009


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Mon, 10 Aug 2009, Timo Sirainen wrote:

> 4. Implement a multi-master filesystem backend for index files. The idea
> would be that all servers accessing the same mailbox must be talking to
> each others via network and every time something is changed, push the
> change to other servers. This is actually very similar to my previous
> multi-master plan. One of the servers accessing the mailbox would still
> act as a master and handle conflict resolution and writing indexes to
> disk more or less often.

What I don't understand here is:

_One_ server is the master, which owns the indexes locally?
Oh, 5. means that this particular server is initiating the write, right?

You spoke about thousands of servers. If one of them opens a mailbox, it
would need to query all (thousands - 1) other servers to find out which of
them is currently the master of this mailbox. I suppose you need a "home
location" server, which other servers connect to in order to find the
server currently locking (aka acting as master for) this mailbox.

GSM has a home location register that points to the base station currently
managing the user info, because the GSM device is within its reach.
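
To illustrate the "home location" idea, here is a minimal sketch of such a
lookup. Everything in it (the HomeLocationRegister class, the acquire and
release methods, the server names) is hypothetical and not part of Dovecot.
It only shows that one lookup replaces the query to every other server.

```python
class HomeLocationRegister:
    """Hypothetical directory: mailbox name -> server acting as its master."""

    def __init__(self):
        self._masters = {}

    def acquire(self, mailbox, server):
        """Return the current master; register `server` if there is none."""
        return self._masters.setdefault(mailbox, server)

    def release(self, mailbox, server):
        """Drop the entry, but only if `server` is the registered master."""
        if self._masters.get(mailbox) == server:
            del self._masters[mailbox]


hlr = HomeLocationRegister()
# srv-17 opens the mailbox first and becomes its master.
first = hlr.acquire("user@example.org/INBOX", "srv-17")
# srv-42 opens it later and learns which server to talk to.
second = hlr.acquire("user@example.org/INBOX", "srv-42")
```

In a real deployment the register itself would have to be replicated or
partitioned, otherwise it becomes the single point of failure.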

There is also another point I'm wondering about:
index files are "really more like memory dumps", you wrote. So if you
cluster thousands of servers together, you'll most probably have different
server architectures, say 32bit vs. 64bit, CISC vs. RISC, big vs. little
endian, ASCII vs. EBCDIC :). Sharing these memory dumps without another
abstraction layer wouldn't work.
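
A minimal sketch of what such an abstraction layer could look like for a
single record, assuming a made-up layout (uid, flags, modseq) that is NOT
Dovecot's real index format. The point is only that byte order and field
sizes are fixed explicitly instead of being inherited from the host CPU.

```python
import struct

# Hypothetical record layout, not Dovecot's real index format:
#   uid (unsigned 32 bit), flags (unsigned 8 bit), modseq (unsigned 64 bit)
# "!" selects network (big-endian) byte order with standard sizes and no
# padding, so every architecture produces and reads the same bytes.
RECORD = struct.Struct("!IBQ")


def encode_record(uid, flags, modseq):
    """Serialize one record into its architecture-independent form."""
    return RECORD.pack(uid, flags, modseq)


def decode_record(data):
    """Parse bytes produced by encode_record() back into a tuple."""
    return RECORD.unpack(data)


wire = encode_record(1, 0x08, 42)
record = decode_record(wire)
```

A plain memory dump would instead use the native byte order and alignment,
which is exactly what differs between the machines listed above.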

> 5. Implement filesystem backend for dbox and permanent index storage
> using some scalable distributed database, such as maybe Cassandra. This

Although I like the "eventually consistent" part, I wonder about the 
Java-based stuff of Cassandra.

> is the part I've thought the least about, but it's also the part I hope
> to (mostly) outsource to someone else. I'm not going to write a
> distributed database from scratch..

I wonder if the index-backend in 4. and 5. shouldn't be the same.

===

How much work is it to handle the data in the index files?
What if every server forwards its changes to the master and receives
changes from the master to sync its local read-only cache? Then you needn't
handle conflicts (except when the network was down), and writes are
consistent because they all originate from this single master server. The
actual mail data is accessed via another API.

When the current master no longer needs to access the mailbox, it could
hand over the "master" baton to another server currently accessing the
mailbox.
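
The scheme in the two paragraphs above could be sketched roughly like this.
It is a toy in-process model under my own assumptions: Server,
MailboxSession, submit and hand_over are hypothetical names, there is no
network, and the "changes" are just message flags keyed by UID.

```python
class Server:
    """One server accessing the mailbox; `cache` is its local index copy."""

    def __init__(self, name):
        self.name = name
        self.cache = {}   # read-only for everyone except via the master
        self.applied = 0  # sequence number of the last change applied


class MailboxSession:
    """Hypothetical per-mailbox state: one master, many read-only caches."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.master = self.servers[0]
        self.seq = 0
        self.log = []

    def submit(self, origin, uid, flags):
        """`origin` forwards a change; only the master orders and applies it."""
        self.seq += 1
        self.log.append((self.seq, origin.name, uid))
        for server in self.servers:  # the master pushes the ordered change
            server.cache[uid] = set(flags)
            server.applied = self.seq
        return self.seq

    def hand_over(self):
        """The master leaves and passes its role to a server still attached."""
        self.servers.remove(self.master)
        self.master = self.servers[0] if self.servers else None
        return self.master


a, b, c = Server("a"), Server("b"), Server("c")
session = MailboxSession([a, b, c])
session.submit(b, uid=7, flags={"\\Seen"})
session.submit(c, uid=7, flags={"\\Seen", "\\Flagged"})
new_master = session.hand_over()  # "a" is done with the mailbox
```

Because every change gets its sequence number from the one master, all
caches apply the same changes in the same order and no conflict resolution
is needed while the network is up.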

Bye,

- -- 
Steffen Kaiser
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iQEVAwUBSoGA6XWSIuGy1ktrAQKGjggAh9Yjzy2oFI2H8MS2rppm/ug2HWO+9PGX
aTRrzNzj2wTScAL1NrFZrN8Mlc7qK2YfH3rXDbM5Mcw/eC67VQ2P2XcetTY7h5XK
RxFqk5+h3Q06Jiwl0IFQyCxkRzs4bK6cZegjAfSViDfQTx8iQhvXHxioPLvIiFQH
D3lOd7+QUxOLKJyAxejjDM5ez/9OUFXZF9WeWrDGpQYES5HVNND3T288uBwWx5zJ
hwqQI8qR3Fwu9VRSDLpvCx1DjQWGOT7x6DfIaKg2j6IvvSTpH2dMsNg0M3YmLsvY
JyreDtqMlZDLclg00ELx0ORgQVHN5eQpOs/XgmFF0+YBQvAO6mtrUw==
=1GC8
-----END PGP SIGNATURE-----

