[Dovecot] How can we horizontally scale Dovecot across multiple servers?

Thu Nov 3 12:42:36 EET 2011

On 31/10/2011 11:28, Felipe Scarel wrote:
> Quick question about the usage of DRBD: I'm thinking of a setup on my
> organization here (15k+ users, 4TB of email data), but I'm holding back on
> the clusterization due to the high volume of data.
>
> Using DRBD would implicate mirroring those 4TB of data across all cluster
> nodes? If yes, I might go with a SAN-based solution, though I haven't

I think the technique with DRBD is something like having pairs of
machines, each of which is a backup for the other, so as I understand
it the data is mirrored only within a pair rather than across every
node.  There were some old notes on the Dovecot website about such a
setup.

Roughly, I seem to recall that each pair of machines ran two virtual
machines, one active on each node, and either could migrate to the
other node if needed.  Add a bunch of such paired nodes to get to the
performance you require and put a Dovecot proxy instance in front of
the whole lot.
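
To make the paired layout a bit more concrete, here is a rough Python
sketch of the routing decision the proxy would have to make.  It is
only a model of the idea, not Dovecot configuration: the class, the
node names and the user table are all made up for illustration, and I
am assuming a static user-to-VM table because each user's mail lives
on exactly one pair.

```python
from dataclasses import dataclass


@dataclass
class MailVM:
    """One mail-serving virtual machine on a DRBD pair (hypothetical model)."""
    name: str
    home_node: str      # node where the VM normally runs
    partner_node: str   # the other node in the pair, holding the replica
    home_up: bool = True

    def current_host(self) -> str:
        # The VM runs at home unless that node has failed, in which
        # case it is assumed to have migrated to its partner.
        return self.home_node if self.home_up else self.partner_node


def make_pair(node_a: str, node_b: str) -> list:
    """Two VMs per pair, one active on each node, each able to move to the other."""
    return [
        MailVM(f"vm-{node_a}", home_node=node_a, partner_node=node_b),
        MailVM(f"vm-{node_b}", home_node=node_b, partner_node=node_a),
    ]


def route(user: str, assignments: dict) -> str:
    """Look up the user's VM and return the node hosting it right now."""
    return assignments[user].current_host()


# Hypothetical layout: two pairs, so four VMs in total.
vms = make_pair("a1", "a2") + make_pair("b1", "b2")

# Static user-to-VM table (assumption: maintained by the administrator).
assignments = {"alice": vms[0], "bob": vms[1], "carol": vms[2]}
```

With this model, route("alice", assignments) points at node a1 until
vms[0].home_up is set to False, after which it points at a2; users on
the other pair are unaffected.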

In contrast, the SAN solution uses a clustered filesystem (opinion
varies on which performs best) and then in theory every machine has
access to every mailbox.  In practice access to the SAN is relatively
slow compared with local storage, so the technique seems to be to store
indexes on the local machine and then have the front-end proxy be
somewhat "sticky" in returning users to the same backend node, so that
the indexes can be re-used and not rebuilt.
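
One way to get that kind of stickiness without keeping any per-user
state is to hash the username against the list of backends.  The
sketch below uses rendezvous (highest-random-weight) hashing, which is
my own choice for illustration and not something I know Dovecot to
use; the function name and backend names are hypothetical.

```python
import hashlib


def sticky_backend(user: str, backends: list) -> str:
    """Pick a backend for a user with rendezvous hashing.

    Every backend gets a score derived from (backend, user); the
    highest score wins.  The same user always maps to the same backend
    while that backend is in the list, so its local indexes get re-used.
    """
    def score(backend: str) -> int:
        digest = hashlib.sha256(f"{backend}|{user}".encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    return max(backends, key=score)


# Hypothetical backend IMAP nodes, all mounting the same SAN filesystem.
backends = ["imap1", "imap2", "imap3", "imap4"]
```

The useful property is that taking one backend out of the list only
moves the users who were on that backend; everyone else stays where
their indexes already are.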

The DRBD solution offers local disk access speed to the node and would
on the surface give far faster performance (if disk were the limiting
issue).  However, it's likely to be more complex to maintain and
manage, and without buying licences you get only failover between pairs
of machines.  The SAN solution in theory looks like perfect scaling:
one big backend, and you just add more backend IMAP nodes as you need
them.  All the clever stuff moves to the front-end load balancer, which
has to be "sticky", and obviously that becomes your main maintenance
problem.

However, based on evidence from users of big systems, IO is likely to
be your main bottleneck, so in theory the SAN will only scale until it
runs out of IOs...  Using local disk for indexes would tend to reduce
the number of IOs needed from the SAN very dramatically, but there is
still some limit out there and it's a question of whether you will
reach it.  DRBD in theory has unlimited scale-out, because each time
you add another pair you get more IO as well as more CPU.
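
A back-of-envelope way to compare the two limits, as a Python sketch.
Every number you feed it is a placeholder you would have to measure
yourself; the function names, the per-user IO rate and the share of IO
that goes to indexes are assumptions of mine, and the model ignores the
cost of DRBD replicating writes to the peer.

```python
def san_max_users(san_iops_budget: float,
                  iops_per_user: float,
                  index_fraction_local: float = 0.0) -> int:
    """Users a shared SAN can carry before its IO budget runs out.

    index_fraction_local is the share of each user's IO that is index
    traffic served from local disk and so never reaches the SAN.
    """
    san_iops_per_user = iops_per_user * (1.0 - index_fraction_local)
    return int(san_iops_budget / san_iops_per_user)


def drbd_max_users(pairs: int,
                   iops_per_pair: float,
                   iops_per_user: float) -> int:
    """Users a set of DRBD pairs can carry; capacity grows with each pair."""
    return int(pairs * iops_per_pair / iops_per_user)
```

The SAN figure is capped by one fixed budget however many IMAP nodes
you add, and moving index IO to local disk only raises that cap; the
DRBD figure keeps growing linearly with the number of pairs.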

I don't have the fortune to have anything like the volume of users you
have, so I have no first-hand opinion to offer...  However, I think the
above accurately summarises your options.  Others might help clarify
the likely bounds on performance of each solution and the maintenance
headaches (e.g. some have had problems with maildir mounted on
OCFS/GFS2 and fixed that by moving to dbox, etc.).

Please report on your results!  Good luck

Ed W