Hey there,

I am working on migrating a Dovecot system with 50,000+ users and 5 TB of mail data to a new setup that scales better. I found it difficult to find a good solution. In this email I lay out what I have learned, in the hope that others can benefit from it, and that others can help me improve this design before I actually start migrating.

My aim is to get a clustered file system setup for Dovecot, with the additional constraint of not having to pay some vendor on a per-mailbox basis.

Why a clustered file system? I believe this yields the best solution. The Dovecot machines become stateless: they only consume shared services (storage and database). That makes scaling and recovery relatively easy.

Why the constraint of not paying per mailbox? If you paid a fee per mailbox, you would probably be better off just using a managed mail service altogether.

High-level
========

We are going to use ObjectiveFS as our clustered file system (I am not affiliated with them in any way). They use FUSE to make it behave like a POSIX filesystem. Due to the way they compact files that belong together, it handles a large number of small files relatively well. The great thing is that we can mount this filesystem on multiple machines. This piece of software is proprietary, but its price is very reasonable and it is a fixed monthly cost. We can use S3, Google Cloud Storage, or others as the actual storage system.

We store metadata about the mailboxes in a database that is available to all machines. We can use any managed database service for this. We explicitly set the proxy_host for each user in this database, so a given user always gets routed to the same machine. We make sure to distribute the users evenly among the available machines.
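To make the "distribute the users" step concrete, here is a minimal sketch in Python. It is only an illustration: the function names and the example addresses are made up, and it simply gives each new user to the backend that currently has the fewest users. A real setup might weigh by mailbox size instead.

```python
from collections import Counter


def pick_backend(counts, backends):
    """Return the backend with the fewest assigned users (ties broken by name)."""
    return min(backends, key=lambda host: (counts.get(host, 0), host))


def distribute(users, backends):
    """Map every user to a proxy_host so that the load is spread evenly."""
    counts = Counter()
    assignment = {}
    for user in users:
        host = pick_backend(counts, backends)
        assignment[user] = host  # this value would be stored as proxy_host
        counts[host] += 1
    return assignment


if __name__ == "__main__":
    # Hypothetical users and backend addresses, for illustration only.
    for user, host in distribute(["alice", "bob", "carol"], ["10.0.0.1", "10.0.0.2"]).items():
        print(user, "->", host)
```

The resulting mapping would then be written to the proxy_host column of the shared database.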

All traffic is received by a TCP load balancer. This will forward the traffic to the underlying machines. Ideally this is a managed cloud service. I think there are two valid choices. But I prefer the first one:
That traffic is now routed to N machines. Each of those machines runs Dovecot, configured with Dovecot Proxy (not to be confused with HAProxy's PROXY protocol), so that Dovecot will proxy the connection to another machine if required, or continue on this machine if you are already on the correct proxy_host. I believe this can be done using proxy_maybe. Depending on the choice of load balancer, you may need to configure Dovecot to understand HAProxy's PROXY format.
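The routing decision I have in mind can be sketched as follows. This is not Dovecot code, just a small Python model of the behaviour I expect from proxy_maybe; the function name, the dictionary layout, and the addresses are assumptions for illustration.

```python
def route_login(user_row, local_addresses):
    """Decide whether a login is served locally or proxied elsewhere.

    user_row: the database row for the user, containing "proxy_host".
    local_addresses: the set of addresses that belong to this machine.
    """
    target = user_row["proxy_host"]
    if target in local_addresses:
        # We are already on the user's machine: handle the session here.
        return ("local", target)
    # Otherwise forward the connection to the machine that owns the user.
    return ("proxy", target)


if __name__ == "__main__":
    this_machine = {"10.0.0.2"}  # hypothetical address of this node
    print(route_login({"proxy_host": "10.0.0.2"}, this_machine))
    print(route_login({"proxy_host": "10.0.0.3"}, this_machine))
```

The point is that every node can accept any connection from the load balancer, and at most one extra hop is needed to reach the right machine.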

We are going to use the Maildir format and store those files on the ObjectiveFS filesystem. The Dovecot index files are stored on a local SSD.

Choice motivations
===============

ObjectiveFS:
  • Another clustered file system would probably work as well.
  • But ObjectiveFS works quite well because it has support for handling many small files, and it can be configured to use a local SSD disk.
  • With this software it is quite easy to get a clustered file system running.
Regarding the mailbox format:
Why not use Dovecot Director as well?
  • You cannot run Dovecot Director on the same machine as your Dovecot backend / proxy. Thus we would need to introduce two additional hosts just to accept and route the traffic. This would increase the complexity of our solution.
  • But it could be built on top of this setup fairly easily.
  • However, I currently don't see many advantages to Director.
    • One big advantage would be automatic removal of failed nodes. However, this only works if you use Dovemon, which is packaged with OX/Dovecot and thus requires paying per user.
    • When you want to take a node down, Director gives you a slightly easier API to do so, namely doveadm director add <backend server ip> 0. However, we can imitate this simply by updating our database and changing the proxy_host to another destination.
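Taking a node down by updating the database could look roughly like the sketch below. It uses Python with an in-memory SQLite database purely for illustration; the table name users, its columns, and the addresses are assumptions, and a real setup would talk to the managed database instead.

```python
import sqlite3


def drain_backend(conn, old_host, remaining_hosts):
    """Move all users away from old_host, round-robin over remaining_hosts.

    Returns the number of users that were moved.
    """
    rows = conn.execute(
        "SELECT username FROM users WHERE proxy_host = ? ORDER BY username",
        (old_host,),
    ).fetchall()
    for i, (username,) in enumerate(rows):
        new_host = remaining_hosts[i % len(remaining_hosts)]
        conn.execute(
            "UPDATE users SET proxy_host = ? WHERE username = ?",
            (new_host, username),
        )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per user with the machine it is pinned to.
    conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY, proxy_host TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "10.0.0.1"), ("bob", "10.0.0.1"), ("carol", "10.0.0.2")],
    )
    print(drain_backend(conn, "10.0.0.1", ["10.0.0.2", "10.0.0.3"]), "users moved")
```

Note that this only changes where new logins end up. I assume sessions that are already open on the drained node stay there until they reconnect or are kicked.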
Load balancer:
  • You could also set up your MX records to point directly to your Dovecot nodes. I guess this should work as well.
  • But I think a load balancer gives better control than changing and managing DNS.
Questions
========

In general I'd like to know whether this is a good idea!

Some other questions I have at this point, which could make or break this setup:
  • Can a Dovecot backend instance be configured to also run Dovecot Proxy?
  • Can a Dovecot Proxy node receive HAProxy PROXY traffic?
Details
=====

Once I am sure I am on the right track, I'd like to post the settings I use for the software as well. But let's first discuss the points above!

Best,
Roel van Duijnhoven