[Dovecot] Dovecot load balancing
Hello,
I'd like to figure out how to set up a site running dovecot on multiple computers for load balancing reasons.
I'm currently running dovecot-1.0.14 on a single server with 8 GB of RAM and two quad-core 2.66 GHz Xeons, running FreeBSD 6.2-STABLE SMP.
. the mailbox format was mbox and I successfully migrated to Maildir only
. mailboxes are on an NFS NetApp server, attribute caching is off
. indexes are stored locally
. the number of users is around 2500, some of them having huge mailboxes (~ 1 or 2 GB)
. a large number of users use IMAP but there are also many POP3 users
. the passdb/userdb is OpenLDAP
. for now, I use neither namespaces, nor quotas, nor deliver, nor any plugin, but I plan to use namespaces and ACLs to implement shared mailboxes.
The same machine is running postfix-2.4.6 with amavisd-new-2.5.4 (no spamassassin) and clamav-0.93.1. I'm using procmail-3.22 as the LDA.
I installed dovecot-1.1.2 on non-standard ports for testing and plan to switch to it, as I read on this list that it performs better, load-wise.
This setup (1.0.14) works great except that, compared to the time mailboxes were in mbox format, the load average (as shown by 'top') regularly rises by a significant amount and sometimes goes through the roof.
Under "normal" circumstances, the number of processes is around 1400 and the load oscillates between a low range of 1 to 10 and a high range of 30 to 60. Most of the time, I'd say the load average is around 20.
Every Friday a message (no attachment, decent size) is sent to all users: in such circumstances (but only sometimes, not every time: let's say half the time such a message is sent), the load goes so high that I have to stop dovecot to let procmail deliver the message or even reboot the machine.
Sometimes, the load climbs up to something like 150, then goes back to the 'normal' case described above.
I know multi-master replication is on the roadmap. I know some dovecot sites use several dovecot servers in the meantime. I'd like to know how those sites do the load balancing, the main problem being to direct each user to the same dovecot server each time, as stated in the Wiki (http://wiki.dovecot.org/NFS).
I don't think that DNS round robin would do the trick because some UAs (for instance Thunderbird) often open up to 5 connections for the same user, unless maybe such a UA makes a single 'gethostbyname()' call, thus connecting 5 times to the same physical server?
What are the options to achieve such a setup ? Any successful experiences ?
Thank you
-- Thomas Hummel | Institut Pasteur hummel@pasteur.fr | Pôle informatique - systèmes et réseau
On Thu, Jul 31, 2008 at 2:15 PM, Thomas Hummel hummel@pasteur.fr wrote:
Every Friday a message (no attachment, decent size) is sent to all users: in such circumstances (but only sometimes, not every time: let's say half the time such a message is sent), the load goes so high that I have to stop dovecot to let procmail deliver the message or even reboot the machine.
Do you use dovecot's delivery agent (LDA)? Dovecot's LDA updates dovecot's cache file correctly. If you use a different LDA, dovecot has to re-read directory contents every time (and update the cache file).
I know multi-master replication is on the roadmap. I know some dovecot sites use several dovecot servers in the meantime. I'd like to know how those sites do the load balancing, the main problem being to direct each user to the same dovecot server each time, as stated in the Wiki (http://wiki.dovecot.org/NFS).
If you use multiple servers you can use dovecot's proxy-feature to redirect the user to the correct server. You just need a database to tell dovecot (and the MTA) on how to find the correct location for that specific user.
Chris
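A minimal sketch of the per-user lookup described above, written in Python purely for illustration. The table, the user names and the host names are all hypothetical; in a real setup the mapping would live in LDAP or SQL and Dovecot itself would perform the lookup. The point is only that the backend is a function of the user name, so it does not matter which front-end host a connection first reaches.

```python
# Hypothetical user -> backend table. In practice this would be an LDAP
# attribute or an SQL column consulted by both Dovecot and the MTA.
USER_BACKEND = {
    "alice": "imap1.example.org",
    "bob": "imap2.example.org",
}


def backend_for(user: str) -> str:
    """Return the single backend host that holds this user's mailbox."""
    try:
        return USER_BACKEND[user]
    except KeyError:
        raise LookupError(f"no backend assigned to user {user!r}") from None


# Five parallel connections from one mail client, landing on different
# front-end hosts (as DNS round robin might spread them).
front_ends = ["front-a", "front-b", "front-a", "front-b", "front-a"]

# Whatever the front end, the lookup depends only on the user name,
# so every connection is directed to the same backend.
targets = {backend_for("alice") for _ in front_ends}
```

Because `targets` ends up holding a single host, all of a user's connections reach the same mailbox server, which is what the NFS page of the Wiki asks for.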
On Thu, Jul 31, 2008 at 03:07:28PM +0200, Chris Laif wrote:
Do you use dovecot's delivery agent (LDA)?
As I stated, no. I need the procmail filtering features.
If you use multiple servers you can use dovecot's proxy-feature to redirect the user to the correct server. You just need a database to tell dovecot (and the MTA) on how to find the correct location for that specific user.
I see, something like adding a hostName LDAP attribute to all my users in LDAP and putting this in dovecot-ldap.conf:
pass_attrs=uid=user,userPassword=password,proxy,hostName=host
correct ?
I don't quite understand the proxy_maybe option :
proxy_maybe can be used to implement "automatic proxying". If the proxy destination matches the current connection, the user gets logged in normally instead of being proxied. If the same happens with proxy, the login fails with "Proxying loops" error. This feature exists only in v1.1+.
Besides, let's say you want to split your load between 2 dovecot servers : would you need a 3rd that would do only the proxying ?
-- Thomas Hummel | Institut Pasteur hummel@pasteur.fr | Pôle informatique - systèmes et réseau
On Thu, Jul 31, 2008 at 03:26:06PM +0200, Thomas Hummel wrote:
I don't quite understand the proxy_maybe option :
Also, 2 things which aren't quite clear to me in the Wiki :
a) Password forwarding
Make sure that the authentication succeeds with any given password. You can do this by using empty passwords. v1.1+ requires also that you return nopassword field.
-> Does that mean that the proxy has to accept only empty passwords and that it's the actual IMAP server that deals with the real password?
b) The connections created to the destination server can't be TLS/SSL encrypted.
Does it still work if the client is using SSL/TLS to connect to the proxy ?
-- Thomas Hummel | Institut Pasteur hummel@pasteur.fr | Pôle informatique - systèmes et réseau
On Jul 31, 2008, at 8:44 AM, Thomas Hummel wrote:
On Thu, Jul 31, 2008 at 03:26:06PM +0200, Thomas Hummel wrote:
I don't quite understand the proxy_maybe option :
Also, 2 things which aren't quite clear to me in the Wiki :
a) Password forwarding
Make sure that the authentication succeeds with any given password. You can do this by using empty passwords. v1.1+ requires also that you return nopassword field.
-> Does that mean that the proxy has to accept only empty passwords and that that's the actual imap server that will deal with the actual password ?
b) The connections created to the destination server can't be TLS/SSL encrypted.
Does it still work if the client is using SSL/TLS to connect to the proxy ?
It seems to me it might be better to use something like UltraMonkey
to do load balancing. I'm going to go down this road in a week or so.
Rick
Thomas Hummel wrote:
On Thu, Jul 31, 2008 at 03:26:06PM +0200, Thomas Hummel wrote:
I don't quite understand the proxy_maybe option :
proxy_maybe lets a single server handle both proxied logins for other hosts and local logins. Say user A connects to server 1 but lives on server 2: server 1 proxies the connection on to server 2. User B connects to server 1 and lives on server 1: proxy_maybe lets the login complete locally, even though the user's proxy setting names a specific host (which happens to be server 1).
Also, 2 things which aren't quite clear to me in the Wiki :
a) Password forwarding
Make sure that the authentication succeeds with any given password. You can do this by using empty passwords. v1.1+ requires also that you return nopassword field.
-> Does that mean that the proxy has to accept only empty passwords and that that's the actual imap server that will deal with the actual password ?
The destination host must be set to allow plain text passwords.
b) The connections created to the destination server can't be TLS/SSL encrypted.
Does it still work if the client is using SSL/TLS to connect to the proxy ?
Yes, the initial connection can use SSL/TLS. The proxy authenticates the user with their password; if that succeeds and the user has a proxy attribute set, the connection to the destination host is made in plaintext. What you can do is set up one or more dovecot proxy hosts that have no users assigned and allow only SSL/TLS connections, and behind them a group of backend servers that users are assigned to. Those backends cannot have disable_plaintext_auth = yes in their configuration.
On Thu, Jul 31, 2008 at 10:18:22AM -0400, Eric Toczek wrote:
connection onto server 2. User B connects into server 1 and they live on server 1, so proxy_maybe allows the connect to be made direct even though their proxy setting says they go to a specific host (which happens to be server 1)
You mean that it just saves one imap-login process?
Also, 2 things which aren't quite clear to me in the Wiki :
a) Password forwarding
Make sure that the authentication succeeds with any given password. You can do this by using empty passwords. v1.1+ requires also that you return nopassword field.
-> Does that mean that the proxy has to accept only empty passwords and that that's the actual imap server that will deal with the actual password ?
The destination host must be set to allow plain text passwords.
Granted, but I guess it's the proxy which must accept empty passwords? I don't get where the empty password fits into the picture.
Yes the initial connection can be done using SSL/TLS. What happens is the proxy will do the auth for the user using their password and if it succeeds and they have a proxy attribute setup then the connect is made to the destination host using a plaintext connection. What you can do is setup a dovecot proxy host(s) that has no users assigned to that server and allows only SSL/TLS connections, then on the backend a bunch of servers that users get assigned to but they cannot have: disable_plaintext_auth = yes in the configuration.
Ok, I always use plaintext anyway, usually over SSL/TLS from clients (with a few exceptions for old stuff/bad habits I plan to get rid of).
-- Thomas Hummel | Institut Pasteur hummel@pasteur.fr | Pôle informatique - systèmes et réseau
Do you use dovecot's delivery agent (LDA)?
As I stated, no. I need the procmail filtering features.
Call dovecot's LDA from inside procmail! It will certainly speed up your setup.
procmailrc should contain something like:
DELIVER="/path/to/dovecot/deliver"
:0 w
| $DELIVER
Works like a charm.
Kind regards Sven
On Thu, Jul 31, 2008 at 04:13:51PM +0200, Sven Eulberg wrote:
Call dovecot LDA inside of procmail!
[...]
Works like a charm.
Thanks, I was thinking about something like that.
-- Thomas Hummel | Institut Pasteur hummel@pasteur.fr | Pôle informatique - systèmes et réseau
I don't quite understand the proxy_maybe option :
proxy_maybe can be used to implement "automatic proxying". If the proxy destination matches the current connection, the user gets logged in normally instead of being proxied. If the same happens with proxy, the login fails with "Proxying loops" error. This feature exists only in v1.1+.
This was my feature request. Normally, proxying expects the lookup to return a result and proxies to that server. If the result may actually be "we are here already", you have to make your query return null when you are already on the correct server, in order to disable the proxying. This is simple with an SQL db, but less easy with LDAP.
proxy_maybe automatically checks the result of the proxy lookup, and if you are already on the correct server it doesn't proxy anything (because you are already there).
Ed W
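The difference between proxy and proxy_maybe described above can be sketched as a small decision function. This is an illustration in Python, not Dovecot code: the function name, the return strings and the exception class are made up for the example, and hosts are compared as plain strings, whereas Dovecot compares the proxy destination with the current connection.

```python
class ProxyLoopError(Exception):
    """Stands in for the "Proxying loops" login failure quoted from the Wiki."""


def route_login(local_host: str, user_host: str, proxy_maybe: bool) -> str:
    """Decide what happens to a login arriving at local_host.

    user_host is the host the passdb lookup returned for this user.
    """
    if user_host != local_host:
        # The user lives elsewhere: both proxy and proxy_maybe forward the login.
        return f"proxy to {user_host}"
    if proxy_maybe:
        # Destination is this very server: log the user in normally.
        return "local login"
    # Plain proxy pointing back at ourselves would loop, so the login fails.
    raise ProxyLoopError("Proxying loops")


# User A lives on server2 but connects to server1.
route_a = route_login("server1", "server2", proxy_maybe=True)
# User B lives on server1 and connects to server1.
route_b = route_login("server1", "server1", proxy_maybe=True)
```

With plain proxy, the second case is the one that would need a lookup returning nothing for local users; proxy_maybe removes that requirement.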
On Thu, Jul 31, 2008 at 3:26 PM, Thomas Hummel hummel@pasteur.fr wrote:
On Thu, Jul 31, 2008 at 03:07:28PM +0200, Chris Laif wrote:
Do you use dovecot's delivery agent (LDA)?
As I stated, no. I need the procmail filtering features.
As stated in the other mails, you can use procmail *and* dovecot's LDA. Then your cache files are updated on *every* operation on a mailbox (storing and fetching mails).
If you use multiple servers you can use dovecot's proxy-feature to redirect the user to the correct server. You just need a database to tell dovecot (and the MTA) on how to find the correct location for that specific user.
I see, something like adding a hostName LDAP attribute to all my users in LDAP and putting this in dovecot-ldap.conf:
I've no experience with LDAP. If you use a SQL-db the setup is non-trivial but doable :) For example, you've got 3 servers: If the user hits the right server, he accesses the local mailbox. If he hits the "wrong" server, the SQL-db finds out where his mails are stored and proxies him to the right server.
If you add SQL replication, you've got a nice setup without a single point of failure (SPOF). If one server crashes, some (!) mailboxes are unavailable, but you can restore a backup to one of the live servers and update the mail storage location in the SQL db accordingly.
I do not like NFS that much (various NFS-problems, SPOF), therefore I recommend the above solution.
Chris
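The recovery step mentioned above (restore the backup onto a live server, then update the stored location) might look like this in outline. It is only a sketch: a Python dict stands in for the SQL table, and the function, user and host names are hypothetical. Restoring the mail itself is assumed to have been done separately.

```python
def reassign_users(locations: dict, failed: str, target: str) -> list:
    """Point every user stored on the failed host at the target host.

    Returns the sorted list of users that were moved.
    """
    moved = sorted(user for user, host in locations.items() if host == failed)
    for user in moved:
        locations[user] = target
    return moved


# Hypothetical location table for three servers.
locations = {
    "alice": "srv1",
    "bob": "srv2",
    "carol": "srv2",
    "dave": "srv3",
}

# srv2 has crashed; its mailboxes were restored from backup onto srv3.
moved = reassign_users(locations, failed="srv2", target="srv3")
```

Only the users of the crashed server are touched; everyone else keeps their existing location.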
On Thu, Jul 31, 2008 at 7:15 AM, Thomas Hummel hummel@pasteur.fr wrote:
I don't think that DNS round robin would do the trick because some UAs (for instance Thunderbird) often open up to 5 connections for the same user, unless maybe such a UA makes a single 'gethostbyname()' call, thus connecting 5 times to the same physical server?
I am not seeing the issue. Are you seeing issues with the UA spreading the connections across multiple servers? If so, are the message-ids different for imap://hummel@ on one machine versus another? The same filer for a backend, right?
I assume that authentication, and more, is caching on Dovecot. So the actual LDAP hits are solitary for the whole lifetime of the UA session most likely.
Dovecot as a LDA is ideal, as stated above, it will do the initial indexing upon delivery of messages and dovecot-imap will fixup as needed.
Perhaps the UUIDS for pop and imap can be looked at for speed ups.
You have a lot of processes; what state are they in? If the bulk of them are running because of searches (like crazy Outlook queries or virus checkers that scan 1 GB of mail, etc.), that's a problem that seems a little outside load balancing.
I would expect pre-delivery to be more of an issue than Dovecot, performance-wise, especially with the LDA, hefty caching, and good maildir file naming.
-- Gabriel Millerd
On Thu, Jul 31, 2008 at 09:21:16AM -0500, Gabriel Millerd wrote:
I am not seeing the issue. Are you seeing issues with the UA spreading the connections across multiple servers?
I was thinking of what is described here in the Wiki :
"NFS caching is a big problem when multiple computers are accessing the same mailbox simultaneously. The best fix for this is to prevent it from happening. Configure your setup so that a user always gets redirected to the same server (unless it's down)."
and also
"If the user gets occasionally redirected to another server, the indexes will then be created locally there."
but maybe you mean that it isn't an issue with
"Dovecot v1.1 flushes NFS caches when needed if you set mail_nfs_storage=yes (and mail_nfs_index=yes if indexes are on NFS)." ?
Maybe if indexes are on the filer, each server can see the dovecot.index.cache and the only problem left is nfs attribute caching ?
The same filer for a backend right?
Right.
I would see the pre-delivery being more of a issue than dovecot performance wise especially with the LDA, hefty caching, and good maildir file naming.
What do you mean by good maildir file naming ?
-- Thomas Hummel | Institut Pasteur hummel@pasteur.fr | Pôle informatique - systèmes et réseau
Thomas Hummel wrote:
The same machine is running postfix-2.4.6 with amavisd-new-2.5.4 (no spamassassin) and clamav-0.93.1. I'm using procmail-3.22 as the LDA.
I installed dovecot-1.1.2 on non-standard ports for test and plan to switch to it as I read on this list it performs better, load wise.
Also, Dovecot v1.1 should do better with NFS.
Every Friday a message (no attachment, decent size) is sent to all users: in such circumstances (but only sometimes, not every time: let's say half the time such a message is sent), the load goes so high that I have to stop dovecot to let procmail deliver the message or even reboot the machine.
How about doing some rate limiting in Postfix, or even moving that to a different server?
Anders.
participants (8): Anders Melchiorsen, Chris Laif, Ed W, Eric Toczek, Gabriel Millerd, Rick Romero, Sven Eulberg, Thomas Hummel