[Dovecot] Newbie questions: Load-balanced Dovecot with NFS storage
Hi,
I experimented with Dovecot a while back and our site is now taking
the plunge to switch to it from the UW IMAP server when we rejig the
topology of our mail store and IMAP servers.
Currently we have a farm of 5 IMAP servers running the UW server.
Each has locally attached disk used to store people's mailboxes.
Each user is allocated to one of these five servers and can only
access their mail via that server. Our aim is to remove this
restriction to provide resilience in the event of a server failing.
Our thoughts...
- We are envisaging having a farm of N (to be determined) servers running Dovecot. These will be closed systems: users do not have shell access, and can only reach their mail via IMAP.
- A single mailstore common to all N servers will be hosted on our NetApp filer and NFS-mounted on each server.
- Each server would be running Solaris 10. If we use all-new kit it will be x86-based, or we may have to redeploy some of our existing SPARC boxes to the new service.
- We will be converting user mailboxes to be in Maildir format.
- Our hardware load balancer, which supports persistent sessions, will be used to distribute users between the servers. This could either direct a user to any one of the N servers, or could be set to 'prefer' a given server for a particular user (only failing over to one of the others if their 'usual' server was unavailable).
More on this in my questions below. :-)
I have been scouring the Dovecot list, archives and Wiki and think I
have come up with a set of issues we need to be aware of and things
we have to do in order to make this work reliably.
Please would you let me know if I have missed or misunderstood anything?
Issues To Watch For
- For timestamp integrity our servers will be synched with NTP to our local time server.
- Because sessions could be on different servers, memory mapping of index files doesn't work well with NFS, so set mmap_disable=yes in dovecot.conf.
- Dovecot relies on the mtime timestamp of mailbox files, so the NFS-mounted mailstore needs to be mounted with this option on each server: actimeo=0 (Are there any other mount options we should use too?)
- We will be using filesystem quotas on users' mailboxes. We understand that Dovecot's index files are best on no-quota filestores so will store these separately.
Questions
Q1. Would it be better to store the index files on NFS-shared filestore and direct users to any of the IMAP server machines? Or to store the index files on local disk and direct each user to their 'preferred' IMAP server machine?
Q2. Does Dovecot (or "something") clean out old index files that haven't been accessed for a while? Eg, when a user has temporarily come through on a different IMAP server to normal. Or do the index files sit there untouched for evermore?
Q3. Storing the index files on the NetApp filer would give us the ability to share them between servers and grow their volume as need be, but at the cost of performance. How big do the index files get? Are they typically a few kBytes per message? Per mailbox? Per user? Or more?
Q4. We will be using Exim as the MTA, which can deliver direct to Maildir mailboxes. However I understand that Dovecot's "deliver" LDA adds the benefit of updating the index files as each message is delivered. Is this a significant gain? Or is there little difference in actual use?
Q5. We have around 20,000 mail accounts and will therefore be seeing lots of concurrent IMAP sessions, usually secure (SSL) ones. I have seen mention that this can give rise to "Too many open files" errors under Solaris. How do we avoid this when we are likely to have several thousand concurrent IMAP sessions per server machine?
With many thanks for your time and advice...
Cheers, Mike Brudenell
-- The Computing Service, University of York, Heslington, York Yo10 5DD, UK Tel:+44-1904-433811 FAX:+44-1904-433740
- Unsolicited commercial e-mail is NOT welcome at this e-mail address. *
On Fri, Mar 02, 2007 at 04:12:10PM +0000, Mike Brudenell wrote:
- Because sessions could be on different servers memory mapping of
index files doesn't work well with NFS, so set mmap_disable=yes in dovecot.conf.
If your load balancer is set up to have persistent servers based upon user criteria of some sort, you could actually store the indexes on local drives on each machine. Worst case scenario, if a user flipped to a different box in the cluster, Dovecot would have to rebuild its index, increasing CPU and I/O. Best case, you see a performance gain by using local store and reducing NFS traffic.
- Dovecot relies on the mtime timestamp of mailbox files so the NFS-mounted mailstore needs to be mounted with these options on each server: actimeo=0 (Are there any other mount options we should use too?)
This is absolutely critical, yes. Without this, all sorts of nasty things can happen. You can also use the "noac" attribute, at least as of Solaris 8.
- We will be using filesystem quotas on users mailboxes. We understand that Dovecot's index files are best on no-quota filestores so
will store these separately.
The only issue here is dotlocking. If your MTA or if Dovecot uses dotlocks of any kind, you'll need to be sure those are also outside of your quota-restricted filestore. Otherwise, you are going to need to be sure that the filesystem quota is never actually reached, and that the limit is artificially set lower than the filesystem quota in some other manner.
Q1. Would it be better to store the index files on NFS-shared
filestore and direct users to any of the IMAP server machines? Or to store
the index files on local disk and direct each user to their 'preferred' IMAP server machine?
Our plan is to store index files on local store and load balance to persistent servers. Sure, the persistent cache table expires over time, but then again, the indexes get out of date over time anyway.
Q2. Does Dovecot (or "something") clean out old index files that haven't been accessed for a while? Eg, when a user has temporarily come through on a different IMAP server to normal. Or do the index
files sit there untouched for evermore?
They sit untouched forever. Feel free to remove them after they get to be of a certain age.
Q3. Storing the index files on the NetApp filer would give us the
ability to share them between servers and grow their volume as need be,
but at the cost of performance. How big do the index files get? Are they typically a few kBytes per message? Per mailbox? Per user? Or more?
I considered this, but my concern is reliability. Dovecot's index files seem a bit "delicate" in recent patches and I'm afraid of possible issues of sharing them between servers, especially if there are multiple IMAP sessions open on different servers. I'm not familiar with ultimate size at this point, but it depends on the maximum size of the mailboxes and folders I suppose.
Q4. We will be using Exim as the MTA, which can deliver direct to
Maildir mailboxes. However I understand that Dovecot's "deliver" LDA
adds the benefit of updating the index files as each message is
delivered. Is this a significant gain? Or is there little difference in
actual use?
No experience with the LDA component.
Q5. We have around 20,000 mail accounts and will therefore be seeing
lots of concurrent IMAP sessions, usually secure (SSL) ones. I have
seen mention that this can give rise to "Too many open files" errors
under Solaris. How do we avoid this when we are likely to have several thousand concurrent IMAP sessions per server machine?
Yow. Thousands of concurrent IMAP sessions *per* server? All using SSL? With only 20,000 mail accounts? Are you sure about that? That seems like an awfully high active-reader ratio given the low number of accounts. Still, if true, it is what it is and needs to be accommodated.
Obviously increasing the number of systems in your cluster is one way to fight it. I know there was a recent bug in Dovecot that was causing file descriptor leaks, but if I recall it was fixed in a recent patch.
Still, if you are going to have thousands of concurrent IMAP sessions, I would consider making sure you have a good number of systems in your cluster.
I'm not sure if you are using a webmail client, such as Squirrelmail, but if so you may also want to consider running an IMAP proxy server to keep sessions open and persistent between page loads.
-- Dean Brooks dean@iglou.com
Just to expand and give some more information in response to Dean's
helpful message, in case it helps anyone else when replying to my
enquiry...
On 2 Mar 2007, at 16:28, Dean Brooks wrote:
On Fri, Mar 02, 2007 at 04:12:10PM +0000, Mike Brudenell wrote:
If your load balancer is set up to have persistent servers based upon user criteria of some sort, you could actually store the indexes on local drives on each machine. Worst case scenario, if a user flipped to a different box in the cluster, Dovecot would have to rebuild its index, increasing CPU and I/O. Best case, you see a performance gain by using local store and reducing NFS traffic.
I believe for IMAP sessions we only have the option of maintaining
persistence based on the IP address of the person making the request,
not on user credentials used for authentication etc.
However if we go down the "try to keep a user on their 'preferred'
IMAP server for every session" path I'm planning to use the load
balancer's fail-over facility...
Basically for each load-balanced service you can set up two pools of
servers. If ANY of the servers in Pool 1 is available they get used;
if none are then a server in Pool 2 is used.
We already give each user their own personal DNS name to access
'their' mail server: for example mine would be
pmb1.imap.york.ac.uk. This currently maps direct to my IMAP server:
imap0.york.ac.uk.
But what I am thinking of is pointing imap0.york.ac.uk at our load
balancer and setting this with the preferred IMAP server for me as
the only machine listed in Pool 1, but with all the other IMAP
servers in Pool 2. Thus whenever I connect I will get routed through
to 'my' IMAP server. However if it is unavailable then I will get
routed to one of the others and persistence will then keep the rest
of my session there.
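As a rough illustration of the two-pool fail-over behaviour described above, here is a minimal Python sketch. It is not the load balancer's actual configuration or algorithm: the function name, the server names and the "available" set are all made up for the example, and a real balancer would also spread load among the servers within a pool.

```python
def choose_server(pool1, pool2, available):
    """Return the first available server, preferring Pool 1 over Pool 2.

    pool1, pool2: ordered lists of (hypothetical) server names.
    available: set of servers currently considered healthy.
    Returns None if nothing in either pool is available.
    """
    for pool in (pool1, pool2):
        for server in pool:
            if server in available:
                return server
    return None


# Hypothetical layout: one preferred server for this user, the rest as fallback.
preferred = ["imap0"]
fallback = ["imap1", "imap2", "imap3", "imap4"]

# Preferred server is up, so it is chosen.
print(choose_server(preferred, fallback, {"imap0", "imap1", "imap2"}))
# Preferred server is down, so the first healthy fallback is chosen.
print(choose_server(preferred, fallback, {"imap2", "imap3"}))
```

Session persistence (keeping the rest of a session on whichever server was chosen) is a separate mechanism in the balancer and is not modelled here.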
Q1. Would it be better to store the index files on NFS-shared filestore and direct users to any of the IMAP server machines? Or to store the index files on local disk and direct each user to their 'preferred' IMAP server machine?
Our plan is to store index files on local store and load balance to persistent servers. Sure, the persistent cache table expires over time, but then again, the indexes get out of date over time anyway.
Assuming our trick of implementing preferred servers works OK I'm
tempted to use local disk for the indexes too. I just need to get a
feel for how big these things grow. Even knowing roughly how much it
needs per message (bytes? Kbytes?) might give me a clue to start
with. Suggestions, anyone?
Q2. Does Dovecot (or "something") clean out old index files that haven't been accessed for a while? Eg, when a user has temporarily come through on a different IMAP server to normal. Or do the index files sit there untouched for evermore?
They sit untouched forever. Feel free to remove them after they get to be of a certain age.
So if we were to have a cron job scan and delete old ones (like we do
with /tmp now) we should be OK? There wouldn't be any Nasty Things
happen if we deleted an index file that turned out to still be in
use, even though it hadn't apparently been used for ages?
Q5. We have around 20,000 mail accounts and will therefore be seeing lots of concurrent IMAP sessions, usually secure (SSL) ones. I have seen mention that this can give rise to "Too many open files" errors under Solaris. How do we avoid this when we are likely to have several thousand concurrent IMAP sessions per server machine?
Yow. Thousands of concurrent IMAP sessions *per* server? All using SSL? With only 20,000 mail accounts? Are you sure about that? That seems like an awfully high active-reader ratio given the low number of accounts. Still, if true, it is what it is and needs to be accommodated.
I've just checked and right at this moment we have between 600 and
800 IMAP server sessions running on each of our 5 servers. I'm
fairly certain that at our peak times we see 1,000-2,000 concurrently
on each server. I'll start eye-balling them over the next few weekdays.
The main point I was trying to make is that there would be
significantly more than the 256 that was mentioned in previous posts
on this subject.
I'm not sure if you are using a webmail client, such as Squirrelmail, but if so you may also want to consider running an IMAP proxy server to keep sessions open and persistent between page loads.
Thankfully this shouldn't be an issue for us. We use the University
of Cambridge's "Prayer" webmail software. Rather than being
something you run under a web server such as Apache it's actually a
custom-written HTTP to IMAP gateway and so gives a number of very
special benefits:
- Persistent browser-to-gateway sessions (where supported)
- Persistent gateway-to-imap-server sessions
- Aggressive caching
- GZipping of data en route to browser (benefits slow connections)
It is VERY speedy to use: a number of our users have been impressed,
even when working at the end of a slow dialup link around the world
from us in Australia.
Cheers, Mike B-)
On Mar 2 2007, Mike Brudenell wrote:
Thankfully this shouldn't be an issue for us. We use the University
of Cambridge's "Prayer" webmail software. Rather than being
something you run under a web server such as Apache it's actually a
custom-written HTTP to IMAP gateway and so gives a number of very
special benefits:
- Persistent browser-to-gateway sessions (where supported)
- Persistent gateway-to-imap-server sessions
- Aggressive caching
- GZipping of data en route to browser (benefits slow connections)
It is VERY speedy to use: a number of our users have been impressed,
even when working at the end of a slow dialup link around the world
from us in Australia.
I'll concur with this note about Prayer. We have a very similar overall setup, including Prayer as one of our two Webmail offerings--on a normal day, it bears over half our user load (something like 35K out of 65K users, with ~8.5K peak concurrent sessions). The interface is heavily modified, but the guts are still Prayer. In our tests it, frankly, bested any other similar free product by a good margin (with the exception of the University of Oregon's AlphaMail).
To get back on topic, we were suffering from the 256-file-descriptor issue under Solaris 10 (with all IMAP/POP3 connections SSL-only), and I can confirm that it is fixed in recent RCs. We're currently ramping up the user count on our initial Dovecot system and so far it is going smoothly.
-Brian Hayden OIT Internet Services University of MN
On 2.3.2007, at 18.58, Mike Brudenell wrote:
Assuming our trick of implementing preferred servers works OK I'm
tempted to use local disk for the indexes too. I just need to get
a feel for how big these things grow. Even knowing roughly how
much it needs per message (bytes? Kbytes?) might give me a clue to
start with. Suggestions, anyone?
My INBOX is 40MB and my index files are currently:
-rw------- 1 cras users   75736 Mar  2 23:36 dovecot.index
-rw------- 1 cras users 1559552 Mar  2 23:36 dovecot.index.cache
-rw------- 1 cras users   95112 Mar  2 23:36 dovecot.index.log
-rw------- 1 cras users  131112 Feb 17 17:32 dovecot.index.log.2
The cache file's size depends heavily on what client is being used
and possibly also things like client-side filter rules etc.
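To put the listing above in proportion, the following Python sketch simply adds up the four file sizes and compares the total with the mailbox size. The byte counts are copied from the listing; reading "40MB" as 40 * 1024 * 1024 bytes is an assumption, and a single mailbox is of course only one data point.

```python
# File sizes in bytes, copied from the listing above.
index_files = {
    "dovecot.index": 75736,
    "dovecot.index.cache": 1559552,
    "dovecot.index.log": 95112,
    "dovecot.index.log.2": 131112,
}

# Assumption: "40MB" is read as 40 * 1024 * 1024 bytes.
inbox_bytes = 40 * 1024 * 1024

total_index_bytes = sum(index_files.values())
ratio = total_index_bytes / inbox_bytes

print(f"index files total: {total_index_bytes} bytes")
print(f"index size relative to mailbox: {ratio:.1%}")
```

For this one mailbox the index files come to a few percent of the mailbox size, with the cache file making up most of it.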
Q2. Does Dovecot (or "something") clean out old index files that haven't been accessed for a while? Eg, when a user has temporarily come through on a different IMAP server to normal. Or do the index files sit there untouched for evermore?
They sit untouched forever. Feel free to remove them after they get to be of a certain age.
So if we were to have a cron job scan and delete old ones (like we
do with /tmp now) we should be OK? There wouldn't be any Nasty
Things happen if we deleted an index file that turned out to still
be in use, even though it hadn't apparently been used for ages?
You could find them based on atime, unless of course you've disabled atime
updates. Anyway, the worst that can happen when deleting an index file
that's already in use is that Dovecot logs an error and disconnects the
user. The next login will work.
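A cron-driven cleanup along these lines could be sketched in Python as below. This is only an illustration under stated assumptions: the function name is made up, the age threshold is whatever the site chooses, and it relies on atime updates being enabled on the filesystem holding the index files.

```python
import os
import stat
import time


def remove_stale_files(root, max_age_days, now=None):
    """Delete regular files under root not accessed for max_age_days.

    Uses the access time (atime) of each file. Directories and symlinks
    are left alone. Returns the list of paths that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
                if stat.S_ISREG(info.st_mode) and info.st_atime < cutoff:
                    os.remove(path)
                    removed.append(path)
            except FileNotFoundError:
                # The file vanished between listing and removal; skip it.
                continue
    return removed
```

It would be called with the root of the local index tree and a threshold, for example remove_stale_files(index_root, 30), where both the index_root path and the 30-day figure are hypothetical values to be chosen per site. It should only ever be pointed at the index tree.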
Mike Brudenell wrote:
Our plan is to store index files on local store and load balance to persistent servers. Sure, the persistent cache table expires over time, but then again, the indexes get out of date over time anyway.
Assuming our trick of implementing preferred servers works OK I'm tempted to use local disk for the indexes too. I just need to get a feel for how big these things grow. Even knowing roughly how much it needs per message (bytes? Kbytes?) might give me a clue to start with. Suggestions, anyone?
We have indexes on a separate partition for our staff. It works out at 650GB folders + 90GB inboxes + 8.3GB indexes (both folders + inboxes). For one of our undergrad partitions it is 35GB folders + 150GB inboxes + 6.4GB indexes.
This is using mbox format, and I rather suspect most of the staff folders are accessed infrequently and may not even have any Dovecot indexes yet!
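As a quick sanity check on those figures, this Python sketch works out index size as a fraction of total mail for the two partitions. The numbers are copied from the message above; the function and dictionary names are made up for the example, and the result applies to mbox storage at one site only.

```python
# Figures in GB, copied from the message above: (folders, inboxes, indexes).
partitions = {
    "staff": (650, 90, 8.3),
    "undergrad": (35, 150, 6.4),
}


def index_overhead(folders_gb, inboxes_gb, indexes_gb):
    """Return index size as a fraction of total mail (folders + inboxes)."""
    return indexes_gb / (folders_gb + inboxes_gb)


for name, sizes in partitions.items():
    print(f"{name}: indexes are {index_overhead(*sizes):.1%} of mail storage")
```

Both partitions come out at a low single-digit percentage, with the less frequently accessed staff folders pulling their figure down.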
Yow. Thousands of concurrent IMAP sessions *per* server? All using SSL? With only 20,000 mail accounts? Are you sure about that? That seems like an awfully high active-reader ratio given the low number of accounts. Still, if true, it is what it is and needs to be accommodated.
I've just checked and right at this moment we have between 600 and 800 IMAP server sessions running on each of our 5 servers. I'm fairly certain that at our peak times we see 1,000-2,000 concurrently on each server. I'll start eye-balling them over the next few weekdays.
The main point I was trying to make is that there would be significantly more than the 256 that was mentioned in previous posts on this subject.
We use "login_process_per_connection = no" and change the number of file descriptors for the Dovecot master process with a "plimit -n 4096". You can also change the default in Solaris by modifying /etc/system (check the list archives for details - we discussed this recently).
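For readers unfamiliar with the mechanism behind that limit, the Python sketch below inspects and raises the soft file descriptor limit of the current process. This is not how Dovecot or plimit is invoked: plimit adjusts another, already running process, whereas this only shows the same soft/hard limit idea for the calling process. The function name is made up for the example.

```python
import resource


def raise_nofile_soft_limit(wanted):
    """Raise this process's soft RLIMIT_NOFILE towards `wanted`.

    The soft limit is never lowered and never set above the hard limit.
    Returns the (soft, hard) pair in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited or soft >= wanted:
        return soft, hard  # already high enough, nothing to do
    target = wanted if hard == unlimited else min(wanted, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # The platform refused the new value; keep the existing limits.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)


print(raise_nofile_soft_limit(4096))
```

An unprivileged process can raise its soft limit only as far as the hard limit, which is why a system-wide default may also need changing.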
I'm not sure if you are using a webmail client, such as Squirrelmail, but if so you may also want to consider running an IMAP proxy server to keep sessions open and persistent between page loads.
Thankfully this shouldn't be an issue for us. We use the University of Cambridge's "Prayer" webmail software.
We use Prayer too, but I considered using imapproxy (http://www.imapproxy.org) for, say, Squirrelmail. I'd have thought that Dovecot's indexes would make the proxy unnecessary, though, unless authentication is very expensive (and Dovecot can cache that too). imapproxy can't usefully cache SELECTs, so Dovecot would probably have to read the cache/index files for each connection.
We have NetApps too, and I've been thinking about moving the mail store to them (possibly converting to maildir on the way), but unfortunately, the Powers That Be have decided all staff are moving to Exchange (with message store on the NetApps via iSCSI) and the students may be moved to an outsourced solution!
Best Wishes, Chris
--
Christopher Wakelin, c.d.wakelin@reading.ac.uk
IT Services Centre, The University of Reading, Tel: +44 (0)118 378 8439
Whiteknights, Reading, RG6 2AF, UK Fax: +44 (0)118 975 3094
Greetings -
I've got Dovecot 1.0rc24 up and running with both Maildirs and index
files in the same user directory on NFS-mounted filestore. Now I'm
slowly edging things into a better, more efficient and NFS-friendly
setup...
Currently I'm trying to move the index and control files off the NFS-mounted quota'd filestore and onto local disk without quotas.
Q1. We are going to be running multiple load-balanced servers. I know
I can safely put the index files onto locally attached disk and they
are rebuilt as and when needed.
But is the same true for control files? Or do they need to be
preserved and made available to any server the user may come in
through? (This would imply having to have a different NFS-mounted
filestore, as the NFS-mounted mailstore will have filestore quotas
enabled, and I understand control files can be unhappy in this
environment when the user runs out of quota.)
To break up large directories I want to create these in the form
/var/dovecot/X/username/control
/var/dovecot/X/username/index
where X is the first letter of the username. So in dovecot.conf I have:
mail_location = maildir:/mailstore/%1u/%u/Maildir:INDEX=/var/dovecot/%1u/%u/index:CONTROL=/var/dovecot/%1u/%u/control
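To see what that setting resolves to for a given user, here is a small Python sketch that expands only the two variables used in it. The helper name is made up, and it assumes %u is the full username and %1u its first character, as described in the message; it does not handle any other Dovecot variables.

```python
def expand_user_vars(template, username):
    """Expand only %1u and %u in a mail_location-style template.

    Assumption for this sketch: %u is the full username and %1u is its
    first character. Other variables are left untouched.
    """
    return template.replace("%1u", username[:1]).replace("%u", username)


# The setting from the message above.
MAIL_LOCATION = (
    "maildir:/mailstore/%1u/%u/Maildir"
    ":INDEX=/var/dovecot/%1u/%u/index"
    ":CONTROL=/var/dovecot/%1u/%u/control"
)

# Print each colon-separated part on its own line for readability.
for part in expand_user_vars(MAIL_LOCATION, "pmb1").split(":"):
    print(part)
```

For the user pmb1 the index directory in this template resolves to /var/dovecot/p/pmb1/index.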
Q2. An alternative would be to arrange the index and control directories
so they both hang off a single per-user area:
/var/dovecot/X/username/control
/var/dovecot/X/username/index
However this again assumes I can safely delete files in the control
directory tree if they aren't touched for a while. Is this true,
or are they 'valuable' and should be preserved for the lifetime of
the account?
(This is in part Q1 but phrased differently, I guess. :-)
I also hit some ownership/protection problems. I'd originally
assumed that I just had to create the top level
/var/dovecot/{control,index}
directories and these could be owned root:root with mode 755. But
Dovecot fails to create these and logs entries such as:
dovecot: Mar 07 12:00:47 Error: IMAP(pmb1)[2835]: mkdir(/var/dovecot/index/p/pmb1/.INBOX) failed: Permission denied
Experimentation suggests that the instance of Dovecot wanting to
create the directories is running under the user's uid/gid (me, in
this case). Hence I now believe:
- I need to manually pre-create the next level down directories /var/dovecot/{control,index}/[a-z]
- Set these to be owned root:root with mode rwxrwxrwt
- I can periodically clean out 'old' files in the index tree. (Not sure about the control tree: hence Q1 and Q2 above.)
This seems to work OK... I just wanted to check I hadn't missed
something simpler and blindingly obvious?
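The pre-creation step above could be scripted roughly as follows. This Python sketch is only an illustration: the function name is made up, it covers only the letters a-z as in the message, and it does not change ownership (run as root, the directories would come out owned by root).

```python
import os
import string


def precreate_letter_dirs(base, subtrees=("control", "index")):
    """Create base/<subtree>/<a-z> directories with mode 1777 (rwxrwxrwt).

    Returns the list of per-letter directory paths.
    """
    created = []
    for subtree in subtrees:
        for letter in string.ascii_lowercase:
            path = os.path.join(base, subtree, letter)
            os.makedirs(path, exist_ok=True)
            # chmod explicitly: a mode passed to makedirs is filtered by umask.
            os.chmod(path, 0o1777)
            created.append(path)
    return created
```

The sticky bit in mode 1777 is what lets each user's Dovecot process create its own directory under the letter directory without being able to remove other users' directories, the same arrangement as /tmp.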
Cheers, Mike B-)
On Wed, 2007-03-07 at 12:15 +0000, Mike Brudenell wrote:
But is the same true for control files? Or do they need to be preserved and made available to any server the user may come in through?
Control files are important. If you delete them the messages will get new UIDs, which causes clients to download them again. That's especially bad with POP3 if the client is configured to leave messages on the server.
Experimentation suggests that the instance of Dovecot wanting to
create the directories is running under the user's uid/gid (me, in
this case). Hence I now believe:
- I need to manually pre-create the next level down directories /var/dovecot/{control,index}/[a-z]
- Set these to be owned root:root with mode rwxrwxrwt
Yea. I guess some day in future Dovecot could support creating the home dir while still running as root.
Thanks for the reply, Timo! So just to double-check/clarify...
On 7 Mar 2007, at 13:08, Timo Sirainen wrote:
On Wed, 2007-03-07 at 12:15 +0000, Mike Brudenell wrote:
But is the same true for control files? Or do they need to be preserved and made available to any server the user may come in through?
Control files are important. If you delete them the messages will get new UIDs, which causes clients to download them again. That's especially bad with POP3 if the client is configured to leave messages on the server.
OK, so as we want to use file system quotas that sounds as if I need
this setup:
- Maildirs stored in one NFS-mounted directory tree with quotas enabled.
- Control files stored in another NFS-mounted directory tree *WITHOUT* quotas enabled, as per http://wiki.dovecot.org/Quota/FS ??? Then NFS-mount this one area on all of the load-balanced IMAP servers? (I'm assuming potentially several instances of Dovecot running on different server machines will handle this OK over NFS?)
- Index files stored in directory tree on locally attached disk. (And a cron job to clean old ones out periodically.)
With regard to mount options for the above:
- Needs no special NFS-mount options.
- Does this need any special mount options? (eg, "actimeo=0" as for index file on NFS?)
- If this had been on NFS-mounted filestore I'd have needed the "actimeo=0" mount option and to use "mmap_disable = yes". But for locally attached disk I don't need to use either of these.
With many thanks, Mike B-)
On Wed, 2007-03-07 at 14:31 +0000, Mike Brudenell wrote:
OK, so as we want to use file system quotas that sounds as if I need
this setup:
- Maildirs stored in one NFS-mounted directory tree with quotas enabled.
- Control files stored in another NFS-mounted directory tree *WITHOUT* quotas enabled, as per http://wiki.dovecot.org/Quota/FS ??? Then NFS-mount this one area on all of the load-balanced IMAP servers? (I'm assuming potentially several instances of Dovecot running on different server machines will handle this OK over NFS?)
Yep.
- Index files stored in directory tree on locally attached disk. (And a cron job to clean old ones out periodically.)
With regard to mount options for the above:
- Needs no special NFS-mount options.
- Does this need any special mount options? (eg, "actimeo=0" as for index file on NFS?)
actimeo=0 would be a good idea, otherwise there could be some rare problems when two computers are accessing the same user's mailbox at the same time.
- If this had been on NFS-mounted filestore I'd have needed the "actimeo=0" mount option and to use "mmap_disable = yes". But for locally attached disk I don't need to use either of these.
Right.
participants (5): bhayden@umn.edu, Chris Wakelin, Dean Brooks, Mike Brudenell, Timo Sirainen