Hello to the list! I've been asked to spec out the feasibility and, if feasible, plan a migration from Courier to Dovecot for both POP3 and IMAP for about 4 million mailboxes. I've been trying to absorb all the dovecot-related info I could over the past couple of weeks from the docs and from the list and I'm sure that I've muddled at least some of it, so apologies in advance if I make grossly incorrect assumptions.
BTW, sorry for the giant email but I've got a bunch of questions. This is the sort of overly wordy email that makes my eyes glaze over in mailing lists. But I've got things configured to where (I think) everything will 'just work' if we migrate (barring running POP3 mailbox conversions), so I'm not looking for configuration help, just gotchas and assumption-checking mostly. Me breaking 4 million mailboxes isn't really a good move for my job security ;)
Since Dovecot 2.0 seems like it's just around the corner, that's all I've been testing, and indeed all I've even looked at.
This is a pretty fantastic piece of software. I love the neat little details that make admins' lives easier, e.g. how easy it is to use multiple SSL certs in the same server. That made me smile. I hadn't looked at dovecot in a number of years, so I was really impressed at the full feature set. It's been a while since I've played with a new server application that I've enjoyed this much.
Background: All of our mail is stored on NFS and will be for the foreseeable future, all in maildir format. Courier POP3/IMAP runs on load balanced Linux servers, so clients will be hitting multiple servers, though stickiness based on client IP address is currently done. Webmail also runs on top of a local Courier IMAP server accessing the same NFS servers, but on an entirely different pool of servers. All authentication is done against SQL. Courier has been very, very good to us over the past 9 years, but our NFS servers (12 Netapps) are unfortunately being beaten to death. We've got lots of users with thousands of emails in their inbox (i.e. in ./Maildir/cur/ alone), I've seen plenty of mailboxes with tens of thousands in their inbox, and I've even seen some really scary ones approaching 100k emails just in cur/. Thanks to an endless cycle of the industry raising mailbox limits, we actually have to support this (though not so much the 100k message ones, at least).
Our #1 motivation for looking at Dovecot is relief for our currently overtaxed NFS servers, mostly in the form of the index files. Benchmarking dovecot looks great, even with the index files in the maildir. I know the cool thing to do would be to move the index files to a separate NFS server running entirely on SSDs for the fastest accesses possible. But building a separate index NFS cluster obviously adds complexity, and performance with the index files in the maildir seems pretty great already. I also know that real life often makes assumptions made during benchmarks look very foolish. My inclination is to try things out with the indexes in the maildir directory (and then start looking at moving to a dedicated NFS server for the indexes in a subsequent phase), but I don't want to regret that choice later. Does anybody with large-scale NFS maildir stores have any advice either way?
Exim: We currently deliver all of our mail via Exim on separate servers. Our POP3/IMAP servers only do POP3/IMAP and the Exim mail servers delivering to maildirs only do Exim. From what I can gather from the docs and various threads, the best thing to do in that case would be to use Exim's built-in maildir handling instead of using 'deliver'. That would be my preference anyway, but I wanted to make sure I didn't misinterpret things.
Any problems running Courier POP3 and Dovecot IMAP for a while, possibly Courier IMAP and Dovecot IMAP concurrently?: Since migrating POP mailboxes is going to be a mighty task, our gameplan would probably be to start migrating *just* IMAP services at first and likely on just a few servers at a time. Being load balanced, there are several scenarios where a user could theoretically access a mailbox on different servers and therefore end up hitting dovecot at one point in time and then later hitting courier or vice versa. Or they could use IMAP from one location and POP3 (during the span of time before we migrated our POP3 services) from another. I've not yet seen any side effects from switching mail servers like that (though I'm aware of what would happen if both dovecot and courier *POP3* were in service at the same time -- redownloaded mail, bleh). Has anyone seen otherwise? I'm not worried as much by the 0.1% of IMAP users who might be using advanced features but rather the other 99.9% who are using it pretty vanilla-ishly.
Union mailboxes: I'm pretty sure in a fairly recent thread that Timo said that something like a 'union' mailbox (at least with maildir) wasn't possible. I tried messing with multiple 'private' namespaces (i.e. a namespace called "ARCHIVE" with a "location" different than the INBOX location, ideally placed on slower but denser NFS servers) but even with 'hidden=no' and 'list=yes', only the main INBOX folders would show up, so I'm guessing that's not going to work. That would be a killer feature, to be able to serve an alternate namespace that would show up in a mail client's subscribable list that could be on slower storage than the main inbox (though I'm not sure a mail client can even handle multiple namespaces).
Any problems with keeping only quota limits in db and not current quota numbers? Our limits come out of a SQL table but the current counts just live in the maildir file. Trying to update quota counts in SQL for 4 million mailboxes is a non-starter for us at this point.
From what I've done, dovecot seems pretty happy to maintain the quota using the 'maildir' quota plugin but let it be overridden by the userdb lookup. So I think I'm fine on that count, but I'd love to hear from anyone knowing of extreme gotchas doing it that way.
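For reference, here is a minimal Python sketch of how the current usage can be read straight out of the maildir, without touching SQL. It assumes the "maildir file" mentioned above is the standard Maildir++ `maildirsize` file, whose first line holds the limits (e.g. `10000000S,1000C`) and whose remaining lines are "bytes count" deltas that are summed to get current usage. The function name `read_maildirsize` is made up for illustration and is not part of Dovecot or Courier.

```python
def read_maildirsize(path):
    """Parse a Maildir++ maildirsize file.

    Returns (limits, bytes_used, messages_used), where limits is a dict
    such as {"S": 10000000, "C": 1000}. Assumes the standard Maildir++
    layout; malformed lines are skipped rather than treated as errors.
    """
    with open(path, "r", encoding="ascii", errors="replace") as fh:
        lines = fh.read().splitlines()
    if not lines:
        raise ValueError("empty maildirsize file")

    # First line: comma-separated limits, "<n>S" for bytes, "<n>C" for messages.
    limits = {}
    for field in lines[0].split(","):
        field = field.strip()
        if field and field[-1] in "SC" and field[:-1].isdigit():
            limits[field[-1]] = int(field[:-1])

    # Remaining lines: "<bytes> <count>" deltas (may be negative on deletes).
    bytes_used = 0
    messages_used = 0
    for line in lines[1:]:
        parts = line.split()
        if len(parts) != 2:
            continue
        try:
            size_delta, count_delta = int(parts[0]), int(parts[1])
        except ValueError:
            continue
        bytes_used += size_delta
        messages_used += count_delta

    return limits, bytes_used, messages_used
```

This only reads the file; it does not recalculate the quota from the message files or rewrite `maildirsize`, which is left to whatever server is maintaining it.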
Any problem with leaving the namespace in "Courier compatibility" mode? I.e. in namespace 'private', leaving "prefix = INBOX.". With 4 million mailboxes, FAQs all over the place, and support reps trained in a particular way of doing things in IMAP, it'd be hellish to try to change the prefix (I know I could leave the courier namespace around with 'hidden=yes' but retraining support staff is perhaps better left to phase #2). Do I lose anything besides tidiness by not changing it to "prefix =" as if I was deploying dovecot from scratch? Does it hurt performance in any significant way? Benchmarking doesn't look any different, so I'm guessing not.
One thing that threw me and might be good for a FAQ (unless it was just me misconfiguring things) was when I started playing with putting the index files in an alternate location. I was utterly perplexed why it'd create the directories for the indexes but they'd look empty. Based on their location and names when they're in the maildir, I was just looking for the same dovecot.index* files right in the alternate directory. It wasn't till I started strace'ing that I noticed that the index files were indeed getting created, but in a subdirectory called .INBOX, which I'd missed because I was just doing a plain 'ls' and that hides names starting with a dot.
The courier-dovecot-migrate.pl script does a fine job of converting our POP3 mailboxes. We'll definitely be doing a mass run of it across all of our mailboxes. But with this many mailboxes, there's no way to get around the time lag between the script getting run and new mail showing up, which means that some number of new messages will re-download in POP3 when the switchover to Dovecot POP3 is done. To keep from getting too many complaints from customers, I'm thinking that I might have to write a wrapper script that does one of two things: either compare the mtime of ./Maildir/new/ against dovecot-uidlist and overwrite if ./Maildir/new/ is newer, or check whether the mtime of courierpop3dsizelist is newer than dovecot-uidlist and only in that case execute the conversion script with --overwrite. Considering we get between 500 and 2000 POP3 logins/sec, running any script after a POP3 login terrifies me, even a 6-line bash script. Anybody have any opinions to share on that? With a pooled architecture, it's near impossible to have only some mailboxes hit Courier and some hit Dovecot and keep slowly incrementing. At best my available units of increments to migrate are a few thousand at a time, and more realistically 10-20k at a time. If we're going to have to live with users complaining about a one-time redownload of just post-conversion mail, I'll need to get started convincing the higher-ups that that's life.
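To make the second variant of that wrapper idea concrete, here is a rough Python sketch of just the decision step: is courierpop3dsizelist newer than dovecot-uidlist? The function name `needs_reconversion` and the choice to treat a missing dovecot-uidlist as "convert" are assumptions for illustration only. The sketch deliberately does not invoke the conversion script itself.

```python
import os


def needs_reconversion(maildir):
    """Decide whether a mailbox should be (re)converted before a POP3 login.

    Hypothetical helper: returns True when Courier's POP3 state file is
    newer than Dovecot's uidlist, i.e. Courier served the mailbox after
    the last conversion run.
    """
    sizelist = os.path.join(maildir, "courierpop3dsizelist")
    uidlist = os.path.join(maildir, "dovecot-uidlist")

    try:
        sizelist_mtime = os.stat(sizelist).st_mtime
    except FileNotFoundError:
        # No Courier POP3 state at all: nothing to convert from.
        return False

    try:
        uidlist_mtime = os.stat(uidlist).st_mtime
    except FileNotFoundError:
        # Courier state exists but the mailbox was never converted.
        return True

    return sizelist_mtime > uidlist_mtime
```

Even in this stripped-down form the check costs up to two stat() calls against NFS per login, on top of the cost of starting the script itself, so it would want benchmarking at the login rates quoted above before anyone relies on it.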
A big thanks to anyone who even actually reads this entire tedious email and a tremendous thanks to anyone who actually replies to any of it!