On 23.8.2011, at 9.52, Angel L. Mateo wrote:
With v2.0, if you use Dovecot proxy (or director), you can also proxy doveadm connections through it, so a "doveadm index" would always go to the correct server. http://wiki2.dovecot.org/Director at the bottom has some info on how to set this up (works also with plain proxy, without director).
I'm trying this configuration in a test environment, but we are having a lot of problems with director. The main problem is with director and LMTP, because it produces a lot of timeout errors (I have previously posted about these problems).
Yes, I should look into the LMTP proxy problems. Those are kind of difficult to debug though, since I've never been able to reproduce them. In any case, you could initially move to v2.0 + director without LMTP (i.e. deliver to Maildir directly, then run the doveadm index).
OK. So my question is, is it worth it? Our scenario is 8 POP/IMAP servers with almost 70000 users (not all of them are really active), about 8.5 TB in use, with mailboxes in Maildir format over NFS. Our main problem with this is at the return from vacation periods (like the one we'll have next 9/1). Our hypothesis is that the first connection of a user is expensive, because he has a lot of unindexed messages in his mailbox. Supposing that doveadm index indexes the mailbox correctly, would it help to solve our problem?
Yes, if there's a ton of people returning at the same time it'll create a load spike. It's at least partially because mails aren't indexed, so Dovecot has to first read the message headers (and maybe bodies) to produce the initial message list, and afterwards when the user actually reads/downloads the message bodies they're re-read from disk, unless the OS still has them cached.
So this kind of preindexing would definitely reduce the CPU load during the spike, but I'm not entirely sure about disk load because of the OS caching (10-50% decrease?). I'd be really interested in seeing actual numbers some day. :)
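As a rough illustration of the preindexing idea discussed above, here is a minimal Python sketch that builds one "doveadm index" invocation per user and optionally runs it. The helper names (build_index_command, preindex), the user names and the choice of INBOX as the mailbox are assumptions made for the example; the "doveadm index -u <user> <mailbox>" form is how I understand the command line from the doveadm-index man page, so check it against your installed version before relying on it.

```python
import subprocess
from typing import Iterable, List


def build_index_command(user: str, mailbox: str = "INBOX") -> List[str]:
    """Return the argv for indexing one mailbox of one user.

    Assumes the `doveadm index -u <user> <mailbox>` form; verify it
    against the doveadm-index man page of your Dovecot version.
    """
    return ["doveadm", "index", "-u", user, mailbox]


def preindex(users: Iterable[str], mailbox: str = "INBOX",
             dry_run: bool = True) -> List[List[str]]:
    """Build (and, unless dry_run is set, run) one index command per user.

    Blank entries are skipped and surrounding whitespace is stripped, so
    the user list can come straight from a text file, one name per line.
    """
    commands = [build_index_command(u.strip(), mailbox)
                for u in users if u.strip()]
    if not dry_run:
        for cmd in commands:
            # check=False: one failing user should not abort the whole run.
            subprocess.run(cmd, check=False)
    return commands


if __name__ == "__main__":
    # Hypothetical user names; dry run only prints what would be executed.
    for cmd in preindex(["alice", "bob"]):
        print(" ".join(cmd))
```

Run with dry_run=True first to see the commands. When doveadm connections are proxied through director as described earlier in the thread, each command should end up on the server that handles that user, though that depends on the proxy setup being in place.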