Out of interest, has the NFS issue been tested on NFS4? My understanding is that NFS4 has a lot of fixes for the locking/caching problems that plague NFS3, and we were planning to use NFS4 from day one.
If this hasn't been tested, is there some kind of load simulator that we could run to see if the issue does occur in our environment?
On 18.01.2012 21:54, Timo Sirainen wrote:
On Wed, 2012-01-18 at 20:44 +0800, Lee Standen wrote:
I've been desperately trying to find some comparative performance information about the different mailbox formats supported by Dovecot in order to make an assessment on which format is right for our environment.
Unfortunately there aren't really any. Everyone who switches to sdbox/mdbox seems to also change their hardware at the same time, so there aren't really any before/after metrics. I do of course have some unrealistic synthetic benchmarks, but I don't think they are very useful.
So, I would also be very interested in seeing some before/after graphs of disk IO, CPU and memory usage of a Maildir -> dbox switch on the same hardware.
Maildir anyway definitely performs worse than sdbox or mdbox. mdbox also uses fewer NFS operations, but I don't know how much faster (if at all) it is with NetApps.
All mail storage presented via NFS over 10Gbps Ethernet (Jumbo Frames)
Postfix will feed new email to Dovecot via LMTP
Dovecot servers have been split based on their role
Dovecot LDA Servers (running LMTP protocol)
Dovecot POP/IMAP servers (running POP/IMAP protocols)
You're going to run into NFS caching troubles with the above split setup. I don't recommend it. You will see error messages about index corruption with it, and with dbox it can cause metadata loss. http://wiki2.dovecot.org/NFS http://wiki2.dovecot.org/Director
LDA & POP/IMAP servers are segmented into geographically split groups (so no server sees every single mailbox)
Nginx proxy used to terminate customer connections, connections are redirected to the appropriate geographic servers
Can the same mailbox still be accessed via multiple geographic servers? I've had some plans for doing this kind of access/replication using dsync.
- Apache Lucene indexes will be used to accelerate IMAP search for users
Dovecot's fts-solr or fts-lucene?
Our closest current live configuration (Qmail SMTP, Courier IMAP, Maildir) has 600K mailboxes and pushes ~ 35,000 NFS operations per second at peak
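As a rough baseline for the IOPS/User question, the two figures above can be turned into a per-mailbox rate. This is only a back-of-the-envelope sketch: it assumes the peak load is spread evenly over all 600K mailboxes, which will not hold in practice, and the variable names are made up for illustration.

```python
# Figures quoted above for the current Qmail/Courier/Maildir setup.
mailboxes = 600_000
peak_nfs_ops_per_sec = 35_000

# Assumption: load is averaged over every mailbox, active or not.
ops_per_mailbox = peak_nfs_ops_per_sec / mailboxes
ops_per_1000_mailboxes = ops_per_mailbox * 1000

print(f"{ops_per_mailbox:.4f} NFS ops/sec per mailbox at peak")
print(f"{ops_per_1000_mailboxes:.1f} NFS ops/sec per 1000 mailboxes at peak")
```

Measuring the same ratio after a migration, on the same hardware, would give the kind of before/after number discussed earlier in the thread.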
Some of the things I would like to know:
- Are we likely to see a reduction in IOPS/User by using Maildir alone under Dovecot?
If you have webmail-type clients, definitely. For Outlook/Thunderbird you should still see an improvement, but not necessarily as much.
You didn't mention POP3. That isn't Dovecot's strong point. Its performance should be about the same as Courier-POP3, but could be less than QMail-POP3. Although if many of your POP3 users keep a lot of mails on server it
- Can someone give some technical reasoning for why mdbox does fewer IOPS than Maildir?
Maildir renames files a lot: first from new/ to cur/, and then again every time a message's flags change. That's why sdbox is faster. mdbox should in turn be faster than sdbox because it puts (or should put) more mail data physically closer together on disk, which makes reading it faster.
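The rename pattern described above can be sketched in a few lines. This is a simplified illustration of the Maildir side only, not Dovecot's code: it assumes the standard Maildir naming scheme where flags are stored in a ":2,<flags>" filename suffix, it skips the tmp/ delivery step, and the helper names and the sample filename are hypothetical. It needs a POSIX filesystem, since the filenames contain a colon.

```python
import os
import tempfile

def deliver(maildir, name, body):
    """Write a message straight into new/ (real delivery goes via tmp/ first)."""
    path = os.path.join(maildir, "new", name)
    with open(path, "w") as f:
        f.write(body)
    return path

def move_to_cur(maildir, name):
    """First rename: new/<name> -> cur/<name>:2, (no flags set yet)."""
    src = os.path.join(maildir, "new", name)
    dst = os.path.join(maildir, "cur", name + ":2,")
    os.rename(src, dst)
    return dst

def set_flags(path, flags):
    """Every flag change is another rename, because flags live in the filename."""
    directory, filename = os.path.split(path)
    base = filename.split(":2,", 1)[0]
    dst = os.path.join(directory, base + ":2," + "".join(sorted(flags)))
    os.rename(path, dst)
    return dst

def demo():
    """Deliver one message, read it, reply to it; count the renames needed."""
    renames = 0
    with tempfile.TemporaryDirectory() as root:
        for sub in ("tmp", "new", "cur"):
            os.makedirs(os.path.join(root, sub))
        deliver(root, "1326900000.M1P1.host", "Subject: hi\n\nbody\n")
        path = move_to_cur(root, "1326900000.M1P1.host")
        renames += 1
        path = set_flags(path, "S")   # message marked as seen
        renames += 1
        path = set_flags(path, "RS")  # message replied to, still seen
        renames += 1
        return os.path.basename(path), renames

final_name, rename_count = demo()
print(final_name, rename_count)
```

In this sketch one message that is delivered, read and replied to already costs three renames, and each of those is a metadata operation on the NFS server.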
I understand some of the reasons for the mdbox IOPS question, but I need some more information so we can discuss internally and make a decision as to whether we're comfortable going with mdbox from day one. We're very familiar with Maildir, and there's just some uneasiness internally around going to a new mail storage format.
It's at least safer to first switch to Dovecot+Maildir to make sure that any problems you might find aren't related to the mailbox format.