Ed W put forth on 6/23/2010 4:18 PM:
> Secondly 7,500 mails over 5 mins means an indexing rate of 25 mails/sec. This would not be out of order for a heavily fragmented drive which is IO bound? Each file needs to be opened to scan the headers so likely you need one disk seek and I guess it's easy to be IO bound? What does iotop show you during dovecot's thrashing?
It seems he is I/O bound to a degree. If I read his answer to my disk subsystem question correctly, he's storing user maildirs on a single local 1TB SATA drive. However, given that his 2nd successive login takes 4-5 seconds instead of 5 minutes, it would seem the real issue is whether the index and cache are current, not I/O saturation. A faster disk would always help, but it's nowhere near a complete solution to his problem.
Putting his maildir on a 16-disk RAID 0 stripe of the same model 1TB disk he already has would yield roughly a 16x improvement in seek throughput, cutting his 'stale' login time to ~20 seconds, if my math is correct (300 seconds / 16 is about 19). 20 seconds is still unacceptably high IMHO, though it's much better than 300 seconds. OK, so let's assume the filesystem underlying his maildir is heavily fragmented and that defragging it would yield a 100% improvement, for argument's sake (a 50% improvement is almost unheard of; normal is about 20%). He'd still be looking at a 10 second login time after spending anywhere from $3k to $8k USD on a 16-disk array, depending on which vendor he chooses. Throwing money and hardware at this problem isn't the proper or optimal solution.
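For anyone who wants to check the arithmetic, here is a rough Python sketch. It assumes login time scales inversely with seek throughput and that a defrag "improvement" multiplies throughput by (1 + gain). Both are simplifications for argument's sake, and the function names are made up for illustration. The only inputs are the figures already quoted in this thread.

```python
# Back-of-the-envelope estimate using only numbers from this thread.
# ASSUMPTION: login time is purely seek-bound and scales as 1/throughput.

MAILS = 7500            # mails indexed during the slow login
STALE_LOGIN_S = 5 * 60  # 5 minute 'stale' login, in seconds


def index_rate(mails, seconds):
    """Mails indexed per second."""
    return mails / seconds


def login_time(base_seconds, stripe_disks=1, defrag_gain=0.0):
    """Estimated login time after striping and defragmenting.

    stripe_disks: disks in the RAID 0 stripe (hypothetical linear scaling)
    defrag_gain:  fractional throughput gain from defragging (1.0 == 100%)
    """
    return base_seconds / (stripe_disks * (1.0 + defrag_gain))


rate = index_rate(MAILS, STALE_LOGIN_S)                # 25 mails/sec
striped = login_time(STALE_LOGIN_S, stripe_disks=16)   # just under 20 s
best_case = login_time(STALE_LOGIN_S, 16, 1.0)         # just under 10 s
typical = login_time(STALE_LOGIN_S, 16, 0.2)           # with a 'normal' 20% gain

print(f"index rate: {rate:.1f} mails/sec")
print(f"16-disk stripe: {striped:.1f} s")
print(f"stripe + 100% defrag gain: {best_case:.1f} s")
print(f"stripe + 20% defrag gain: {typical:.1f} s")
```

Even the optimistic case stays well above the 4-5 seconds he sees when the index and cache are already current.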
> Dovecot2 has an mdbox option which sounds like it could be beneficial for your performance requirements (but it's not "stable" yet)
I don't think migrating to a new mailbox format is what he really needs, or wants. IIRC he's been using qmail for years, which means he's been using maildir for years. He's likely very comfortable with maildir and probably wants to stick with it.
> Otherwise I guess you need to investigate dovecot's delivery agent which does incremental index updates at deliver time (pay the cost one email at a time rather than every 7,500 emails). Or consider alternative filesystems which are more performant for this requirement?
On this I completely agree with you, and I suggested it previously in the same thread where I suggested dirty_syncs. This should be his next step. If LDA alone doesn't fix the problem, then he should try dirty_syncs again in conjunction with LDA.
-- Stan