sudden performance drop - i/o related

Aki Tuomi aki.tuomi at open-xchange.com
Tue May 11 08:30:32 EEST 2021


> On 11/05/2021 01:07 Marcin Gryszkalis <mg at fork.pl> wrote:
> 
>  
> Hi
> I have an Exim/Dovecot server that worked great for the last few years,
> and two weeks ago it got ill ;)
> First, users started reporting errors when saving mail to Sent (timeouts).
> Now the logs are full of warnings about long waits:
> 
> May 10 10:18: Maildir /mail/xxx Synchronization took 193 seconds (1 new msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Maildir /mail/xxx Synchronization took 125 seconds (1 new msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Maildir /mail/xxx Synchronization took 211 seconds (1 new msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Maildir /mail/xxx Synchronization took 107 seconds (8 new msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Transaction log file /mail/xxx was locked for 36 seconds (Mailbox was synchronized)
> May 10 10:18: Transaction log file /mail/xxx was locked for 160 seconds (Mailbox was synchronized)
> May 10 10:18: Transaction log file /mail/xxx was locked for 72 seconds (Mailbox was synchronized)
> May 10 10:18: Locking transaction log file /mail/xxx took 60 seconds (syncing)
> May 10 10:18: Locking transaction log file /mail/xxx took 38 seconds (syncing)
> May 10 10:18: Locking transaction log file /mail/xxx took 35 seconds (syncing)
> 
> It looks like I/O has risen from 150 writes/s to 500 writes/s (at peak
> hours), but there's no real change in the number of emails or the
> volume. The number of users is steady (~100 active users, ~250 IMAP
> sessions), and the number of emails (by count or by volume) rises and
> falls within a 15% margin.
> 
> The box is FreeBSD 11.4, dovecot is 2.3.13.
> Filesystem is ZFS, disks are fine, free space is around 20% (~200GB)
> Layout is Maildir. CPU is not overloaded (2x6core), same with memory (48GB).
> 
> I didn't change anything in configuration.
> 
> Tonight I did some fine-tuning, like maildir_copy_with_hardlinks=yes and
> mail_fsync=never/optimized (I'm not happy with that, and I'm afraid it
> won't really help, but I'll be able to revert it). I'm also thinking
> about switching from Maildir to sdbox (I know it won't hurt).
> 
> I don't know where to look to find out where the I/O goes. I don't have
> any metrics/stats enabled (I looked at the docs, but it looks like it's
> not really simple and needs some digging to get a useful config). Maybe
> somebody has suggestions on what to look for?
> 
> For detailed per-process stats I need to rebuild the kernel with DTrace
> (another night, I guess)... Plain top (in I/O mode, similar to Linux's
> iotop) doesn't catch short-lived processes (like LDA deliveries).
> 
> best regards
> -- 
> Marcin Gryszkalis, PGP 0xA5DBEEC7 http://fork.pl/gpg.txt

One thing that comes to mind is that you are delivering mail outside Dovecot. Without knowing your system better, I would suggest trying dovecot-lda to deliver mail.

Are your users directly accessing the maildir?
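
To see which mailboxes are hit hardest before changing anything, the warnings quoted above could be tallied per mailbox. The following is only a minimal sketch in Python, not a Dovecot tool: the names PATTERNS, summarize() and SAMPLE are made up for illustration, and the regular expressions are modelled only on the three warning shapes quoted in this thread, so they may need adjusting to the exact format in your logs.

```python
import re
from collections import defaultdict

# Hypothetical patterns, modelled only on the warnings quoted in this thread.
# Each captures the mailbox path and the duration in seconds; the optional
# colon after the path is an assumption about unabbreviated log lines.
PATTERNS = {
    "maildir_sync": re.compile(
        r"Maildir (\S+?):? Synchronization took (\d+) seconds"),
    "log_locked": re.compile(
        r"Transaction log file (\S+?):? was locked for (\d+) seconds"),
    "log_lock_wait": re.compile(
        r"Locking transaction log file (\S+?):? took (\d+) seconds"),
}


def summarize(lines):
    """Return {(kind, path): {"count", "total", "max"}} for matching lines."""
    stats = defaultdict(lambda: {"count": 0, "total": 0, "max": 0})
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                path, seconds = match.group(1), int(match.group(2))
                entry = stats[(kind, path)]
                entry["count"] += 1
                entry["total"] += seconds
                entry["max"] = max(entry["max"], seconds)
                break  # a line matches at most one warning kind
    return dict(stats)


# Sample input copied from the quoted log excerpt (unwrapped).
SAMPLE = [
    "May 10 10:18: Maildir /mail/xxx Synchronization took 193 seconds "
    "(1 new msgs, 0 flag change attempts, 0 expunge attempts)",
    "May 10 10:18: Maildir /mail/xxx Synchronization took 125 seconds "
    "(1 new msgs, 0 flag change attempts, 0 expunge attempts)",
    "May 10 10:18: Transaction log file /mail/xxx was locked for 36 seconds "
    "(Mailbox was synchronized)",
    "May 10 10:18: Locking transaction log file /mail/xxx took 60 seconds "
    "(syncing)",
]

if __name__ == "__main__":
    # Print the slowest entries first.
    result = summarize(SAMPLE)
    for (kind, path), entry in sorted(
            result.items(), key=lambda item: -item[1]["total"]):
        print(kind, path, entry)
```

To run it against a real log, something like summarize(open(path_to_log)) should work, where path_to_log is wherever your Dovecot warnings end up. If a handful of mailboxes account for most of the waiting time, that is where I would look first.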

Aki

