sudden performance drop - i/o related
Aki Tuomi
aki.tuomi at open-xchange.com
Tue May 11 08:30:32 EEST 2021
> On 11/05/2021 01:07 Marcin Gryszkalis <mg at fork.pl> wrote:
>
>
> Hi
> I have an exim/dovecot server that worked great for the last few years, and
> two weeks ago it got ill ;)
> First, users reported errors when saving mail to Sent (timeouts).
> Now the logs are infested with warnings about long waits:
>
> May 10 10:18: Maildir /mail/xxx Synchronization took 193 seconds (1 new
> msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Maildir /mail/xxx Synchronization took 125 seconds (1 new
> msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Maildir /mail/xxx Synchronization took 211 seconds (1 new
> msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Maildir /mail/xxx Synchronization took 107 seconds (8 new
> msgs, 0 flag change attempts, 0 expunge attempts)
> May 10 10:18: Transaction log file /mail/xxx was locked for 36 seconds
> (Mailbox was synchronized)
> May 10 10:18: Transaction log file /mail/xxx was locked for 160 seconds
> (Mailbox was synchronized)
> May 10 10:18: Transaction log file /mail/xxx was locked for 72 seconds
> (Mailbox was synchronized)
> May 10 10:18: Locking transaction log file /mail/xxx took 60 seconds
> (syncing)
> May 10 10:18: Locking transaction log file /mail/xxx took 38 seconds
> (syncing)
> May 10 10:18: Locking transaction log file /mail/xxx took 35 seconds
> (syncing)
>
> It looks like I/O has risen from 150 writes/s to 500 writes/s (at peak
> hours), but there's no real change in the number of emails or the volume.
> The number of users is steady (~100 active users, ~250 IMAP sessions), and
> the number of emails (by count or by volume) rises and falls within a 15%
> margin.
>
> The box is FreeBSD 11.4, dovecot is 2.3.13.
> Filesystem is ZFS, disks are fine, free space is around 20% (~200GB)
> Layout is Maildir. CPU is not overloaded (2x6core), same with memory (48GB).
>
> I didn't change anything in configuration.
>
> Tonight I did some fine-tuning, such as maildir_copy_with_hardlinks=yes and
> mail_fsync=never/optimized (I'm not happy with that, but I'm afraid it
> won't really help and I'll be able to revert it). I'm also thinking
> about switching from Maildir to sdbox (I know it won't hurt).
>
> I don't know where to look to find where the I/O goes. I don't have any
> metrics/stats enabled (I looked at the docs, but it looks like it's not
> really simple and needs some digging to get a useful config). Maybe
> somebody has suggestions on what to look for?
>
> For detailed per-process stats I need to rebuild the kernel with DTrace
> (another night, I guess)... Plain top (in I/O mode, similar to Linux's
> iotop) doesn't catch short-lived processes (like LDA deliveries).
>
> best regards
> --
> Marcin Gryszkalis, PGP 0xA5DBEEC7 http://fork.pl/gpg.txt
One thing that comes to mind is that you are delivering outside Dovecot.
Without knowing your system better, I would suggest trying dovecot-lda to
deliver mail.
Are your users directly accessing the maildir?
Aki
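
One low-effort way to see where the long waits concentrate, without enabling
stats or DTrace, is to total the warnings that are already in the log. The
Python sketch below assumes the log lines look like the ones quoted above,
with each warning on a single line. The PATTERNS table and the summarize()
helper are hypothetical names made up for this illustration, and real log
lines may carry extra prefixes, which the regex search tolerates.

```python
import re
from collections import defaultdict

# Patterns modelled on the warnings quoted in this thread (an assumption:
# other Dovecot versions may word these messages differently).
PATTERNS = {
    "maildir_sync": re.compile(r"Maildir (\S+?):?\s+Synchronization took (\d+) seconds"),
    "log_locked": re.compile(r"Transaction log file (\S+?):? was locked for (\d+) seconds"),
    "log_lock_wait": re.compile(r"Locking transaction log file (\S+?):? took (\d+) seconds"),
}


def summarize(lines):
    """Return {(kind, path): (warning_count, total_seconds)} for matching lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                entry = totals[(kind, match.group(1))]
                entry[0] += 1
                entry[1] += int(match.group(2))
                break  # each line is counted under one kind only
    return {key: tuple(value) for key, value in totals.items()}


# Sample lines taken from the quoted log, joined back onto single lines.
SAMPLE = [
    "May 10 10:18: Maildir /mail/xxx Synchronization took 193 seconds (1 new msgs, 0 flag change attempts, 0 expunge attempts)",
    "May 10 10:18: Maildir /mail/xxx Synchronization took 125 seconds (1 new msgs, 0 flag change attempts, 0 expunge attempts)",
    "May 10 10:18: Transaction log file /mail/xxx was locked for 36 seconds (Mailbox was synchronized)",
    "May 10 10:18: Locking transaction log file /mail/xxx took 60 seconds (syncing)",
]

if __name__ == "__main__":
    # Print the worst offenders first, sorted by total seconds waited.
    for (kind, path), (count, seconds) in sorted(
        summarize(SAMPLE).items(), key=lambda item: -item[1][1]
    ):
        print(f"{kind:14} {path:20} {count:4d} warnings {seconds:6d} s")
```

To run it against a real log, pass an open file object to summarize() instead
of SAMPLE. If a handful of mailboxes account for most of the seconds, that
points at specific users or folders rather than at the storage as a whole.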