dovecot 2.3.7.2-1~bionic: Performance issues caused by excessive IO to ~/mdbox/mailboxes/INBOX/dbox-Mails/dovecot.index.tmp
I set up system copying all mails to a backup system.
This used to work without a hitch - now in the last few days mails would pile up in the Postfix Queue, waiting to be delivered using the lmtp transport into dovecot.
So dovecot was being slow, but why? After all, nothing changed.
After reading some articles on stackoverflow I found a way of finding out which file gets the most IO:
% sysdig -c topfiles_bytes;
This command quickly pointed to ~/mdbox/mailboxes/INBOX/dbox-Mails/dovecot.index.tmp That file was written excessively.
I then put ~/mdbox/mailboxes/INBOX/dbox-Mails/ into tmpfs and alas, the queue would drain quickly.
But why is that? Why would the index file be updated so often?
This is dovecot 2.3.7.2-1~bionic
# 2.3.7.2 (3c910f64b): /etc/dovecot/dovecot.conf # OS: Linux 5.0.0-29-generic x86_64 Ubuntu 18.04.3 LTS default_vsz_limit = 2 G lmtp_user_concurrency_limit = 10000 mail_attachment_dir = /home/copymail/attachments mail_attachment_hash = %{sha256} mail_fsync = never mail_location = mdbox:~/mdbox mail_plugins = zlib fts fts_lucene mdbox_rotate_size = 128 M namespace inbox { inbox = yes location = mailbox Drafts { special_use = \Drafts } mailbox Junk { special_use = \Junk } mailbox Sent { special_use = \Sent } mailbox "Sent Messages" { special_use = \Sent } mailbox Trash { special_use = \Trash } prefix = } passdb { args = username_format=%u /etc/dovecot/passwd driver = passwd-file } plugin { fts = lucene fts_languages = de,en fts_lucene = whitespace_chars=@. } protocols = " imap lmtp" service imap-login { inet_listener imap { address = 127.0.0.1 port = 143 } inet_listener imaps { port = 993 ssl = yes } } service lmtp { inet_listener lmtp { address = 10.0.5.208 port = 1025 } process_min_avail = 5 unix_listener /var/spool/postfix/private/dovecot-lmtp { group = postfix mode = 0660 user = postfix } } ssl_ca = /etc/ssl/certs/ca-certificates.crt ssl_cert = </etc/ssl/certs/mail-cbf.pem ssl_cipher_list = EECDH+ECDSA+AESGCM:EECDH+aRSA+AESGCM:EECDH+ECDSA+SHA384:EECDH+ECDSA+SHA256:EECDH+aRSA+SHA384:EECDH+aRSA+SHA256:EECDH+aRSA+RC4:EECDH:EDH+aRSA:!aNULL:!eNULL:!LOW:!3DES:!MD5:!EXP:!PSK:!SRP:!DSS:!RC4 ssl_dh = # hidden, use -P to show it ssl_key = # hidden, use -P to show it ssl_prefer_server_ciphers = yes userdb { args = username_format=%u /etc/dovecot/passwd driver = passwd-file } protocol lmtp { mail_plugins = zlib fts fts_lucene }
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | https://www.charite.de
- Ralf Hildebrandt via dovecot <dovecot@dovecot.org>:
But why is that? Why would the index file be updated so often?
BTW: This post is a followup to my "2.3.7 slower than 2.3.6?" post from back in July.
On 1 Oct 2019, at 16.45, Ralf Hildebrandt via dovecot <dovecot@dovecot.org> wrote:
- Ralf Hildebrandt via dovecot <dovecot@dovecot.org>:
But why is that? Why would the index file be updated so often?
BTW: This post is a followup to my "2.3.7 slower than 2.3.6?" post from back in July.
Fixed by https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0f... <https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0fc98> and will be in the soon-to-be-released v2.3.8.
- Timo Sirainen <timo@sirainen.com>:
But why is that? Why would the index file be updated so often?
BTW: This post is a followup to my "2.3.7 slower than 2.3.6?" post from back in July.
Fixed by https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0f... <https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0fc98> and will be in the soon-to-be-released v2.3.8.
Thanks. I will test this (once it's been released) by moving the index files back to conventional storage.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | https://www.charite.de
- Timo Sirainen <timo@sirainen.com>:
BTW: This post is a followup to my "2.3.7 slower than 2.3.6?" post from back in July.
Fixed by https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0f... and will be in the soon-to-be-released v2.3.8.
I stopped 2.3.7, copied over the index files from the ramdisk into the physical "realm" and restarted with a fresh 2.3.8. It probably takes a few days to be absolutely sure.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | https://www.charite.de
- Ralf Hildebrandt via dovecot <dovecot@dovecot.org>:
- Timo Sirainen <timo@sirainen.com>:
BTW: This post is a followup to my "2.3.7 slower than 2.3.6?" post from back in July.
Fixed by https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0f... and will be in the soon-to-be-released v2.3.8.
I stopped 2.3.7, copied over the index files from the ramdisk into the physical "realm" and restarted with a fresh 2.3.8. It probably takes a few days to be absolutely sure.
So, in general the performance issues are gone.
But...
I'm seeing odd hourly spikes almost every hour, on the hour. You might say: Well yes, that's a cronjob sending lots of mails. But it isn't. There's not more or less mail coming in at that very moment.
I suspect something in dovecot running every hour (DH key regeneration?)
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | https://www.charite.de
On 16/10/2019 13:31 Ralf Hildebrandt via dovecot <dovecot@dovecot.org> wrote:
- Ralf Hildebrandt via dovecot <dovecot@dovecot.org>:
- Timo Sirainen <timo@sirainen.com>:
BTW: This post is a followup to my "2.3.7 slower than 2.3.6?" post from back in July.
Fixed by https://github.com/dovecot/core/commit/5e9e09a041b318025fd52db2df25052b60d0f... and will be in the soon-to-be-released v2.3.8.
I stopped 2.3.7, copied over the index files from the ramdisk into the physical "realm" and restarted with a fresh 2.3.8. It probably takes a few days to be absolutely sure.
So, in general the performance issues are gone.
But...
I'm seeing odd hourly spikes almost every hour, on the hour. You might say: Well yes, that's a cronjob sending lots of mails. But it isn't. There's not more or less mail coming in at that very moment.
I suspect something in dovecot running every hour (DH key regeneration?)
-- Ralf Hildebrandt
2.3.7 does not generate DH keys. It's been removed since 2.3.0
Is it possible for you to track and find out which process is causing the peak?
Aki
- Aki Tuomi via dovecot <dovecot@dovecot.org>:
2.3.7 does not generate DH keys. It's been removed since 2.3.0
Yes, it was the only periodic process I could think/knew of.
Is it possible for you to track and find out which process is causing the peak?
Will try. Next hour :)
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | https://www.charite.de
On 1 Oct 2019, at 16.31, Ralf Hildebrandt via dovecot <dovecot@dovecot.org> wrote:
I set up system copying all mails to a backup system.
This used to work without a hitch - now in the last few days mails would pile up in the Postfix Queue, waiting to be delivered using the lmtp transport into dovecot.
So dovecot was being slow, but why? After all, nothing changed.
After reading some articles on stackoverflow I found a way of finding out which file gets the most IO:
% sysdig -c topfiles_bytes;
This command quickly pointed to ~/mdbox/mailboxes/INBOX/dbox-Mails/dovecot.index.tmp That file was written excessively.
Was it one user's dovecot.index.tmp or for a lot of users? This means that dovecot.index is being rewritten, which should happen only once in a while, but now it sounds like it's happening maybe for every mail delivery. If it's still happening, could you send me one folder's dovecot.index and dovecot.index.log files? (They don't contain anything sensitive other than maybe message flags.)
I then put ~/mdbox/mailboxes/INBOX/dbox-Mails/ into tmpfs and alas, the queue would drain quickly.
But why is that? Why would the index file be updated so often?
This is dovecot 2.3.7.2-1~bionic
So you had been running this version already for a while, and then it just suddenly started getting slow?
I tried to reproduce this with imaptest and Dovecot that is patched to log when dovecot.index is being rewritten, but there doesn't seem to be any difference with v2.2.36, v2.3.7 or git master.
This command quickly pointed to ~/mdbox/mailboxes/INBOX/dbox-Mails/dovecot.index.tmp That file was written excessively.
Was it one user's dovecot.index.tmp or for a lot of users?
There's just one user. All mail goes to one mailbox.
This means that dovecot.index is being rewritten, which should happen only once in a while, but now it sounds like it's happening maybe for every mail delivery.
Yes, it seems to be that way.
If it's still happening, could you send me one folder's dovecot.index and dovecot.index.log files? (They don't contain anything sensitive other than maybe message flags.)
I can send the files.
This is dovecot 2.3.7.2-1~bionic
So you had been running this version already for a while, and then it just suddenly started getting slow?
Yes. Initially I threw away the whole mailbox after it got slow, but after a few days the performance started to degrade. Admittedly, it contains a lot of mails :)
I tried to reproduce this with imaptest and Dovecot that is patched to log when dovecot.index is being rewritten, but there doesn't seem to be any difference with v2.2.36, v2.3.7 or git master.
-- Ralf Hildebrandt Geschäftsbereich IT | Abteilung Netzwerk Charité - Universitätsmedizin Berlin Campus Benjamin Franklin Hindenburgdamm 30 | D-12203 Berlin Tel. +49 30 450 570 155 | Fax: +49 30 450 570 962 ralf.hildebrandt@charite.de | https://www.charite.de
participants (3)
-
Aki Tuomi
-
Ralf Hildebrandt
-
Timo Sirainen