System load spike on dovecot reload
Hi everyone,
I'm running dovecot with quite a lot of users and lots of active imap connections (about 20,000). I'm using different user IDs for users, so I need to have imap {service_count=1}, i.e. I have a lot of imap processes running.
Everything works fine until I reload the dovecot configuration. When that happens, every client is forced to log in again at the same time, and that causes a huge system load spike (a 5-minute load average of 2000-3000).
I was thinking it would be great if dovecot didn't kick all the users at the same time during a reload, but gradually, over a specified interval. I'm aware of the shutdown_clients directive that could help, but I don't like it: I do want the clients to get disconnected on dovecot shutdown, and I also want them to log in again within a reasonably short time after a reload.
Is something like that possible with dovecot, or would it make sense to implement it in a future version?
Thank you.
Dave.
On Apr 21, 2017, at 4:43 AM, dave@evilcigi.eu wrote:
> Hi everyone,
> I'm running dovecot with quite a lot of users and lots of active imap connections (about 20,000). I'm using different user IDs for users, so I need to have imap {service_count=1}, i.e. I have a lot of imap processes running.
> Everything works fine until I reload the dovecot configuration. When that happens, every client is forced to log in again at the same time, and that causes a huge system load spike (a 5-minute load average of 2000-3000).
> I was thinking it would be great if dovecot didn't kick all the users at the same time during a reload, but gradually, over a specified interval. I'm aware of the shutdown_clients directive that could help, but I don't like it: I do want the clients to get disconnected on dovecot shutdown, and I also want them to log in again within a reasonably short time after a reload.
You could run a Dovecot IMAP proxy in a Docker container on your server and run a separate Dovecot IMAP server in another container. Once both containers are up and running, enable the Dovecot IMAP proxy to start sending IMAP sessions to the IMAP server.
When the time comes to change the Dovecot configuration, deploy another instance of the Dovecot IMAP server with the new configuration. Once the new container is up and running, configure the Dovecot IMAP proxy to direct a few specific test users to the new Dovecot IMAP server. When you are satisfied that the new server can handle new user sessions, configure the proxy to direct all new sessions to the new instance. After everything has been working fine for a while, start kicking users off the old Dovecot IMAP server (at a comfortable pace) so that they reconnect to the new one. When the old Dovecot IMAP server is no longer managing any sessions, it can be removed from the server (that is, the Docker container stopped and eventually removed completely).
Since all containers are running on the same host server, the old and new Dovecot containers can be configured to access the same Dovecot mail storage by mounting the host storage into both containers.
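The routing decision behind that rollout can be sketched in a few lines of Python. This is only an illustration of the three phases described above, not Dovecot code: the phase names, the function `pick_backend`, and the backend names `old-imap` and `new-imap` are all hypothetical, and in a real setup the proxy's user lookup would have to return the chosen backend host.

```python
from enum import Enum
from typing import Set


class Phase(Enum):
    """Rollout phases of the proxy (hypothetical names)."""
    OLD_ONLY = 1      # before the change: everyone goes to the old backend
    TEST_USERS = 2    # only selected test users go to the new backend
    NEW_FOR_ALL = 3   # all new sessions go to the new backend


def pick_backend(user: str, phase: Phase, test_users: Set[str],
                 old: str = "old-imap", new: str = "new-imap") -> str:
    """Return the backend a NEW session of `user` should be sent to.

    Existing sessions are not touched; they stay on the old backend
    until they are kicked or disconnect on their own.
    """
    if phase is Phase.NEW_FOR_ALL:
        return new
    if phase is Phase.TEST_USERS and user in test_users:
        return new
    return old
```

For example, `pick_backend("alice", Phase.TEST_USERS, {"alice"})` selects the new backend, while any other user still lands on the old one until the phase is switched to `NEW_FOR_ALL`.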
I think Docker containers are the easiest way to manage Dovecot in production.
Kevin
Hello,
On Fri, 21 Apr 2017 10:43:47 +0200 dave@evilcigi.eu wrote:
> Hi everyone,
> I'm running dovecot with quite a lot of users and lots of active imap connections (about 20,000). I'm using different user IDs for users, so I need to have imap {service_count=1}, i.e. I have a lot of imap processes running.
We peaked at 65k imap processes before upgrading to a version where imap-hibernate more or less works, though we're using a common user ID.
dovecot 119157 0.1 0.0 59364 52216 ? S Apr01 48:25 dovecot/imap-hibernate [15137 connections]
The service_count parameter in this context is not doing what you think it does. I have it at 200 these days, which allows imap (or pop3) processes to be recycled (they are labeled "idling" while waiting for a new client); it does not make one imap process serve multiple clients at the same time.
mail 591307 0.0 0.0 29876 4712 ? S Apr20 0:00 dovecot/imap [idling]
mail 735323 0.0 0.0 27396 4196 ? S 13:20 0:00 dovecot/pop3 [idling]
The advantage (for me at least) is that the dovecot master process doesn't have to spin up a new mail process for every login.
Since the master process is very much single-threaded, it eventually becomes a bottleneck.
> Everything works fine until I reload the dovecot configuration. When that happens, every client is forced to log in again at the same time, and that causes a huge system load spike (a 5-minute load average of 2000-3000).
Unless you're making a change that affects the dovecot master process, restarting everything isn't needed and you should set "shutdown_clients = no". You could still kick users with "doveadm kick" at a leisurely pace; since security problems with the mail processes are rare, there is seldom any urgency.
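Kicking "at a leisurely pace" could be scripted along these lines. This is a minimal sketch, not an official tool: the batch size, the pause, and the function names are assumptions, and it assumes that `doveadm kick <user>` is available on the host. With `dry_run=True` (the default) nothing is executed; the script only prints what it would do.

```python
import subprocess
import time
from typing import Iterable, Iterator, List


def batches(users: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` users."""
    batch: List[str] = []
    for user in users:
        batch.append(user)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def kick_gradually(users: Iterable[str], batch_size: int = 100,
                   pause_seconds: float = 5.0, dry_run: bool = True) -> int:
    """Kick users batch by batch, pausing between batches.

    Returns the number of users processed. The defaults are arbitrary
    illustration values, not recommendations.
    """
    processed = 0
    for index, batch in enumerate(batches(users, batch_size)):
        if index > 0:
            time.sleep(pause_seconds)  # let the previous batch log back in
        for user in batch:
            if dry_run:
                print("would kick", user)
            else:
                # Assumption: doveadm is on PATH and accepts a user name here.
                subprocess.run(["doveadm", "kick", user], check=False)
            processed += 1
    return processed
```

The list of users would have to come from somewhere else (for instance the output of "doveadm who"); parsing that output is left out here because its exact format depends on the installed version.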
> I was thinking it would be great if dovecot didn't kick all the users at the same time during a reload, but gradually, over a specified interval. I'm aware of the shutdown_clients directive that could help, but I don't like it -
I've come to like it very much, once things got huge and busy.
> I do want the clients to get disconnected on dovecot shutdown, and I also want them to log in again within a reasonably short time after a reload.
> Is something like that possible with dovecot, or would it make sense to implement it in a future version?
Run a dovecot proxy (if you have a single box with all these users on it, Mr. Murphy would like a word with you) and set "login_proxy_max_disconnect_delay" to something that suits you.
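To get a feeling for what spreading the disconnects buys, here is a small Python simulation. It is only an illustration under a simplifying assumption (each proxied client is dropped at a uniformly random moment within the delay window); it does not claim to reproduce how Dovecot actually picks the moment, and the function names are made up for this sketch.

```python
import random
from collections import Counter
from typing import List


def disconnect_schedule(n_clients: int, max_delay_seconds: float,
                        seed: int = 0) -> List[float]:
    """Assign each client a disconnect time within [0, max_delay_seconds]."""
    rng = random.Random(seed)
    return [rng.uniform(0, max_delay_seconds) for _ in range(n_clients)]


def peak_per_second(delays: List[float]) -> int:
    """Largest number of disconnects that fall into any one-second bucket."""
    if not delays:
        return 0
    counts = Counter(int(delay) for delay in delays)
    return max(counts.values())


if __name__ == "__main__":
    for window in (0, 30, 60):
        peak = peak_per_second(disconnect_schedule(20000, window))
        print(f"window {window:>2}s: peak {peak} disconnects per second")
```

With a window of 0 all 20,000 clients reconnect in the same second; with a 60-second window the average drops to roughly 333 per second. On the proxy, the setting itself would be a single line such as `login_proxy_max_disconnect_delay = 60 secs`, where the value is just an example.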
Christian
Christian Balzer Network/Systems Engineer
chibi@gol.com Global OnLine Japan/Rakuten Communications
http://www.gol.com/
Hi,
Enabling 'login_proxy_max_disconnect_delay' on the IMAP proxy did the trick. I should have mentioned that I use proxy servers, sorry.
Thanks,
Dave
On 22.4.2017 at 06:25, Christian Balzer wrote:
On 04/21/2017 10:43 AM, dave@evilcigi.eu wrote:
> Everything works fine until I reload the dovecot configuration. When that happens, every client is forced to log in again at the same time, and that causes a huge system load spike (a 5-minute load average of 2000-3000).
A system load of 3000 means that, on average, 3000 processes would like a bit of CPU time. It does not mean that the system is unable to *provide* it to them, as long as the amount of CPU each one needs is small enough. You want to look at the actual percentage of CPU used (and at the total number of processes, to see how close you might be to running out of PIDs).
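As a rough sketch of where to find those numbers on Linux: /proc/loadavg holds the three load averages followed by a "runnable/total" pair of kernel scheduling entities (processes and threads). The helper names below are made up for this example. Note that on Linux the load average also counts tasks in uninterruptible sleep (typically waiting for disk I/O), which is one more reason why a high load does not have to mean a saturated CPU.

```python
from typing import Dict, Union


def parse_loadavg(text: str) -> Dict[str, Union[int, float]]:
    """Parse one line in the format of /proc/loadavg.

    Example line: "2500.00 2100.50 900.25 37/21045 119157"
    """
    fields = text.split()
    runnable, total = fields[3].split("/")
    return {
        "load1": float(fields[0]),
        "load5": float(fields[1]),
        "load15": float(fields[2]),
        "runnable": int(runnable),  # entities runnable right now
        "total": int(total),        # all processes and threads
    }


def read_loadavg(path: str = "/proc/loadavg") -> Dict[str, Union[int, float]]:
    """Read and parse the live values (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())
```

Comparing the "total" value with the limit in /proc/sys/kernel/pid_max gives an idea of the remaining PID headroom; the CPU percentage itself is easier to read from tools like top or vmstat.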
> I was thinking it would be great if dovecot didn't kick all the users at the same time during a reload, but gradually, over a specified interval.
That wouldn't help unless the kicked clients, which will likely try to reconnect immediately, *can* reconnect at once; otherwise, at the end of the grace period, you'd still have 100% of the clients banging on the doors to get back in simultaneously.
So "old" and "new" dovecot would need to run in parallel, with all the incompatibilities and resource restrictions *that* might entail, seeing that there's obviously *something* that changed and prompted you to try and restart dovecot in the first place.
In other words, I don't think that the dovecot code *can* cover all those cases on its own - providing two suitably separate environments (like with Kevin's proxy+docker suggestion) is IMHO the way to go here.
Regards,
Jochen Bern Systemingenieur
Participants (5):
- Christian Balzer
- dave@evilcigi.eu
- dovecot@mtfbwy.cz
- Jochen Bern
- KT Walrus