I posted a few days back asking about configuration issues with a modestly large number of IMAP connections. Several people were kind enough to respond with various ideas. Armed with those ideas and Google, I was able to determine the underlying configuration issues with CentOS 7 and Dovecot 2.2.10. I did some further benchmarking to ensure that we could properly plan our server configuration requirements as we continue to roll this out, so I thought I would share this, partly to see if anyone has experiences that conflict with my findings.
I found that each additional Dovecot IMAP process required about 750 KB of RAM. Simulations confirmed that 8 GB of RAM was sufficient to support approximately 10,000 concurrent IMAP connections (no POP3).
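As a rough cross-check of those numbers, here is a minimal sketch of the arithmetic. The function name estimate_imap_ram_mb is made up for illustration, and the 750 KB default is simply the per-process figure measured above. It ignores the base OS, page cache, and the other Dovecot processes, so treat the result as a lower bound rather than a sizing rule.

```shell
#!/bin/sh
# Rough RAM estimate for N concurrent IMAP connections, assuming one
# imap process per connection.
# estimate_imap_ram_mb is a hypothetical helper, not part of Dovecot.

estimate_imap_ram_mb() {
    connections=$1
    per_proc_kb=${2:-750}    # assumption: ~750 KB per imap process, as measured above
    # Integer arithmetic: KB -> MB, rounded down
    echo $(( connections * per_proc_kb / 1024 ))
}

estimate_imap_ram_mb 10000    # prints 7324
```

That is roughly 7.3 GB for the IMAP processes alone, which fits with 8 GB being sufficient but leaves little headroom for anything else.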
CentOS settings:
Added to /etc/sysctl.d/99-sysctl.conf:

    fs/inotify/max_user_instances = 28000
Added to /etc/security/limits.d/20-nproc.conf:

    * hard nproc 28000
    * soft nproc 28000
    * hard nofile 28000
    * soft nofile 28000

To verify the settings after reboot:

    cat /proc/sys/fs/inotify/max_user_instances
    ulimit -Hn
    ulimit -Sn
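The verification commands above could be wrapped in a small script that compares each value against the target. This is only a sketch: check_limit is a hypothetical helper, and 28000 is just the value used in this post. It reports each value rather than exiting on the first failure, and it has to be run as the user whose limits you care about, since ulimit reports the current shell's limits.

```shell
#!/bin/sh
# Compare current kernel/user limits against a target value.
# check_limit is a hypothetical helper written for this example.

check_limit() {
    name=$1 actual=$2 wanted=$3
    # "unlimited" always passes; a non-numeric or empty value falls through to LOW
    if [ "$actual" = "unlimited" ] || [ "$actual" -ge "$wanted" ] 2>/dev/null; then
        echo "OK   $name=$actual"
    else
        echo "LOW  $name=$actual (want >= $wanted)"
        return 1
    fi
}

target=28000    # assumption: the limit chosen in this post
status=0
check_limit inotify_instances "$(cat /proc/sys/fs/inotify/max_user_instances 2>/dev/null)" "$target" || status=1
check_limit hard_nofile "$(ulimit -Hn)" "$target" || status=1
check_limit soft_nofile "$(ulimit -Sn)" "$target" || status=1
echo "overall status: $status"
```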
Dovecot settings:

Added to service imap {} in /etc/dovecot/conf.d/10-master.conf:

    process_limit = 28000

Added to protocol imap {} in /etc/dovecot/conf.d/20-imap.conf (we are using proxies, and I would like to just turn off this check):

    mail_max_userip_connections = 40

Added to service managesieve {} in /etc/dovecot/conf.d/20-managesieve.conf:

    process_limit = 28000
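For anyone copying these, the three Dovecot settings sit inside their respective blocks roughly as follows. This is a sketch of the relevant fragments only; whatever else your distribution already has inside those blocks is omitted here and should be left in place.

```
# /etc/dovecot/conf.d/10-master.conf
service imap {
  process_limit = 28000
}

# /etc/dovecot/conf.d/20-imap.conf
protocol imap {
  mail_max_userip_connections = 40
}

# /etc/dovecot/conf.d/20-managesieve.conf
service managesieve {
  process_limit = 28000
}
```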
Hope this helps someone. Feedback would be welcome. It would be great if there was a blessed cookbook for Dovecot scaling. At this point it seems a bit like an art form.
Now to my two "easy" questions.
TCP replication between our two CentOS boxes has been working great, no complaints. However, I realized I did not know how to check the status of replication, as one might want to do if one of the two boxes were taken offline for maintenance or repair. On the surface, it would appear that this command would show me what I want to know:

    doveadm replicator status '*'

However, when I tried it, I got:

    doveadm(): Fatal: net_connect_unix(/var/run/dovecot/replicator-doveadm) failed: No such file or directory

The documentation says that doveadm assumes the socket /var/run/dovecot/replicator-doveadm, and the command format provides a "-a" override. In that directory there is no "replicator-doveadm" socket, but I do see a "replicator" socket. So, should I be using the command:

    doveadm replicator status -a /var/run/dovecot/replicator '*'

or is the absence of the replicator-doveadm socket indicative of something I might have done wrong with the config? I hate experimenting more than I have to with a production box. The Dovecot 2.2.10 I am running was installed via yum, so there are no potential compilation issues.
It seems logical to me that:

    dovecot stop

would first lock out any new user connections, then do the equivalent of:

    doveadm kick "*"

before actually shutting everything else down, to ensure the shutdown is as graceful as possible. I suppose I could experiment and find out for certain, but I have the sense that the "stop" command is not quite that elegant, so we have implemented procedures to work around it. Just a point of curiosity. An alternative would be a doveadm command to lock out any new user connections, which could then be followed by kick and stop. I have found that many clients are VERY quick to reconnect after a kick.