On Tue, 29 Nov 2016 14:30:37 +0200 Timo Sirainen wrote:
On 29 Nov 2016, at 2.57, Christian Balzer chibi@gol.com wrote:
service imap {
  # Most of the memory goes to mmap()ing files. You may need to increase this
  # limit if you have huge mailboxes.
  #vsz_limit = $default_vsz_limit
  vsz_limit = 512M

  # Max. number of IMAP processes (connections)
  #process_limit = 1024
  process_limit = 524288
}
..
But after adding a "service_count = 100" line (really any value larger than 1) to the imap section, we get the dreaded:
Nov 28 17:05:40 mbx09 dovecot: config: Warning: service auth { client_limit=16384} is lower than required under max. load (528384)
- Where's the difference in Dovecot's logic between a mail service that has a service count of 1 versus one with >1?
With service_count=1 the imap process disconnects from auth immediately after login. With service_count set to anything other than 1, the auth connection is kept open for the entire lifetime of the imap process. This is mainly because, after dropping privileges, the process would not be able to connect to the auth-master socket again. In theory, if the socket permissions were changed, it could reconnect to auth-master as needed instead of keeping connections open all the time.
Alright then, that's what I was suspecting. Too bad, but totally understandable.
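For readers following along, the combination that triggers the warning can be sketched as below. This is only an illustration assembled from the values quoted in this thread; the comments paraphrase the explanation above, and a real configuration will contain other services that also count toward the auth requirement.

```
service imap {
  # 1: one connection per process; the auth connection is closed right
  # after login. Any other value: the process is reused and keeps its
  # auth connection open for as long as it lives.
  service_count = 100
  process_limit = 524288
}

service auth {
  # The value from the warning. Too low once every imap process may
  # hold an auth connection; the warning asks for 528384.
  client_limit = 16384
}
```

Running doveconf after a change like this should show whether the warning is still emitted.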
- Any way to get the process recycling for IMAP going w/o setting the fd limit to a ridiculous amount?
How about shrinking the imap process_limit? I highly doubt you can actually run 500k imap processes per server and have it still working. The largest I've ever heard people running has been 50k processes per server.
Well, that's the limit these servers could theoretically take IOPS wise, other things like memory might curtail that earlier. Also this is the fail-over level of this cluster pair, a single node normally would only have to handle half of this.
Incidentally, these 2 servers are currently running about 45k IMAP processes each, and the busiest process is (unsurprisingly) dovecot, the master. But that's only using about 20% of one core, and the system is currently running the ondemand CPU governor, so that core is typically only at half speed. It's a pure SSD system; I/O utilization tends to peak around 3% and averages less than 1%. Memory is still half "free" (page cache), and an upgrade of that is planned.
So from where I'm standing, 100k per server (200k in fail-over) should at the very least be easily achievable.
Guess cranking up the fd limit it is then, still got 10 million spares after all.
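A minimal sketch of what "cranking up" could look like, using the figure from the warning quoted earlier. It assumes the rest of the configuration stays unchanged, so the required value would have to be re-checked with doveconf after any change to process limits.

```
service auth {
  # At least the "required under max. load" figure from the warning.
  client_limit = 528384
}
```

The open-files limit of the Dovecot processes has to allow this many descriptors as well. How that is raised (ulimit -n in an init script, LimitNOFILE= in a systemd unit, and so on) depends on how Dovecot is started and is not shown here. The alternative suggested above, lowering the imap process_limit toward the load actually expected, should shrink the required auth client_limit accordingly.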
Thanks for the feedback,
Christian
Christian Balzer Network/Systems Engineer
chibi@gol.com Global OnLine Japan/Rakuten Communications
http://www.gol.com/