On Mon, 2010-11-08 at 08:34 +0100, Ralf Hildebrandt wrote:
I'm getting constantly high numbers of page reclaims & involuntary context switches for dovecot/auth.
page reclaims = minor faults = CPU switching back to system mode. But why is the auth process doing that so excessively? Same for the large number of involuntary context switches...
Hmm. "A page reclaim occurs when a requested page exists on the free list. A page reclaim results in a page fault being satisfied in memory."
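For anyone who wants to see these counters outside of Dovecot: the column names in the stats look like the getrusage(2) fields, so here is a minimal sketch, assuming a Unix system where Python's resource module is available. The helper names (usage_snapshot, touch_fresh_memory) and the 16 MB size are made up for illustration and have nothing to do with Dovecot itself.

```python
import resource


def usage_snapshot():
    """Return the getrusage counters that correspond to the stats columns."""
    ru = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "reclaims": ru.ru_minflt,   # page reclaims = minor faults
        "faults": ru.ru_majflt,     # major faults (needed disk I/O)
        "volcs": ru.ru_nvcsw,       # voluntary context switches
        "involcs": ru.ru_nivcsw,    # involuntary context switches
    }


def touch_fresh_memory(n_bytes, page=4096):
    """Allocate a buffer and write one byte per page (hypothetical helper)."""
    buf = bytearray(n_bytes)
    for i in range(0, n_bytes, page):
        buf[i] = 1
    return buf


before = usage_snapshot()
buf = touch_fresh_memory(16 * 1024 * 1024)  # 16 MB, arbitrary size
after = usage_snapshot()

# Difference in each counter caused by the allocation above.
delta = {name: after[name] - before[name] for name in before}
print(delta)
```

On a typical Linux box the "reclaims" delta goes up while "faults" stays at 0, since freshly allocated memory is satisfied without disk I/O. The exact numbers depend on the kernel and allocator.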
Date: Mon, 08 Nov 2010 01:00:01 +0100
type              real  user   sys reclaim faults swaps bin bout signals volcs involcs
auth            421.98  1.32  1.66   24216      0     0   0    0       0   267     723
managesieve-lo 18616.9 86.77 32.49  319768      1     0 168    0       0 63448   48838
managesieve-login is pretty high here too, much worse than the auth process. Were there tons of logins at that time? Or some brute-force password guessing, or some other DoSing? Perhaps the problem is actually managesieve-login alone? Did you also set service managesieve-login { service_count=0 }?
On my test machine, where imaptest is running, I'm currently seeing something like:
type       real  user  sys  recla faults swaps  bin bout signals volcs involcs
master  1252.14  0.58 2.70 138271      0     0    0   16       0 30101      69
anvil   1252.13  0.27 0.23    336      0     0    0    0       0 16739       6
imap       2.34  0.11 0.20   1548      0     0    0  856       0    22     144
imap-lo    0.11  0.00 0.80    622      0     0    0    0       0     6      19
auth    1248.45  1.26 0.91    841     16     0 3248    0       0 51559     118
log     1252.84  0.86 1.12    347      0     0    8 4560       0 47245      27
config  1252.12 13.57 0.59   1061      0     0    0    0       0 36574     727
lmtp      41.43  0.40 0.80    495     11     0 2328    0       0    25      14
The config process's high user CPU% is expected. Master is doing a lot of page reclaims, which I'd guess is because it's forking a lot.
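If you want to pick out the outliers from this kind of output without eyeballing it, something like the following rough sketch would do. It assumes the rows are whitespace-separated with the twelve columns shown in my test machine stats, and the parse_stats and top_process helpers are hypothetical names, not anything Dovecot ships.

```python
HEADER = "type real user sys recla faults swaps bin bout signals volcs involcs".split()

# Rows copied from the test machine stats in this mail.
RAW = """\
master 1252.14 0.58 2.70 138271 0 0 0 16 0 30101 69
anvil 1252.13 0.27 0.23 336 0 0 0 0 0 16739 6
imap 2.34 0.11 0.20 1548 0 0 0 856 0 22 144
imap-lo 0.11 0.00 0.80 622 0 0 0 0 0 6 19
auth 1248.45 1.26 0.91 841 16 0 3248 0 0 51559 118
log 1252.84 0.86 1.12 347 0 0 8 4560 0 47245 27
config 1252.12 13.57 0.59 1061 0 0 0 0 0 36574 727
lmtp 41.43 0.40 0.80 495 11 0 2328 0 0 25 14
"""


def parse_stats(raw):
    """Turn whitespace-separated stats rows into a list of dicts."""
    rows = []
    for line in raw.splitlines():
        fields = line.split()
        if len(fields) != len(HEADER):
            continue  # skip blank or malformed lines
        row = {"type": fields[0]}
        for name, value in zip(HEADER[1:], fields[1:]):
            row[name] = float(value)
        rows.append(row)
    return rows


def top_process(rows, column):
    """Return the process type with the largest value in the given column."""
    return max(rows, key=lambda row: row[column])["type"]


rows = parse_stats(RAW)
print(top_process(rows, "recla"), top_process(rows, "user"))
# prints: master config
```

This only ranks by absolute counts. Short-lived processes such as imap-lo would look very different if the counters were divided by the real time.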