Quoting Timo Sirainen <tss@iki.fi>:
On 24 Sep 2015, at 16:26, Rick Romero <rick@havokmon.com> wrote:
Update. Only a single reboot has occurred since changing default_vsz_limit from 384M to 512M. It would seem that something the users are doing is causing that virtual memory size to be exceeded (possibly a mailbox search?), and when that occurs Dovecot/FreeBSD is not handling the event as smoothly as expected.
I could maybe understand that a system might reboot in some conditions when it runs out of memory, but you're doing the exact opposite of avoiding that by increasing the vsz limit. It just means that the system is potentially going to use even more memory. And wouldn't FreeBSD have something similar to Linux's out-of-memory killer? I think either your hardware is broken or FreeBSD has some serious bug, and a hardware problem seems more likely to me.
I was thinking along the lines of the process kill handling (? I don't know what actually occurs when the limit is reached - I'm assuming a thread is terminated) was triggering something odd in FreeBSD.
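For illustration only, here is a minimal sketch of what hitting a virtual-memory cap can look like from inside a process. It assumes the limit is enforced as a per-process address-space rlimit (RLIMIT_AS). That is an assumption about the mechanism, not something confirmed in this thread. The 512M value mirrors the setting discussed above. The helper names (cap_address_space, run_limited_child) are hypothetical, and the child is a throwaway Python process, not Dovecot.

```python
import resource
import subprocess
import sys

# Mirrors the 512M value from the thread; purely illustrative.
LIMIT_BYTES = 512 * 1024 * 1024

# Code run in the child: try to allocate 1 GiB, which exceeds the cap.
CHILD_CODE = """
try:
    block = bytearray(1024 * 1024 * 1024)
    print("allocated")
except MemoryError:
    print("allocation refused")
"""


def cap_address_space():
    # Runs in the child just before exec: cap its virtual address space.
    resource.setrlimit(resource.RLIMIT_AS, (LIMIT_BYTES, LIMIT_BYTES))


def run_limited_child():
    # Start a child interpreter under the cap and report how it ended.
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        preexec_fn=cap_address_space,
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    code, output = run_limited_child()
    print("exit code:", code, "| child said:", output)
```

Under that assumption, the kernel refuses the oversized allocation and only the offending process sees the failure. The rest of the system is not involved, which is why a reboot at that moment would be surprising.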
Activity/Usage has increased with the frequency of reboots, at least until I changed that parameter. User activity is the same, reboots have decreased dramatically (Just 1 since the change 9/14). While I wouldn't rule it out, it seems to me that it's less likely to be a hardware problem if I've simply provided MORE memory to a process (or set of processes) to avoid the issue...
This is my current 'top' output. It's not that the system is actually running low on memory, so I had no hesitation increasing the vsz limit.

last pid: 59072;  load averages: 0.30, 0.29, 0.32  up 7+03:02:34  14:27:59
1265 processes: 1 running, 1264 sleeping
CPU: 2.6% user, 0.0% nice, 1.4% system, 0.2% interrupt, 95.9% idle
Mem: 3326M Active, 2210M Inact, 25G Wired, 8828K Cache, 1655M Buf, 1000M Free
ARC: 20G Total, 14G MFU, 4646M MRU, 3845K Anon, 621M Header, 1216M Other
Swap: 4096M Total, 4096M Free
Now, it's entirely possible that the user(s) who were eating all my server resources stopped using the system at the same time I increased the vsz limit, but that seems unlikely.
I'm leaning towards a FreeBSD issue of some sort - but I thought this might be a more appropriate place as I have no hard data, I'm not sure what other software might use a similar vsz limit process/check that could trigger the oddity, and I just wanted it documented somewhere. :)

Rick