On 12/15/06, Timo Sirainen tss@iki.fi wrote:
On Fri, 2006-12-15 at 10:05 -0600, falz wrote:
I'm using quotas, and by default had it set to 'dirsize'. My postfix is already creating 'maildirsize' files, so I opted to change dovecot to use those instead.
However, when I do, I run into the "too many open files" issue that's been reported many times on the list.
So you tried something like "quota=maildir:storage=123456"?
Exactly. The lines I toggle between are:
quota = dirsize:storage=102400
quota = maildir:storage=102400
When it's set to maildir, the open files go up at an incredible rate.
Anyhow, I doubled my max open files in FreeBSD's /boot/loader.conf to kern.maxfiles="24656", but to no avail. When I have it set to maildirsize, the number of open files on the system grows uncontrollably.
How quickly? By doing what exactly? Just logging in, opening a mailbox, saving a new mail, ..?
It's on a live system, probably with at least 20 IMAP connections (an IMAP proxy is running on another server, and it's handling more users than that). So, after simply toggling this setting and reloading dovecot, it probably takes 4 or so minutes for the count to go from a few thousand to 20k.
Any thoughts? TSS, you mention a few times in these notes that you're unable to reproduce this. Perhaps try with the maildir as quota?
The previous problems seemed to have been only with NetBSD 2.0 and kqueue enabled, and I think it's a NetBSD bug.
I saw some older threads, from 2004 and 2005. Unsure if they were completely related or not.
I looked through the maildir quota code, and I do actually see one fd leak in there, but it happens only when deinitializing the quota, so practically it doesn't matter.
You sent an lsof of the dovecot process, but assuming you hadn't dropped thousands of lines from it, it probably wasn't the problematic process. Quota code is handled by the imap process, so check that instead, and also check that lsof is returning thousands of lines. If it isn't, then it's something else that's eating the fds.
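As a rough way to check which process actually holds the descriptors, the lsof output can be tallied per process. The sketch below assumes the default lsof column layout (COMMAND first, PID second) and that the output was saved to a file or string beforehand; the function names and the sample rows are made up for illustration.

```python
from collections import Counter


def count_open_files(lsof_text):
    """Count lsof rows per (command, pid) pair."""
    counts = Counter()
    for line in lsof_text.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and rows without a numeric PID.
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        counts[(fields[0], int(fields[1]))] += 1
    return counts


def top_consumers(lsof_text, n=5):
    """Return the n (command, pid) pairs with the most rows, largest first."""
    return count_open_files(lsof_text).most_common(n)


# Made-up sample in the default lsof layout, for illustration only.
sample = """COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
dovecot 50 root cwd VDIR 0,90 512 2 /
imap 101 vmail cwd VDIR 0,90 512 2 /
imap 101 vmail 3r VREG 0,90 120 77 /var/mail/u/maildirsize
imap 101 vmail 4r VREG 0,90 120 77 /var/mail/u/maildirsize
"""

for (command, pid), total in top_consumers(sample):
    print(total, command, pid)
```

If one imap PID shows thousands of rows, that is the process to look at; if no single process stands out, the descriptors are probably being consumed elsewhere.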
I can't seem to get a full lsof piped to a file when I'm running quotas with the maildir type. I'm assuming that's because the number of open files keeps growing until it hits the system limit.
If I can, I'll see if I can lock the system down and deny logins to all except myself, do a test login with only one user, and still see if it happens.
I should also note that here's what I've got for some of the speed tweaking things in dovecot. Some of this was just guesswork, going on what the wiki and some mailing lists say.
login_process_size = 32
login_process_per_connection = no
login_processes_count = 15
login_max_processes_count = 512
login_max_connections = 1024
For the hell of it, I changed per_connection back to 'yes' (I had only changed it because docs say it's faster that way, which is what I'm shooting for). The same thing still seems to happen- it was in the 2k range for open files prior to this (right now = not many users), and after only about 2 minutes it's already at 5k, and never going down.
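To put a number on that growth, here is a minimal sketch of the arithmetic, assuming the system-wide count is sampled by hand at known times (on FreeBSD, something like `sysctl -n kern.openfiles` should report it). The function names are hypothetical, and the sample values are just the rough figures quoted above.

```python
def growth_per_minute(samples):
    """samples: list of (seconds_since_start, open_file_count), oldest first."""
    (t0, n0), (t1, n1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("need at least two samples taken at different times")
    return (n1 - n0) * 60.0 / (t1 - t0)


def minutes_until_limit(samples, limit):
    """Estimate minutes until the limit is hit, or None if the count is not growing."""
    rate = growth_per_minute(samples)
    if rate <= 0:
        return None
    return (limit - samples[-1][1]) / rate


# Roughly 2k open files, then roughly 5k about two minutes later.
samples = [(0, 2000), (120, 5000)]
print(growth_per_minute(samples))           # 1500.0
print(minutes_until_limit(samples, 24656))  # against kern.maxfiles="24656"
```

This only gives a straight-line estimate between the first and last sample, so it says nothing about whether the growth tracks logins, mailbox opens, or something else.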
Any other debugging info you're looking for, other than lsof output? Is there any way you can think of for me to find out how many login, auth, or other processes are running at one time?
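One rough way to get such a count is to tally command names from a process listing. The sketch below assumes the listing has one command name per line (for example from `ps -axo comm` on FreeBSD) and that the process names in DOVECOT_NAMES match this install; both the name list and the sample listing are assumptions for illustration.

```python
from collections import Counter

# Assumed names of the Dovecot processes of interest; adjust for your install.
DOVECOT_NAMES = ("dovecot", "dovecot-auth", "imap-login", "pop3-login", "imap", "pop3")


def count_processes(ps_text, names=DOVECOT_NAMES):
    """Count how many listed processes carry each of the given names."""
    counts = Counter()
    for line in ps_text.splitlines():
        parts = line.split()
        if not parts:
            continue
        # Keep only the base name in case the listing prints a full path.
        name = parts[0].rsplit("/", 1)[-1]
        if name in names:
            counts[name] += 1
    return counts


# Made-up listing, for illustration only.
sample = """COMMAND
dovecot
dovecot-auth
imap-login
imap-login
imap
imap
imap
sshd
"""

for name, total in count_processes(sample).items():
    print(name, total)
```

Running it every minute or so alongside the open-file count would show whether the number of imap or login processes climbs together with the descriptors.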
thanks --falz