[Dovecot] file descriptor leak?
We are having problems with the latest Dovecot 1.0 (RC15) regarding file descriptors > 255 being handed off to fdopen() via crypt() via PAM(?). This is a problem on Solaris 10 because the stdio library does not support file descriptors > 255 (in order to remain binary-compatible with older binaries). Here are the messages we run into if we get too many (~250?) concurrent IMAP/POP sessions:
Jan 3 13:03:14 hostname dovecot-auth[5799]: crypt: fdopen(265) failed: Too many open files
Jan 3 13:03:14 hostname dovecot-auth[5799]: crypt: fdopen(265) failed: Too many open files
Jan 3 13:03:15 hostname dovecot: auth(default): pam(username,10.1.1.1): Child process died
Jan 3 13:03:15 hostname dovecot: auth(default): PAM: Child 5799 died with signal 11
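To check whether a process has crossed the stdio limit described above, something like the following could be used. This is only a rough sketch: the helper names are made up for illustration, and `open_fds` assumes a procfs that exposes `/proc/<pid>/fd`.

```python
import os

# Highest descriptor the 32-bit Solaris stdio library can handle, per the
# description above.
STDIO_MAX_FD = 255


def fds_beyond_stdio_limit(fds, limit=STDIO_MAX_FD):
    """Return, in ascending order, the descriptors greater than the limit."""
    return sorted(fd for fd in fds if fd > limit)


def open_fds(pid="self"):
    """List open descriptor numbers for a process.

    Assumption: the system exposes /proc/<pid>/fd with one entry per
    open descriptor.
    """
    return [int(name) for name in os.listdir(f"/proc/{pid}/fd") if name.isdigit()]
```

For example, `fds_beyond_stdio_limit(open_fds(pid))`, with `pid` set to the dovecot-auth process ID, would list the descriptors that fdopen() could not use in a 32-bit process.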
Questions:
Is there a file descriptor leak, or are there supposed to be this many open pipes in dovecot-auth? (master dovecot has way more than 256 at this point, but apparently does not use stdio.)
Has anyone tried compiling Dovecot in 64-bit mode (-xarch=v9)? I have tested the fact that this removes the >255 fd limitation in Solaris 9/10.
Has anyone even run into this problem on Solaris? I imagine anyone with more than 250 or so simultaneous IMAP/POP sessions would have had to run into it by now.
Anyone have any suggested workarounds (besides compiling in 64-bit mode)?
We have upwards of 4500 simultaneous IMAP connections alone on some of our servers (running UW IMAP still), so we obviously need to address this problem before we can fully deploy dovecot in our environment.
--
Steven F. Siirila                   Office: Lind Hall, Room 130B
Internet Services                   E-mail: sfs@umn.edu
Office of Information Technology    Voice: (612) 626-0244
University of Minnesota             Fax: (612) 626-7593
On 3.1.2007, at 22.54, Steven F Siirila wrote:
Jan 3 13:03:14 hostname dovecot-auth[5799]: crypt: fdopen(265) failed: Too many open files
Jan 3 13:03:14 hostname dovecot-auth[5799]: crypt: fdopen(265) failed: Too many open files
Jan 3 13:03:15 hostname dovecot: auth(default): pam(username,10.1.1.1): Child process died
Jan 3 13:03:15 hostname dovecot: auth(default): PAM: Child 5799 died with signal 11
Questions:
- Is there a file descriptor leak, or are there supposed to be
this many open pipes in dovecot-auth? (master dovecot has way more than
256 at this point, but apparently does not use stdio.)
Each imap-login and pop3-login connects to dovecot-auth. So if you've
about 250 SSL/TLS connections, or 250 users logging in at the same
time, and login_process_per_connection=yes, I guess this could
happen. So login_process_per_connection=no should work around this.
- Has anyone tried compiling Dovecot in 64-bit mode (-xarch=v9)?
I have tested the fact that this removes the >255 fd limitation in
Solaris 9/10.
A lot of people are using 64bit Dovecot at least with x86-64.
I don't see why crypt() wants to open any files though.
On Wed, Jan 03, 2007 at 11:15:01PM +0200, Timo Sirainen wrote:
On 3.1.2007, at 22.54, Steven F Siirila wrote:
Jan 3 13:03:14 hostname dovecot-auth[5799]: crypt: fdopen(265) failed: Too many open files
Jan 3 13:03:14 hostname dovecot-auth[5799]: crypt: fdopen(265) failed: Too many open files
Jan 3 13:03:15 hostname dovecot: auth(default): pam(username,10.1.1.1): Child process died
Jan 3 13:03:15 hostname dovecot: auth(default): PAM: Child 5799 died with signal 11
Questions:
- Is there a file descriptor leak, or are there supposed to be
this many open pipes in dovecot-auth? (master dovecot has way more than
256 at this point, but apparently does not use stdio.)
Each imap-login and pop3-login connects to dovecot-auth. So if you've
about 250 SSL/TLS connections, or 250 users logging in at the same
time, and login_process_per_connection=yes, I guess this could
happen. So login_process_per_connection=no should work around this.
First off, we don't allow non-SSL/TLS connections. When you say "I guess this could happen" are you saying that there might be a file descriptor leak? Is it normal to have hundreds of file descriptors in use by the master dovecot and the dovecot-auth process? What is the formula for how many file descriptors I SHOULD be seeing in use concurrently for master dovecot, dovecot-auth, etc.?
I will try switching to login_process_per_connection=no, hoping that the problem with file descriptors doesn't move from dovecot-auth to imap-login!
- Has anyone tried compiling Dovecot in 64-bit mode (-xarch=v9)?
I have tested the fact that this removes the >255 fd limitation in
Solaris 9/10.
A lot of people are using 64bit Dovecot at least with x86-64.
I don't see why crypt() wants to open any files though.
Me either. Doesn't the error message imply that crypt is calling fdopen?
--
Steven F. Siirila                   Office: Lind Hall, Room 130B
Internet Services                   E-mail: sfs@umn.edu
Office of Information Technology    Voice: (612) 626-0244
University of Minnesota             Fax: (612) 626-7593
On 4.1.2007, at 0.34, Steven F Siirila wrote:
Each imap-login and pop3-login connects to dovecot-auth. So if you've about 250 SSL/TLS connections, or 250 users logging in at the same time, and login_process_per_connection=yes, I guess this could happen. So login_process_per_connection=no should work around this.
First off, we don't allow non-SSL/TLS connections. When you say "I guess this could happen" are you saying that there
might be a file descriptor leak? Is it normal to have hundreds of file
descriptors in use by the master dovecot and the dovecot-auth process? What
is the formula for how many file descriptors I SHOULD be seeing in use
concurrently for master dovecot, dovecot-auth, etc.?
Each child process has a log output pipe open to master process.
Each imap-login and pop3-login process has a UNIX socket opened to
dovecot-auth process. After user has logged in, the process is only
proxying the SSL/TLS connections. After that it doesn't really need
to have the socket open for dovecot-auth, but currently it does.. I
hadn't thought about this before. This patch should fix it:
http://dovecot.org/list/dovecot-cvs/2007-January/007326.html
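As a rough illustration of the description above (before the patch), the descriptor counts could be estimated like this. It is only a sketch under stated assumptions: the function and parameter names are hypothetical, and it counts only the log pipes and auth sockets mentioned here, so treat the numbers as lower bounds.

```python
def estimate_fds(ssl_sessions, logins_in_progress, other_children=0):
    """Estimate descriptor use with login_process_per_connection=yes.

    Assumptions (not a definitive formula):
    - each SSL/TLS session keeps one login process alive as a proxy,
    - each login in progress also occupies one login process,
    - other_children counts any further child processes (hypothetical
      parameter), each with a log pipe to the master.
    """
    login_processes = ssl_sessions + logins_in_progress
    return {
        # one log output pipe per child process, held by the master
        "master_log_pipes": login_processes + other_children,
        # one UNIX socket per login process, held by dovecot-auth
        "auth_login_sockets": login_processes,
    }


estimate = estimate_fds(ssl_sessions=250, logins_in_progress=10)
# More than 255 sockets in dovecot-auth means later descriptors land
# above the 32-bit Solaris stdio limit.
auth_over_stdio_limit = estimate["auth_login_sockets"] > 255
```

With about 250 SSL/TLS sessions plus a handful of logins in progress, dovecot-auth already holds more than 255 sockets by this estimate, which matches the point where fdopen(265) started failing.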
I will try switching to login_process_per_connection=no, hoping
that the problem with file descriptors doesn't move from dovecot-auth to
imap-login !
If you do that, you should also increase login_processes_count.
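In dovecot.conf that workaround might look roughly like the fragment below. Only the two setting names come from this thread; the count is an illustrative value, not a recommendation, and would need tuning for the actual load.

```
# Let each login process handle many connections instead of one
login_process_per_connection = no

# Number of login processes to keep running (illustrative value only)
login_processes_count = 16
```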
I don't see why crypt() wants to open any files though.
Me either. Doesn't the error message imply that crypt is calling
fdopen?
Yep. Maybe it's connecting to some daemon that handles the crypting.
Or something..
On Thu, Jan 04, 2007 at 01:00:13AM +0200, Timo Sirainen wrote:
On 4.1.2007, at 0.34, Steven F Siirila wrote:
Each imap-login and pop3-login connects to dovecot-auth. So if you've about 250 SSL/TLS connections, or 250 users logging in at the same time, and login_process_per_connection=yes, I guess this could happen. So login_process_per_connection=no should work around this.
First off, we don't allow non-SSL/TLS connections. When you say "I guess this could happen" are you saying that there
might be a file descriptor leak? Is it normal to have hundreds of file
descriptors in use by the master dovecot and the dovecot-auth process? What
is the formula for how many file descriptors I SHOULD be seeing in use
concurrently for master dovecot, dovecot-auth, etc.?
Each child process has a log output pipe open to master process.
That explains the large number of file descriptors in the master process. We have no issues with that process having large fds, only the auth process (due to the crypt() call occurring when fds in use > 256).
Each imap-login and pop3-login process has a UNIX socket opened to
dovecot-auth process. After user has logged in, the process is only
proxying the SSL/TLS connections. After that it doesn't really need
to have the socket open for dovecot-auth, but currently it does.. I
hadn't thought about this before. This patch should fix it:
http://dovecot.org/list/dovecot-cvs/2007-January/007326.html
I am anxious to get this patch installed; however, if you are releasing RC16 "real soon now", I may wait for that instead. Any idea?
I will try switching to login_process_per_connection=no, hoping
that the problem with file descriptors doesn't move from dovecot-auth to
imap-login !
If you do that, you should also increase login_processes_count.
Indeed.
I don't see why crypt() wants to open any files though.
Me either. Doesn't the error message imply that crypt is calling
fdopen?
Yep. Maybe it's connecting to some daemon that handles the crypting.
Or something..
Could be the driver for hardware crypto (this is a Sun T2000). There is a daemon running on the system that could explain this:
daemon 142 1 0 Dec 21 ? 0:29 /usr/lib/crypto/kcfd
--
Steven F. Siirila                   Office: Lind Hall, Room 130B
Internet Services                   E-mail: sfs@umn.edu
Office of Information Technology    Voice: (612) 626-0244
University of Minnesota             Fax: (612) 626-7593
Participants (2): Steven F Siirila, Timo Sirainen