[Dovecot] PostgreSQL connection bug
Hello,
There seems to be a bug which can affect systems using PostgreSQL as a user or password database.
Usually there is just one postgres connection, but if the query fails for any reason, the failed connection is left open and dovecot will keep opening new connections for each auth attempt.
I discovered this with a setup that uses both PAM and a database to authenticate. Most users log in with an account number (integer), which is looked up in the postgres database. Some users have a unix system account and use their system username (string) to log in. Dovecot was set up to check the database first, then try PAM.
When a system user logged in, the SQL query failed because it couldn't coerce the system user's username into an integer.
In these cases, Dovecot would still fall back to PAM authentication, and the user would log in, but the postgres connection would remain stuck. Gradually the number of connections would increase until postgres failed.
I worked around this by converting the account numbers in the database from integers to strings. Now the query never fails and there's only ever 1 dovecot connection. Hopefully this will help someone.
dovecot --version
1.0.15
dovecot -n
# 1.0.15: /etc/dovecot/dovecot.conf
log_timestamp: %Y-%m-%d %H:%M:%S
login_dir: /var/run/dovecot/login
login_executable: /usr/lib/dovecot/imap-login
mail_privileged_group: mail
mail_location: maildir:~/Maildir
auth default:
  passdb:
    driver: sql
    args: /etc/dovecot/dovecot-sql.conf
  passdb:
    driver: pam
  userdb:
    driver: passwd
  userdb:
    driver: prefetch
cat dovecot-sql.conf | egrep -v "^#"
driver = pgsql
connect = host=localhost dbname=rapidgroup user=dovecot password=*********
default_pass_scheme = PLAIN
password_query = SELECT userid as user, password, userdb_mail, userdb_uid, userdb_gid FROM dovecotview WHERE userid='%u'
Cheers, Dan
On Sun, 2009-11-29 at 06:00 +0000, Daniel Howard wrote:
In these cases, Dovecot would still fall back to PAM authentication, and the user would log in, but the postgres connection would remain stuck. Gradually the number of connections would increase until postgres failed.
This sounds weird. You mean if password_query doesn't return any results Dovecot somehow completely hangs/leaks a PostgreSQL connection and creates a new one? Did it log any errors?
We run a mail service with some hundreds of domains and thousands of
users. A 2-CPU Xeon server hosts the imap, pop, webmail, db and all the
trimmings (separate servers scan for spam and viruses).
The server was I/O saturated and performing poorly, especially for
large webmail users. A constant queue of hundreds of ops per second
was taking too many milliseconds each (per gstat). Load average often
spiked over 15! We planned to add a second backend imap/pop machine
but first migrated to dovecot (from courier).
Like a miracle, the load average stays below 1 now, nor do the disks
saturate! All's well and webmail performs snappily even during peak
usage!
Granted, we also upgraded FreeBSD, which has supposedly made leaps in
scheduling tasks across multiple processors, but this is not a
many-core machine, just 2.
We almost don't need to scale up our architecture, though we will. Thanks so
very much for this wonderful software, Timo!
On Wed, 2009-12-09 at 12:43 -0800, Benjamin Connelly wrote:
Like a miracle, the load average stays below 1 now, nor do the disks
saturate! All's well and webmail performs snappily even during peak
usage!
If you still want to reduce disk IO, you can also try (if you haven't already):
- deliver: http://wiki.dovecot.org/LDA/Indexing
- maildir_very_dirty_syncs=yes: http://wiki.dovecot.org/MailLocation/Maildir
If you still want to reduce disk IO, you can also try (if you haven't already):
- deliver: http://wiki.dovecot.org/LDA/Indexing
- maildir_very_dirty_syncs=yes: http://wiki.dovecot.org/MailLocation/Maildir
We did switch from maildrop (and courier authd and cyrus sasl) to just
dovecot (+sieve +manage sieve, a great simplification to build and
maintain) and will ponder very dirty syncs. . .
Let me make myself clearer.
Dovecot opens a postgres connection and sends a SELECT query over it. The response can be:
a) One or more rows of data
b) Zero rows of data (e.g. if the username doesn't exist)
c) An error response (e.g. if the query contains a syntax error)
If a) or b) happens, then dovecot accepts or denies the user, and closes the connection. If c) happens, dovecot denies the user, but leaves the connection open (stuck). Next time someone tries to log on dovecot opens a new connection. This leads to a gradual buildup of connections until the limit is reached and stuff breaks.
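The buildup described above can be reproduced in a toy Python model. This is only a sketch of the reported symptom, not Dovecot's code: `FakeServer`, `FakeConnection` and `BuggyAuth` are hypothetical names, the digit check stands in for the postgres integer coercion error, and the model assumes a single connection is reused while queries succeed.

```python
class QueryError(Exception):
    """Stands in for an error response from the database (case c)."""


class FakeServer:
    """Hypothetical stand-in for postgres that counts open connections."""

    def __init__(self, accounts):
        self.accounts = accounts  # integer account number -> password
        self.open_connections = 0


class FakeConnection:
    def __init__(self, server):
        self.server = server
        self.stuck = False
        server.open_connections += 1

    def query(self, username):
        # Mimic an integer column: a non-numeric username is an error (case c).
        if not username.isdigit():
            self.stuck = True
            raise QueryError(f"invalid input syntax for integer: {username!r}")
        # Case a) yields one row, case b) yields no rows.
        password = self.server.accounts.get(int(username))
        return [] if password is None else [(username, password)]


class BuggyAuth:
    """Models the reported symptom: a failed connection is abandoned, never closed."""

    def __init__(self, server):
        self.server = server
        self.conn = None

    def lookup(self, username):
        if self.conn is None or self.conn.stuck:
            # The previous (stuck) connection is simply dropped, so it leaks.
            self.conn = FakeConnection(self.server)
        try:
            return self.conn.query(username)
        except QueryError:
            return None  # SQL denies the user; the caller may fall back to PAM


server = FakeServer({1001: "secret"})
auth = BuggyAuth(server)
auth.lookup("1001")      # case a
auth.lookup("9999")      # case b
for _ in range(5):
    auth.lookup("dan")   # case c, five system-user logins
print(server.open_connections)
```

In this model the counter stays at one while queries succeed and grows with each further failed query, which matches the gradual increase seen on the real server.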
In my situation, the query was sometimes failing because the data sent was the wrong type (a string, when it should have been an integer - my fault).
Hope that makes sense now. I think dovecot needs a line of code inserted to check whether the query failed, and if it did, log the error, and close the connection gracefully.
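As a rough illustration of that suggestion, here is a minimal Python sketch of "check for failure, log it, close the connection". It is a toy model under stated assumptions, not a patch for Dovecot: `FakeServer`, `FakeConnection` and `CheckedAuth` are hypothetical names, and the digit check stands in for the postgres type error.

```python
import logging

log = logging.getLogger("auth-sketch")


class QueryError(Exception):
    """Stands in for an error response from the database."""


class FakeServer:
    """Hypothetical stand-in for postgres that counts open connections."""

    def __init__(self, accounts):
        self.accounts = accounts  # integer account number -> password
        self.open_connections = 0


class FakeConnection:
    def __init__(self, server):
        self.server = server
        server.open_connections += 1

    def query(self, username):
        # Mimic an integer column: a non-numeric username is an error.
        if not username.isdigit():
            raise QueryError(f"invalid input syntax for integer: {username!r}")
        password = self.server.accounts.get(int(username))
        return [] if password is None else [(username, password)]

    def close(self):
        self.server.open_connections -= 1


class CheckedAuth:
    """Reuses one connection; on a failed query it logs and closes it."""

    def __init__(self, server):
        self.server = server
        self.conn = None

    def lookup(self, username):
        if self.conn is None:
            self.conn = FakeConnection(self.server)
        try:
            return self.conn.query(username)
        except QueryError as exc:
            log.error("password query failed: %s", exc)
            self.conn.close()   # release the connection instead of leaking it
            self.conn = None    # the next lookup opens a fresh one
            return None


server = FakeServer({1001: "secret"})
auth = CheckedAuth(server)
for _ in range(5):
    auth.lookup("dan")          # five failing queries
rows = auth.lookup("1001")      # a normal login afterwards
print(server.open_connections, rows)
```

With this handling, failed queries never leave more than one connection open, and the next successful login simply opens a new one.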
Cheers, Dan
On Wed, 09 Dec 2009 15:24:45 -0500, Timo Sirainen <tss@iki.fi> wrote:
On Sun, 2009-11-29 at 06:00 +0000, Daniel Howard wrote:
In these cases, Dovecot would still fall back to PAM authentication, and the user would log in, but the postgres connection would remain stuck. Gradually the number of connections would increase until postgres failed.
This sounds weird. You mean if password_query doesn't return any results Dovecot somehow completely hangs/leaks a PostgreSQL connection and creates a new one? Did it log any errors?
On Thu, 2009-12-10 at 18:06 +0000, Daniel Howard wrote:
Let me make myself clearer.
Dovecot opens a postgres connection and sends a SELECT query over it. The response can be:
a) One or more rows of data
b) Zero rows of data (e.g. if the username doesn't exist)
c) An error response (e.g. if the query contains a syntax error)
If a) or b) happens, then dovecot accepts or denies the user, and closes the connection.
Except it doesn't, or at least it shouldn't. It should keep using the same connection for all logins.
If c) happens, dovecot denies the user, but leaves the connection open (stuck). Next time someone tries to log on dovecot opens a new connection. This leads to a gradual buildup of connections until the limit is reached and stuff breaks.
I tested that this doesn't happen with v1.2. It may have been a bug in v1.0.
participants (3)
- Benjamin Connelly
- Daniel Howard
- Timo Sirainen