[Dovecot] Connection leak in sqlpool for ACL Dict Postgres in v2.1.15 (14907:b96df105ec55) and v2.2.beta1 (15857:07dfd4391d22)
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
Hi,
I'm using a postgres dict for ACL, when when use deleteacl or setacl I get a "not connected" error and (v2.2) Panic: file driver-pgsql.c: line 84 (driver_pgsql_set_state): assertion failed: (state == SQL_DB_STATE_BUSY || db->cur_result == NULL) and a rawdump eventually.
You can trigger the error with the "dict" socket easily:
while true; do ( echo -en "H2\t0\t0\t\tacl\nI1\tshared/shared-boxes/\n" sleep 1 ) | socat - unix:/var/run/dovecot2.2/dict done
I interprete H2-I1..-bang.log.gz (logged with v2.1) so, that each time the ACL dict is iterated, driver_sqlpool_query_s() is executed two times, once for user_shares and once for anyone_shares. I guess that the user_shares one is never finished, so that the sqlpool keeps that slot in "busy" state until timeout.
The information about the slots of the log lines
0x9886d0 1. state=3 to_recon=(nil) ready=0 |driver_sqlpool_query_s end
is generated by:
conns = array_get(&db->all_connections, &count);
for (i = 0; i < count; i++) {
struct sql_db *conndb = conns[i].db;
i_info("SKA: POOL: [%u] %p %u. state=%u to_recon=%p ready=%u |%s", getpid(), db, i, conndb->state, conndb->to_reconnect, SQL_DB_IS_READY(conndb)? 1: 0, msg);
}
msg is the function, where that log function is called from, e.g. at the end of driver_sqlpool_query_s() in this case, which is right before the return.
When the user_shares query ends in driver_sqlpool_query_s(), one additional slot of the sqlpool is in state 3, at the end of driver_sqlpool_query_s() of the anyone_shares query there are two slots in state 3. When the next I1 command triggers the next iteration, there is one slot still in state 3 at the start of driver_sqlpool_query_s(), which seems to be the slot used for the user_shares query of the previous I1 command.
It looks like that the user_shares query has not been "finished" correctly and is left open. The assert() panics, because db->cur_result is != NULL.
When all slots in the pool have been used up, the sqlpool enters "not connected" state and all dict connections fail with this error now. Then the assert causes the dict to die eventually, I guess, because of a timeout.
The quota dict seems to not trigger that behaviour, because it uses the "L" (lookup) command, but when the sqlpool is in "not connected" state, these commands fail, too.
Attached you'll find the config and log file.
Kind regards,
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux)
iQEVAwUBUSTYgF3r2wJMiz2NAQLCGggAqcAThH8/phYIGxnkjEt3ECGBabCvDO72 L28rjvaakFffy5jmhrLrcV2b765W9XsZ/bhNOhfkQADAf8oCU9Z5q4h7wG+ZnoMj 4sd1Zax7wU/8idmrF1esSYAfZ1kCVq/ySHY7zs8stoWcpiUk6uQMN4ynYS7c3kxJ yaOWz31s0r4pS8muktq9qCwjlCXDjTigIiPbw9X5qpJ8hnvJq9Xv/BfW5uf1xKKK BXNB+HrQYrOA6VD3KCjteZhaquQFd18B9b35WkBH77thwSbaN1/R4zy8CLLpZOkS 5ss5K/doPitCunGRcNiVD8srgUU6Po7Zv0akBmE1XUfBKkn6JUILDQ== =VpqF -----END PGP SIGNATURE-----
= Mailscanner Information [FB02]
== Rename Filename At least one attachement has been renamed to overcome a potential filename attack. Right-click on the item and select SaveAs, in order to save the file with its original name. Mindestens ein Anhang wurde umbenannt, um eine potentielle Attacke mit unleserlichen Dateinamen zu verhinden. Um ihn mit seinem orginialen Namen zu speichern, klicken Sie rechts auf den Anhang und wählen SaveAs. See FAQ [http://www5.inf.fh-bonn-rhein-sieg.de/FAQ/mail/scan/] for more information.
NewName |OldName ------------------+------------------- H2-I1.-bang.log.gz|H2-I1..-bang.log.gz
-- This information was created by the mailscanner automatically. No human interferred with the process.
On 20.2.2013, at 16.06, Steffen Kaiser <skdovecot@smail.inf.fh-brs.de> wrote:
I'm using a postgres dict for ACL, when when use deleteacl or setacl I get a "not connected" error and (v2.2) Panic: file driver-pgsql.c: line 84 (driver_pgsql_set_state): assertion failed: (state == SQL_DB_STATE_BUSY || db->cur_result == NULL) and a rawdump eventually.
You can trigger the error with the "dict" socket easily:
while true; do ( echo -en "H2\t0\t0\t\tacl\nI1\tshared/shared-boxes/\n" sleep 1 ) | socat - unix:/var/run/dovecot2.2/dict done
Thanks for the easy test :) I've heard of these crashes for a quite a long time but haven't really figured out how to reproduce them. Also I always thought the bug was in driver-pgsql code, while it was really in dict-sql code. Fixed: http://hg.dovecot.org/dovecot-2.1/rev/0e0fd4b5a582
Although of course the driver-pgsql code is also quite ugly and complex and I wouldn't be surprised if there were some bugs in its error handling code paths. But I guess it can be left for later..
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On Fri, 22 Feb 2013, Timo Sirainen wrote:
Thanks for the easy test :) I've heard of these crashes for a quite a long time but haven't really figured out how to reproduce them. Also I always thought the bug was in driver-pgsql code, while it was really in dict-sql code. Fixed: http://hg.dovecot.org/dovecot-2.1/rev/0e0fd4b5a582
Many thanks, it works now :-) Please, don't forget to add the patch to v2.2. I didn't found it in its "hg incoming".
Kind regards,
Steffen Kaiser -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux)
iQEVAwUBUSdNHV3r2wJMiz2NAQKd0AgAhucI89hun6yIzQYtom0u428106/rxPGJ 6Y+/K/tLbzGOr0Hm7g0AjMNU8YDIhOeAKZiXJ5Dt2Lm71425jWUlvWpKZnUfOehH RoIzTLCKMXs/3HIdqIoEVNHnpV4JSWKMqsJIvMFLvewRhHM/MiLo2WUaQo3DZDSB ggIq8zKovbliZi50ZTdXnnKTbDrKUaD66jj8CGJKJj17zu6jlLyIkAbJu62qsNcf Np4P21ffgTurjM5LMFVpIlhshhtB7YoZ3WOEjrzAC5yKc0ifbN9aN0dRVl2eraRo fHwud22+wi5lfvQR0ANZ5eNyahdZEHdCaeaKK34ZgM+SllIu5mdfGg== =6NNC -----END PGP SIGNATURE-----
On 22.2.2013, at 12.49, Steffen Kaiser <skdovecot@smail.inf.fh-brs.de> wrote:
On Fri, 22 Feb 2013, Timo Sirainen wrote:
Thanks for the easy test :) I've heard of these crashes for a quite a long time but haven't really figured out how to reproduce them. Also I always thought the bug was in driver-pgsql code, while it was really in dict-sql code. Fixed: http://hg.dovecot.org/dovecot-2.1/rev/0e0fd4b5a582
Many thanks, it works now :-) Please, don't forget to add the patch to v2.2. I didn't found it in its "hg incoming".
I'm nowadays just once in a while pulling all changes from v2.1 tree to v2.2 tree and merging them. Not sure if that's the best way to do it, but works at least and there's no way of accidentally missing any changes.
participants (2)
-
Steffen Kaiser
-
Timo Sirainen