[Dovecot] auth-ldap not resetting connection state after failed bind
timo,

i think i found a small problem with the ldap implementation: when using auth_bind (this might be in all conditions, not just that one - i haven't tested further), if the server lost connection to the ldap server (ie, the ldap server died) dovecot-auth would never reconnect to the ldap server and all subsequent auth attempts would fail.

after a little more digging, i discovered that if the ldap server went down and came back up before the next dovecot request then everything would be fine. however, if a request came in while the ldap server was down then dovecot-auth would "cache" that the server was unavailable and never recheck it.

i believe i tracked it down to a couple lines in db_ldap_bind and fixed it - dovecot-auth is reconnecting to ldap in the condition where it was not previously:

--- dovecot-1.0.3/src/auth/db-ldap.c 2007-10-15 18:26:55.983349000 +0000
+++ dovecot-1.0.3/src/auth/db-ldap.c.new 2007-10-15 18:28:03.124136000 +0000
@@ -446,7 +446,10 @@
     msgid = ldap_bind(conn->ld, conn->set.dn, conn->set.dnpass,
               LDAP_AUTH_SIMPLE);
     if (msgid == -1) {
-        db_ldap_connect_finish(conn, ldap_get_errno(conn));
+        if (db_ldap_connect_finish(conn, ldap_get_errno(conn)) < 0) {
+            /* lost connection, close it */
+            ldap_conn_close(conn, TRUE);
+        }
         i_free(ldap_request);
         return -1;
     }
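To make the failure mode described above easier to follow, here is a minimal, self-contained C sketch. It is not Dovecot code: struct mock_server, struct mock_conn, mock_conn_close and mock_conn_bind are hypothetical names invented for illustration, and the "stale handle" flag is only an assumed model of the state that dovecot-auth appears to keep after a failed bind. The close_on_bind_failure field switches between the behaviour before and after the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the LDAP server: it is either up or down. */
struct mock_server {
    bool up;
};

/* Hypothetical, heavily simplified connection state (not Dovecot's struct). */
struct mock_conn {
    struct mock_server *server;
    bool handle_open;           /* a handle exists, possibly a stale one */
    bool bound;                 /* the bind succeeded on this handle */
    bool close_on_bind_failure; /* true models the patched behaviour */
};

/* Drop all connection state so that the next bind starts from scratch. */
static void mock_conn_close(struct mock_conn *conn)
{
    conn->handle_open = false;
    conn->bound = false;
}

/* Returns 0 when the connection is bound, -1 on failure. */
static int mock_conn_bind(struct mock_conn *conn)
{
    if (conn->bound)
        return 0;
    if (conn->handle_open) {
        /* A handle left over from an earlier failed bind: in this model
           nothing ever retries, so every later request fails too. */
        return -1;
    }
    conn->handle_open = true;
    if (!conn->server->up) {
        /* Bind failed. With the fix, the connection is closed so that
           the next request opens a fresh one. */
        if (conn->close_on_bind_failure)
            mock_conn_close(conn);
        return -1;
    }
    conn->bound = true;
    return 0;
}
```

In this model, a bind attempted while the server is down fails either way. Once the server is back, only the connection that was closed on failure binds successfully; the other one keeps returning -1, which matches the "cached" unavailability described in the message.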
On Mon, 2007-10-15 at 15:32 -0400, Brendan Braybrook wrote:
i believe i tracked it down to a couple lines in db_ldap_bind and fixed it - dovecot-auth is reconnecting to ldap in the condition where it was not previously:
timo,

i think i found another spot where it won't reconnect. when ldap_conn_reconnect was getting called, it wasn't completely reconnecting, and the requests in conn->delayed_requests_tail would never be processed.

when i changed the code to force a connection close at the start of ldap_conn_reconnect then it would reconnect successfully. this does cause auth failures when ldap is unconnected (which from my limited understanding of the code appears to not be the original desire), but it does cause the system to recover gracefully. you might be able to come up with a better way to handle this (my c is weak).

here's a patch that incorporates that one small change and the previous one as well:

--- dovecot-1.0.3/src/auth/db-ldap.c.orig 2007-12-19 22:01:46.622328000 +0000
+++ dovecot-1.0.3/src/auth/db-ldap.c 2007-12-19 22:03:08.145721000 +0000
@@ -294,7 +294,7 @@
 static void ldap_conn_reconnect(struct ldap_connection *conn)
 {
-    ldap_conn_close(conn, FALSE);
+    ldap_conn_close(conn, TRUE);

     if (db_ldap_connect(conn) < 0) {
         /* failed to reconnect. fail all requests. */
@@ -446,7 +446,10 @@
     msgid = ldap_bind(conn->ld, conn->set.dn, conn->set.dnpass,
               LDAP_AUTH_SIMPLE);
     if (msgid == -1) {
-        db_ldap_connect_finish(conn, ldap_get_errno(conn));
+        if (db_ldap_connect_finish(conn, ldap_get_errno(conn)) < 0) {
+            /* lost connection, close it */
+            ldap_conn_close(conn, TRUE);
+        }
         i_free(ldap_request);
         return -1;
     }
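The trade-off described above (pending requests fail while ldap is down, but nothing is left stuck in the queue) can be sketched roughly as follows. This is an illustrative model only, not Dovecot's queuing code: struct mock_queue_conn, queue_request, finish_delayed and mock_reconnect are hypothetical names, and the fixed-size array is an assumption that stands in for the real delayed-request list.

```c
#include <stdbool.h>

#define MAX_DELAYED 8

enum req_status { REQ_PENDING, REQ_DONE, REQ_FAILED };

/* Hypothetical connection with a small queue of delayed requests. */
struct mock_queue_conn {
    bool server_up;                        /* stand-in for the LDAP server */
    bool connected;
    enum req_status *delayed[MAX_DELAYED]; /* requests waiting for a connection */
    int delayed_count;
};

/* Park a request until the connection is usable. Returns -1 if the queue is full. */
static int queue_request(struct mock_queue_conn *conn, enum req_status *req)
{
    if (conn->delayed_count == MAX_DELAYED)
        return -1;
    *req = REQ_PENDING;
    conn->delayed[conn->delayed_count++] = req;
    return 0;
}

/* Give every delayed request a final status and empty the queue. */
static void finish_delayed(struct mock_queue_conn *conn, enum req_status status)
{
    for (int i = 0; i < conn->delayed_count; i++)
        *conn->delayed[i] = status;
    conn->delayed_count = 0;
}

/* Reconnect: always start from a fully closed connection. Returns 0 or -1. */
static int mock_reconnect(struct mock_queue_conn *conn)
{
    conn->connected = false; /* forced close before reconnecting */
    if (!conn->server_up) {
        /* failed to reconnect: fail all requests rather than leave them queued */
        finish_delayed(conn, REQ_FAILED);
        return -1;
    }
    conn->connected = true;
    finish_delayed(conn, REQ_DONE); /* pretend each delayed request now succeeds */
    return 0;
}
```

In this sketch every queued request ends up either REQ_DONE or REQ_FAILED after a reconnect attempt, so a client may see an auth failure while the server is down, but no request waits forever.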
On Thu, 2007-12-20 at 13:35 -0500, Brendan wrote:
i believe i tracked it down to a couple lines in db_ldap_bind and fixed it - dovecot-auth is reconnecting to ldap in the condition where it was not previously:
timo, i think i found another spot where it won't reconnect.
when ldap_conn_reconnect was getting called, it wasn't completely reconnecting, and the requests in conn->delayed_requests_tail would never be processed.
I noticed that dovecot-auth went into an infinite loop. Fixed v1.0 the same way you did: http://hg.dovecot.org/dovecot-1.0/rev/1a87f8495e07
And rewrote the queuing code for v1.1: http://hg.dovecot.org/dovecot/rev/0dcea80312b0
participants (3)
- Brendan
- Brendan Braybrook
- Timo Sirainen