I believe I tracked it down to a couple of lines in db_ldap_bind and fixed it. dovecot-auth now reconnects to LDAP in the case where it previously did not:
Timo, I think I found another spot where it won't reconnect. When ldap_conn_reconnect was called, it wasn't completely reconnecting, and the requests in conn->delayed_requests_tail would never be processed. When I changed the code to force a connection close at the start of ldap_conn_reconnect, it reconnected successfully. This does cause auth failures while LDAP is disconnected (which, from my limited understanding of the code, appears not to be the original intent), but it does let the system recover gracefully. You might be able to come up with a better way to handle this (my C is weak). Here's a patch that incorporates that one small change and the previous one as well:

--- dovecot-1.0.3/src/auth/db-ldap.c.orig 2007-12-19 22:01:46.622328000 +0000
+++ dovecot-1.0.3/src/auth/db-ldap.c 2007-12-19 22:03:08.145721000 +0000
@@ -294,7 +294,7 @@
 static void ldap_conn_reconnect(struct ldap_connection *conn)
 {
-        ldap_conn_close(conn, FALSE);
+        ldap_conn_close(conn, TRUE);
         if (db_ldap_connect(conn) < 0) {
                 /* failed to reconnect. fail all requests. */
@@ -446,7 +446,10 @@
         msgid = ldap_bind(conn->ld, conn->set.dn, conn->set.dnpass,
                           LDAP_AUTH_SIMPLE);
         if (msgid == -1) {
-                db_ldap_connect_finish(conn, ldap_get_errno(conn));
+                if (db_ldap_connect_finish(conn, ldap_get_errno(conn)) < 0) {
+                        /* lost connection, close it */
+                        ldap_conn_close(conn, TRUE);
+                }
                 i_free(ldap_request);
                 return -1;
         }
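
To make the trade-off in the first hunk easier to see, here is a rough, self-contained toy model of a connection with a delayed-request queue. Everything in it (the toy_* types, functions, and the server_up flag) is hypothetical and made up for illustration; it is not Dovecot's db-ldap.c code and does not reproduce the original hang. It only shows what happens to queued requests when the close step does or does not fail them before reconnecting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not Dovecot's actual code. */
enum toy_state { TOY_PENDING, TOY_DONE, TOY_FAILED };

struct toy_request {
    enum toy_state state;
    struct toy_request *next;
};

struct toy_conn {
    bool connected;
    bool server_up; /* stands in for a reachable LDAP server */
    struct toy_request *delayed_head, *delayed_tail;
};

/* Queue a request that cannot be sent yet. */
static void toy_delay_request(struct toy_conn *conn, struct toy_request *req)
{
    req->state = TOY_PENDING;
    req->next = NULL;
    if (conn->delayed_tail == NULL)
        conn->delayed_head = req;
    else
        conn->delayed_tail->next = req;
    conn->delayed_tail = req;
}

/* Finish every queued request with the given state and empty the queue. */
static void toy_drain(struct toy_conn *conn, enum toy_state state)
{
    struct toy_request *req = conn->delayed_head;

    while (req != NULL) {
        struct toy_request *next = req->next;
        req->state = state;
        req->next = NULL;
        req = next;
    }
    conn->delayed_head = conn->delayed_tail = NULL;
}

/* Close the connection; optionally fail whatever is still queued. */
static void toy_close(struct toy_conn *conn, bool fail_requests)
{
    conn->connected = false;
    if (fail_requests)
        toy_drain(conn, TOY_FAILED);
}

/* Reconnect. Returns 0 on success, -1 if the server is unreachable. */
static int toy_reconnect(struct toy_conn *conn, bool fail_requests)
{
    assert(conn != NULL);
    toy_close(conn, fail_requests);
    if (!conn->server_up) {
        /* failed to reconnect: fail all remaining requests */
        toy_drain(conn, TOY_FAILED);
        return -1;
    }
    conn->connected = true;
    /* requests that survived the close are sent and completed */
    toy_drain(conn, TOY_DONE);
    return 0;
}
```

In this model, calling toy_reconnect(&conn, true) leaves every previously queued request in TOY_FAILED even when the reconnect itself succeeds, which corresponds to the auth failures mentioned above. Calling it with false keeps the queue across the reconnect, so the requests finish as TOY_DONE if the server is reachable.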