[Dovecot] Errors in dovecot 1 beta7
Hi, I just installed dovecot 1.0 beta 7 and I have these errors:
May 5 09:21:27 mail dovecot: pop3-login: Login: user=<user1>, method=PLAIN, rip=::ffff:xx.xx.xx.6, lip=::ffff:xx.xx.xx.1 May 5 09:21:27 mail dovecot: POP3(user2): Disconnected: Logged out top=0/0, retr=0/0, del=0/0, size=0 May 5 09:21:27 mail dovecot: pop3-login: Login: user=<user3>, method=PLAIN, rip=::ffff:xx.xx.xx.12, lip=::ffff:xx.xx.xx.1 May 5 09:21:27 mail dovecot: auth(default): userdb(user4,::ffff:xx.xx.xx.207): user not found from userdb May 5 09:21:27 mail dovecot: pop3-login: Internal login failure: user=<user4>, method=PLAIN, rip=::ffff:xx.xx.xx.207, lip=::ffff:xx.xx.xx.1 May 5 09:21:27 mail dovecot: POP3(user1): Disconnected: Logged out top=0/0, retr=0/0, del=0/0, size=0
May 5 09:36:49 mail dovecot: pop3-login: Login: user=<user5>, method=PLAIN, rip=::ffff:xx.xx.xx.203, lip=::ffff:xx.xx.xx.1 May 5 09:36:49 mail dovecot: POP3(user5): Disconnected: Logged out top=0/0, retr=0/0, del=0/0, size=0 May 5 09:36:50 mail dovecot: auth(default): BROKEN NSS IMPLEMENTATION: getpwnam() lookup returned different user than was requested (user6 != user7). May 5 09:36:50 mail dovecot: child 8454 (auth) returned error 89 May 5 09:37:50 mail dovecot: pop3-login: Disconnected: Inactivity: user=<user7>, method=PLAIN, rip=::ffff:xx.xx.xx.32, lip=::ffff:xx.xx.xx.1
I run dovecot in CentOS 4.3 using openldap/nss_ldap/nscd
The problem get resolved when I restart dovecot. Is this a problem in dovecot or in nss_ldap/nscd? How can I debug even further?
Many thanks Oliver
-- Oliver Schulze L. oliver@samera.com.py
On Fri, 2006-05-05 at 09:48 -0400, Oliver Schulze L. wrote:
May 5 09:36:50 mail dovecot: auth(default): BROKEN NSS IMPLEMENTATION: getpwnam() lookup returned different user than was requested (user6 != user7). .. The problem get resolved when I restart dovecot. Is this a problem in dovecot or in nss_ldap/nscd?
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154314
I'd recommend not using nss_ldap but using Dovecot's LDAP support directly.
Hi Timo, will do
Many Thanks Oliver
Timo Sirainen wrote:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154314
I'd recommend not using nss_ldap but using Dovecot's LDAP support directly.
-- Oliver Schulze L. oliver@samera.com.py
Hi Timo, I have been reading that bug comments and it seems stalled.
Also, I have been doing a stress test on the server. I call getpwnam from php for 10 users, 10000 times each user. I pause 200ms between getpwnam() calls. I can make the server break again :( I tryied turning nscd on and off with no luck. nss_ldap seems to work now, meanwhile I will keep watching the logs for a week and see what happend.
Also, I posted the bug in the nss_ldap list to see if someone knows about the issue. In CentOS 4.3, the nss_ldap version is 226, the latest is 250. So, upgrading is an option to me if the problem occurs again. Right now, 1beta7 is rock solid!
Thanks Oliver
Timo Sirainen wrote:
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154314
I'd recommend not using nss_ldap but using Dovecot's LDAP support directly.
-- Oliver Schulze L. oliver@samera.com.py
Hi Timo, about bug #: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154315
That patch adds a warning message, but it does not handle the error. The problem I have seen is that once a "BROKEN NSS IMPLEMENTATION" is detected, dovecot can not handle any login anymore. So, the only solution is to manually restart dovecot.
It would be nice to have dovecot handle the error and return an error or something, so that other users still can login. The affected user should see an error like: "try again later" or :bad password".
I think this will help for old RedHat installations or to have a solution while a patch for nss_ldap is found.
Many Thanks Oliver
-- Oliver Schulze L. oliver@samera.com.py
On Sat, 2006-05-06 at 10:49 -0400, Oliver Schulze L. wrote:
Hi Timo, about bug #: https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=154315
That patch adds a warning message, but it does not handle the error. The problem I have seen is that once a "BROKEN NSS IMPLEMENTATION" is detected, dovecot can not handle any login anymore. So, the only solution is to manually restart dovecot.
I don't really see how that can happen. When Dovecot gives that error message, it kills the dovecot-auth process completely and it gets restarted.
If you manually restart Dovecot it does pretty much the same thing except it also restarts dovecot master process, but I don't see how that would affect this at all.
It would be nice to have dovecot handle the error and return an error or something, so that other users still can login. The affected user should see an error like: "try again later" or :bad password".
If restarting the whole dovecot-auth process doesn't fix it, then I don't see how it could even be fixed any better.
I guess one possibility would be to just try the getpwnam() again if the first time returns a failure, but I don't know if that works.
In any case I don't really even want to try to work around this. Better would be to just get it really fixed where the bug is, or just not use nss_ldap at all.
Timo Sirainen wrote:
I don't really see how that can happen. When Dovecot gives that error message, it kills the dovecot-auth process completely and it gets restarted.
It happened only once, I got once the error: "BROKEN NSS IMPLEMENTATION" and later, no login was accepted, I got this error: dovecot: pop3-login: Can't connect to auth server at default: Resource temporarily unavailable
Looks like its nss_ldap fault too. Or maybe nscd stopped working. Will debug a litle more.
I agree that nss_ldap should be fixed, meanwhile, I will let dovecot running with nss_ldap for one more week to get debug messages and later change to ldap.
Many Thanks Oliver
-- Oliver Schulze L. oliver@samera.com.py
I see these errors talso: May 6 09:02:10 mail dovecot: pop3-login: Disconnected: Inactivity: rip=::ffff:xx.xx.xx.13, lip=::ffff:xx.xx.xx.1 May 6 09:02:10 mail dovecot: auth(default): Master request 21566.1 not found May 6 09:02:10 mail dovecot: pop3-login: Disconnected: Inactivity: user=<user2>, method=PLAIN, rip=::ffff:xx.xx.xx.12, lip=::ffff:xx.xx.xx.1 May 6 09:02:10 mail dovecot: pop3-login: Master sent reply with unknown tag 1 May 6 09:02:10 mail dovecot: child 21566 (login) returned error 89 May 6 09:02:13 mail dovecot: pop3-login: Login: user=<user1>, method=PLAIN, rip=::ffff:xx.xx.xx.10, lip=::ffff:xx.xx.xx.1
What does "Master request 21566.1 not found" means?
Many thanks Oliver
-- Oliver Schulze L. oliver@samera.com.py
On Sat, 2006-05-06 at 10:17 -0400, Oliver Schulze L. wrote:
I see these errors talso: May 6 09:02:10 mail dovecot: pop3-login: Disconnected: Inactivity: rip=::ffff:xx.xx.xx.13, lip=::ffff:xx.xx.xx.1 May 6 09:02:10 mail dovecot: auth(default): Master request 21566.1 not found May 6 09:02:10 mail dovecot: pop3-login: Disconnected: Inactivity: user=<user2>, method=PLAIN, rip=::ffff:xx.xx.xx.12, lip=::ffff:xx.xx.xx.1 May 6 09:02:10 mail dovecot: pop3-login: Master sent reply with unknown tag 1 May 6 09:02:10 mail dovecot: child 21566 (login) returned error 89 May 6 09:02:13 mail dovecot: pop3-login: Login: user=<user1>, method=PLAIN, rip=::ffff:xx.xx.xx.10, lip=::ffff:xx.xx.xx.1
What does "Master request 21566.1 not found" means?
From my TODO:
- imaptest: imap-login: Master sent reply with unknown tag 1
- client closed connection at the exact same time master was logging it in? master_request_abort()
Haven't bothered to fix that yet, since if it happens only at disconnect it doesn't break anything.
participants (2)
-
Oliver Schulze L.
-
Timo Sirainen