auth-worker goes wild

Hans Morten Kind Kind at adm.uib.no
Tue Oct 14 16:56:12 UTC 2014


dovecot-2.2.13 running on Solaris-10, app 5000 concurrent connections,
and a pretty standard config where passdb is set up with driver = pam.

We have had some situations where an auth-worker gets into a loop, the load on
the server rises and the auth-master seems blocked untill finally no new imap nor
pop session are accepted. A 'doveadm reload' restores normal operations.

With debug logging enabled we are observing this:
  auth: Debug: pam(,10.10.30.33,<7WMu3UkEGQCBsR4h>): cache miss
  auth-worker(24556): Debug: pam(,10.10.30.33): lookup service=dovecot
  auth-worker(24556): Debug: pam(,10.10.30.33): #1/1 style=2 msg=Please enter user name:
  ...
  auth-worker(24556): Debug: pam(,10.10.30.33): #1/1 style=2 msg=Please enter user name:
  etc etc

After some time dovecot only logs
  Warning: Auth process not responding, delayed sending initial response (greeting)
  Disconnected: Auth process broken (disconnected before auth was ready, waited 26 secs)

'truss -wall -aep PID' on the looping auth-worker shows the loop:
   0.0757 write(2, 0x080AE0C0, 71)                        = 71
    0101 2 4 5 5 6   p a m ( , 1 0 . 1 0 . 3 0 . 3 3 ) :   # 1 /
     1   s t y l e = 2   m s g = P l e a s e   e n t e r   u s e r  
     n a m e :  \n
   0.0826 write(2, 0x080AE0C0, 71)                        = 71
    0101 2 4 5 5 6   p a m ( , 1 0 . 1 0 . 3 0 . 3 3 ) :   # 1 /
     1   s t y l e = 2   m s g = P l e a s e   e n t e r   u s e r  
     n a m e :  \n

The host at 10.10.30.33 is running RoundCube for app 1000 concurrent users.
This problem started after we provided our users RoundCube, and every time
we end up in this stuck situation, it is the RoundCube-server making the
connections where the username has zero length. I am unable to reproduce
how RoundCube puts dovecot into this mess.

With the enclosed patch installed, the loop terminates and normal operation
are restored immediately:
  auth: Debug: pam(,10.10.30.33,<Q8kV6AEFYwCBsR4h>): cache miss
  auth-worker(18658): Debug: pam(,10.10.30.33): lookup service=dovecot
  auth-worker(18658): Debug: pam(,10.10.30.33): #1/1 style=2 msg=Please enter user name:
  auth-worker(18658): Fatal: NO USER?
  auth-worker(18658): Error: Raw backtrace: 0xfef0798c -> 0xfef06d3b -> 0x807ca6e -> 0xfecc460e -> 0xfecc1d1e -> 0xfe3c0af4 -> 0xfecc2608 -> 0xfecc27f1 -> 0x807cc5e -> 0x807d162 -> 0x807d310 -> 0x8069ca1 -> 0x806ab72 -> 0x806aeec -> 0xfef1e866 -> 0xfef2001c -> 0xfef1e9ff -> 0xfef1e956 -> 0xfeeb0172 -> 0x8071bf2 -> 0x805a650
  auth: Error: auth worker: Aborted PASSV request for : Worker process died unexpectedly



I think however, that the pam driver should not have been initiated when the username
has zero length, definitely there is nobody around to answer the prompt for a username ...

hmk

-------------- next part --------------
Common subdirectories: orig/auth and src/auth
diff -ru orig/auth/passdb-pam.c src/auth/passdb-pam.c
--- orig/auth/passdb-pam.c	ti. okt. 14 12:04:16 2014
+++ src/auth/passdb-pam.c	ti. okt. 14 12:02:28 2014
@@ -85,6 +85,8 @@
 			string = strdup(ctx->request->user);
 			if (string == NULL)
 				i_fatal_status(FATAL_OUTOFMEM, "Out of memory");
+			if (strlen(string) == 0)
+				i_fatal_status(FATAL_OUTOFMEM, "NO USER?");
 			break;
 		case PAM_PROMPT_ECHO_OFF:
 			/* Assume we're asking for password */


More information about the dovecot mailing list