[Dovecot] Can't connect to auth server at default: Connection refused

Chris Wakelin c.d.wakelin at reading.ac.uk
Fri Sep 30 22:55:44 EEST 2005


Thanks for the hints.

I've now moved userdb from "passwd" to "passwd-file" pointing to a 
munged file created overnight from our NIS password file. (We don't need 
the actual passwords, of course, as we're authenticating via pam and 
pam_ldap to Active Directory.)
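
For illustration only, here is a minimal sketch of what such an overnight munging step could look like. It is not the actual script used here. It assumes the NIS map yields standard seven-field passwd(5) lines, and the function name and the "x" placeholder are hypothetical.

```python
def strip_passwords(lines, placeholder="x"):
    """Return passwd(5)-style entries with the password field blanked.

    Assumes seven colon-separated fields per line:
    name:password:uid:gid:gecos:home:shell
    """
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, comments and NIS compat entries ("+..." / "-...").
        if not line or line.startswith(("#", "+", "-")):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            # Malformed entry: leave it out rather than guess at its layout.
            continue
        fields[1] = placeholder  # the hash is not needed, PAM does the auth
        cleaned.append(":".join(fields))
    return cleaned


# Small demonstration with made-up data.
sample = [
    "alice:abcDEF123456:1001:100:Alice Example:/home/alice:/bin/sh\n",
    "+::::::\n",
]
for entry in strip_passwords(sample):
    print(entry)  # prints "alice:x:1001:100:Alice Example:/home/alice:/bin/sh"
```

In practice the input would be something like the output of "ypcat passwd", with the result written to whatever file the passwd-file userdb points at.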

Dovecot now uses no NIS, and apparently hashes/caches the passwd-file in 
memory making it just as quick as using userdb=static, but with the 
advantage that processes run as the user (and I don't need to chgrp 
everything). I can even tweak the mail environment per user, so I have 
switched some users to Maildir, for example!

I managed 2600+ logins per minute benchmarking using "rabid" with empty 
mailboxes and 19 test accounts.

I'm still occasionally seeing "Login process has too old requests" 
warnings, but they're not causing a problem.

I've also turned on "mbox_very_dirty_syncs" which seems to have reduced 
the load further (half the CPU, 1/3 again of the characters read, 2/3 of 
the disk blocks accessed).

The biggest test starts on Monday when term officially starts and 
everybody (staff and students) is at work, but it's looking really good!

Best Wishes,
Chris

Timo Sirainen wrote:
> On Tue, 2005-09-27 at 17:30 +0100, Chris Wakelin wrote:
> 
>>We've been getting more authentication problems today. This lunchtime I
>>put in a version of 1.0-stable, including Timo's fix below, which may
>>have helped, but still we've had, e.g:
>>
>>dovecot: Sep 27 16:04:27 Warning: auth(default): Login process has too
>>old (126s) requests, killing it.
>>dovecot: Sep 27 16:04:27 Error: auth(default): file mech.c: line 117
>>(auth_request_destroy): assertion failed: (request->refcount > 0)
>>dovecot: Sep 27 16:04:27 Error: child 21726 (auth) killed with signal 6
>>dovecot: Sep 27 16:04:27 Error: imap-login: Can't connect to auth server
>>at default21726: Connection refused
> 
> 
> dovecot-auth crashes and gets restarted, that's why these connection
> errors happen.
> 
> The crashing most likely happens because passdb (or maybe userdb?)
> lookup hangs long enough to cause Dovecot to time out the results, and the
> code in 1.0-stable doesn't handle that well.
> 
> I looked into these crashes last weekend but looks like they don't exist
> in 1.0-alphas anymore so I didn't do anything about them to 1.0-stable
> either.. Anyway, the overly long lookup times are the real problem
> you're having.
> 


-- 
--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+-
Christopher Wakelin,                           c.d.wakelin at reading.ac.uk
IT Services Centre, The University of Reading,  Tel: +44 (0)118 378 8439
Whiteknights, Reading, RG6 2AF, UK              Fax: +44 (0)118 975 3094

