On Fri, 2007-02-09 at 10:38 +0000, David Lee wrote:
On the whole we are pleased with our trials of dovecot to replace UW-IMAP.
But (ah!) we have hit one particular problem, in which we think dovecot could probably benefit from a resilience improvement.
We're running dovecot on Fedora Core 5 (FC5), with passwd map details supplied by NIS. We have found that "nscd" sometimes thinks that a username is invalid, even though it is valid. So when "deliver" attempts a delivery to the INBOX of that username, it receives "user unknown" from the name service, and then does a 5xx permanent failure of valid email.
From the user perspective "The System" has incorrectly rejected perfectly valid incoming email. It is rare, but it does occasionally happen on large, busy systems.
Clearly it is fundamentally an "nscd" bug. But that bug is nevertheless out there, in the wild, on such systems, potentially affecting dovecot's delivery of valid user email.
We have had a source code hack since October (in "deliver.c", simply replacing a "return ret" occurrence with "return EX_TEMPFAIL") and it has worked nicely (ported forward from rc8 towards rc22). Mail re-queues and a later delivery attempt then succeeds.
So it would be both helpful, and good for resilience against such real OS/nscd bugs (and similar), if there were a configuration option in dovecot to allow a site admin to tell deliver to use a temporary, 4xx, failure instead (if the circumstances were appropriate for the site).
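For illustration only, the kind of option requested above could be sketched roughly as follows. This is not dovecot's actual code: the `lookup_result` enum, the `deliver_settings` struct, the `unknown_user_tempfail` setting and the `delivery_exit_code()` helper are all hypothetical names invented for this sketch. Only the exit codes from `<sysexits.h>` are real; MTAs that follow the sendmail convention usually treat EX_TEMPFAIL as a deferral (4xx) and EX_NOUSER as a permanent failure (5xx).

```c
#include <stdbool.h>
#include <sysexits.h>   /* EX_OK, EX_NOUSER, EX_TEMPFAIL */

/* Hypothetical outcome of the user lookup; not dovecot's real types. */
enum lookup_result {
    LOOKUP_OK,
    LOOKUP_USER_UNKNOWN,
    LOOKUP_INTERNAL_ERROR
};

/* Hypothetical site setting: report "user unknown" as a temporary failure. */
struct deliver_settings {
    bool unknown_user_tempfail;
};

/* Map a lookup outcome to the exit code handed back to the MTA. */
static int delivery_exit_code(enum lookup_result res,
                              const struct deliver_settings *set)
{
    switch (res) {
    case LOOKUP_OK:
        return EX_OK;
    case LOOKUP_USER_UNKNOWN:
        /* With the option set, the MTA re-queues instead of bouncing. */
        return set->unknown_user_tempfail ? EX_TEMPFAIL : EX_NOUSER;
    case LOOKUP_INTERNAL_ERROR:
    default:
        /* Internal errors are always worth a retry. */
        return EX_TEMPFAIL;
    }
}
```

With the hypothetical option off, behaviour stays as it is today (a permanent failure for unknown users). With it on, a spurious "user unknown" from the name service only delays the mail until the next queue run.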
Having been hit by numerous nscd problems myself, across many applications, I'll just throw this in:
- avoid nscd whenever possible
- if nscd is broken, complain to the vendor or, better,
- fix bugs in the right place
A few excerpts from a discussion about nscd on the postfix ML some time ago, about exactly the same problem (postfix not finding recipients because nscd delivered bad information):
"nscd is a crappy piece of software that is unstable and frequently corrupts information."
"Most of the work is identifying the right problem. Much effort goes to waste solving the wrong one."
Just my 2¢ ...
-- Udo Rader
bestsolution.at EDV Systemhaus GmbH http://www.bestsolution.at