[Dovecot] Outlook causes BIG dovecot index problem (beta 8, outlook 2003)
Hey all,
Well, I typed up everything below, then managed to fix the problem before sending this to the list. I'll send it anyway, in case anybody else is affected by it (and because it does indicate a definite bug in Dovecot's experimental Kerberos support). The fix is to recompile without Kerberos enabled and to take gssapi out of the auth methods in the config (the latter may or may not be enough of a fix on its own; I didn't try it without recompiling).
I'm guessing that Outlook attempts to use Kerberos authentication by
default or something and silently fails and uses other auth methods, and
that any attempt to use Kerberos is in fact what hoses the server.
However I have not done any further testing against beta8 with a
Kerberos-enabled client such as KMail, so I leave this as an exercise
for the next adventurer.
I'm also uncertain as to why this problem does not manifest on our main
servers, which also have Kerberos enabled (and Outlook 2003 clients).
Or /had/ anyways, as I'm disabling Kerberos support as we speak as a
preventative measure.
Here's the problem description:
I've been running dovecot happily for many months on our production mail servers hosting about 600 accounts, with no problems. People use every mail client under the sun - Apple Mail, Outlook 97|2000|XP|2003, Outlook Express, Mutt, Thunderbird, Mozilla Mail, Eudora, and other esoteric things with no problem.
So today I installed a new mail server on a client site dedicated to their domain. The MTA is exim, running with almost the same configuration we use on our main servers. The dovecot.conf is 100% identical. No mail was moved by hand to the new mail server: to migrate, we set up the old and new accounts in Thunderbird and dragged the mail over (just to rule out that possibility, we have also reproduced the issue using brand new empty accounts).
Everything is golden using Thunderbird: email is sent, and shows up in the inbox immediately.
Enter Outlook 2003 (SP1): It shows all folders and mail fine. It sends mail fine. Nothing shows up in the inbox. At this moment, Thunderbird also stops receiving new mail in the inbox. I check the server, and there's a mail sitting in the /new directory in the maildir. Not only that, but suddenly NONE of the accounts on the box show any new mail, exhibiting the same signs.
I tried a variety of things, including 'sync', removing all the dovecot* index files in the maildir, and restarting dovecot (being sure to killall imap before bringing it back up). Sometimes, these steps would magically cause everything to work again, until I did a mail poll in Outlook which causes the problem to re-occur. Other times, these steps didn't seem to do anything. A couple times, for no apparent reason, without doing anything, mail would show up. Too hopeful, we sat around for 30 minutes sending more mail to an account and then just idling...nothing showed up.
The hardware is good: we've compiled a full glibc and gcc just to ensure that there are no CPU/memory issues, and this box has a solid history behind it in its prior uses.
Until Outlook is used, everything works fine. We do have the outlook-idle workaround enabled, and tried both with and without delay-newmail, which didn't make any difference.
Cheers,
Casey Allen Shobe | cshobe@seattleserver.com | 206-381-2800 SeattleServer.com, Inc. | http://www.seattleserver.com
Hi,
dovecot 0.99 (from fc4), /etc/dovecot.conf: client_workarounds = outlook-pop3-no-nuls
Request: USER <Username>
Response: +OK \r \n \0x00
It seems that after receiving the zero byte, OE 6.00.2600.2100 doesn't want to send the password request (PASS \r \n), and of course authentication fails.
What is the reason for sending a zero byte in the POP protocol at all?
--Nick Zhokhov
On Sun, 2006-05-28 at 23:22 +0400, Nick Zhokhov wrote:
> Hi dovecot 0.99 (from fc4) /etc/dovecot.conf client_workarounds = outlook-pop3-no-nuls
This workaround just means Dovecot changes NULs to other characters if they occur in mails.
> request USER <Username> response +OK \r \n \0x00
This shouldn't happen. Are you sure it's Dovecot and not e.g. some proxy in the middle sending this?
If it's really Dovecot's problem, I'd suggest trying the 1.0 betas instead; 0.99.x bugs won't be fixed anymore.
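One way to check where the zero byte comes from is to look at the raw bytes that follow the USER command. The following is only a rough sketch, not anything from Dovecot itself: the helper names (nul_offsets, read_reply, probe_user_reply) are made up for illustration, and it assumes plain unencrypted POP3 on port 110.

```python
import socket


def nul_offsets(data: bytes) -> list[int]:
    """Return the positions of every NUL (0x00) byte in a raw server reply."""
    return [i for i, b in enumerate(data) if b == 0]


def read_reply(sock: socket.socket, timeout: float = 2.0) -> bytes:
    """Read whatever the server sends until it stays quiet for `timeout` seconds."""
    sock.settimeout(timeout)
    chunks = []
    try:
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection
                break
            chunks.append(chunk)
    except socket.timeout:
        pass  # no more data for now
    return b"".join(chunks)


def probe_user_reply(host: str, user: str, port: int = 110) -> bytes:
    """Send USER to a POP3 server and return the raw reply bytes (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=5.0) as sock:
        read_reply(sock)  # discard the greeting banner
        sock.sendall(b"USER " + user.encode("ascii") + b"\r\n")
        return read_reply(sock)
```

Running probe_user_reply once against localhost on the mail server itself and once from the client's network, then passing each result to nul_offsets, should show whether the extra byte is already there at the server or only appears somewhere in between.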
Casey Allen Shobe wrote:
Kerberos actually apparently didn't have anything to do with it, so I'm a bit mystified as to why taking Kerberos support out appeared to fix the problem for about 16 hours before the same thing started happening again.
The real problem is that there's a JFS bug in Linux kernels prior to 2.6.15 where a directory's timestamps don't get updated when a link is created within it, and apparently that makes Dovecot unable to know when to update its indexes.
Updating to 2.6.16 seems to have fixed things once and for all.
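For anyone who wants to check their own filesystem for this behaviour, here is a minimal sketch. It is not how Dovecot itself detects changes; it only tests the symptom described above, namely whether a directory's mtime moves when a hard link is created inside it. The function name and the tmp/new layout are my own choices for illustration.

```python
import os
import tempfile
import time


def link_updates_dir_mtime(parent: str, wait: float = 1.1) -> bool:
    """Return True if creating a hard link inside a directory under `parent`
    changes that directory's mtime (hypothetical diagnostic helper)."""
    with tempfile.TemporaryDirectory(dir=parent) as base:
        src_dir = os.path.join(base, "tmp")
        dst_dir = os.path.join(base, "new")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        # Create the "message" in tmp/ first, as a maildir-style delivery would.
        src = os.path.join(src_dir, "msg")
        with open(src, "w") as f:
            f.write("test\n")

        before = os.stat(dst_dir).st_mtime_ns
        time.sleep(wait)  # step past coarse (1 second) timestamp granularity

        # Hard-link the message into new/; this should bump new/'s mtime.
        os.link(src, os.path.join(dst_dir, "msg"))
        after = os.stat(dst_dir).st_mtime_ns
        return after != before
```

Point `parent` at a directory on the filesystem that holds the maildirs, e.g. link_updates_dir_mtime("/var/mail") (path is just an example). A False result would match the symptom described here; True means the directory timestamp is updated as expected.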
-- Casey Allen Shobe
participants (3): Casey Allen Shobe, Nick Zhokhov, Timo Sirainen