[Dovecot] critical X-UID reordering problem after upgrade from 1.1 to 1.2
We upgraded our IMAP server hardware and went from dovecot 1.1.3 to 1.2.3 in the process, and have been having a critical problem since. I believe I've tracked it down to X-UID reordering that dovecot seems to have done to the mbox format (I know, I know...) inboxes. The problem has manifested itself in several ways with different mail clients, but the general problem is that the mail clients have gotten confused about which message is which. For example, Thunderbird will display the wrong information about a message in the header listing but then show the correct data in the message display window when you actually open the message. As you can imagine, this is causing all kinds of user problems.
Here is a very simplified explanation of what I'm seeing. Imagine that the inbox had 5 messages with X-UID values 1, 3, 5, 6, and 9 before the upgrade. After the upgrade the X-UIDs were reordered so these same 5 messages now might have X-UID values of 1, 2, 3, 4, and 5. Thunderbird then shows the summary information like this:
Subject/Sender/Date for message 1
Subject/Sender/Date for message 2
Subject/Sender/Date for message 2
Subject/Sender/Date for message 4
Subject/Sender/Date for message 3
Now, if you actually open the 5th message it shows the right message body but the summary information is for the message that *used* to have X-UID 5 (which is now the 3rd message). So, the thunderbird cache is out of sync and I can fix it by removing the .msf file for that mailbox but, as you can imagine, the prospect of getting all the users to remove their .msf/.pst/?? cache files is daunting, to say the least.
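The mismatch in the listing above can be reproduced with a toy model of a client cache keyed by UID. This is only a sketch: the function name is made up for illustration, and it assumes the client reuses a cached summary whenever it has one for a UID and fetches headers fresh otherwise. It is not a description of how Thunderbird actually stores its .msf data.

```python
# Hypothetical model: a client caches header summaries keyed by IMAP UID.
def shown_summaries(old_uids, new_uids, messages):
    """Return the summary the client would display for each message.

    old_uids: UIDs the messages had when the cache was filled
    new_uids: UIDs the same messages have after the server renumbered them
    messages: labels for the messages, in mailbox order
    """
    # Cache built before the renumbering: old UID -> summary.
    cache = dict(zip(old_uids, messages))
    # After renumbering, a cache hit returns whatever was stored under
    # that UID earlier. A miss is assumed to trigger a fresh fetch.
    return [cache.get(uid, actual) for uid, actual in zip(new_uids, messages)]


old_uids = [1, 3, 5, 6, 9]
new_uids = [1, 2, 3, 4, 5]
messages = ["message 1", "message 2", "message 3", "message 4", "message 5"]

for row in shown_summaries(old_uids, new_uids, messages):
    print("Subject/Sender/Date for", row)
```

Under that assumption, rows whose new UID was never cached (2 and 4) look right, while rows whose new UID collides with an old one (3 and 5) show another message's summary.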
I've seen discussions about similar things happening with mbox format inboxes but they all seem to be in the context of some ancient 0.xx version of dovecot with known mbox processing bugs. But, I went from a reasonably current 1.1.3 version to 1.2.3. Right after seeing the problem, I upgraded to the just released 1.2.4 but either that didn't address the problem or, if it did, the reordering damage was already done by version 1.2.3.
So, my 2 questions are:
Anyone know why this would have happened and whether it is a dovecot problem or something I botched? I folded all the local config changes I had made to the 1.1.3 conf file into the stock 1.2.3 conf file so they should have been identical modulo changes in the distro. And, our local config changes are really minimal.
Is there anything I can do now other than getting all the users to remove their .msf/.pst/etc files? It sure would be grand if there was something I could do on the server side that would magically cause all the client-side caches to be invalidated. I'm hesitant to strip the inboxes of all the X-IMAPbase and X-UID lines because I'm not sure what the ramifications of that might be.
Here are some other random thoughts in case they might be related:
- The old server was running 32bit RHEL4 and the new one 64bit RHEL5
- I'm using the pre-built dovecot rpms from atrpms.net
- The dovecot index cache is kept in /var/cache/dovecot/indexes and I did not copy that over from the old server to the new assuming it would just be rebuilt.
Any and all help is greatly appreciated!
--Rob
=============== dovecot -n output follows ==================
# 1.2.4: /etc/dovecot.conf
# OS: Linux 2.6.18-128.4.1.el5 x86_64 Red Hat Enterprise Linux Server release 5.3 (Tikanga)
protocols: imaps pop3s
ssl: required
ssl_cert_file: /etc/mail.pem
ssl_key_file: /etc/mail.pem
login_dir: /var/run/dovecot/login
login_executable(default): /usr/libexec/dovecot/imap-login
login_executable(imap): /usr/libexec/dovecot/imap-login
login_executable(pop3): /usr/libexec/dovecot/pop3-login
login_max_processes_count: 512
max_mail_processes: 2048
mail_max_userip_connections(default): 15
mail_max_userip_connections(imap): 15
mail_max_userip_connections(pop3): 10
first_valid_uid: 200
mail_location: mbox:~/mail/:INBOX=/var/spool/mail/%u:INDEX=/var/cache/dovecot/indexes/%u
mail_executable(default): /usr/libexec/dovecot/imap
mail_executable(imap): /usr/libexec/dovecot/imap
mail_executable(pop3): /usr/libexec/dovecot/pop3
mail_plugin_dir(default): /usr/lib64/dovecot/imap
mail_plugin_dir(imap): /usr/lib64/dovecot/imap
mail_plugin_dir(pop3): /usr/lib64/dovecot/pop3
pop3_uidl_format(default): %08Xu%08Xv
pop3_uidl_format(imap): %08Xu%08Xv
pop3_uidl_format(pop3): %v.%u
lda:
  postmaster_address: postmaster@example.com
auth default:
  passdb:
    driver: pam
  userdb:
    driver: passwd
On Aug 21, 2009, at 8:36 PM, robh@cs.indiana.edu wrote:
Now, if you actually open the 5th message it shows the right message
body but the summary information is for the message that *used* to have X-UID 5 (which is now the 3rd message). So, the thunderbird cache is out of
sync and I can fix it by removing the .msf file for that mailbox but, as
you can imagine, the prospect of getting all the users to remove their .msf/.pst/?? cache files is daunting, to say the least.
One easy solution would be to change UIDVALIDITY (the large number in
X-IMAP: or X-IMAPbase: header) of each mailbox. Then the client will
redownload all mails.
I can't really think of why UIDs would have changed though. I think
v1.1's and v1.2's mbox handling code is pretty much the same.
Timo Sirainen wrote:
One easy solution would be to change UIDVALIDITY (the large number in X-IMAP: or X-IMAPbase: header) of each mailbox. Then the client will redownload all mails.
I just tried that on one of the inboxes and it seemed to do the trick!
I just inc'ed the number that was there by 1 but does it really matter
how I change it as long as it changes? I'm just thinking about how to
script this for all the inboxes so can I just change them all to the
same number for the sake of expediency or do they need to be unique, or
higher than the number there now, or ???
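For scripting the change, one possible approach is to rewrite only the first number on the X-IMAP: or X-IMAPbase: line and leave the rest untouched. The sketch below works on a single header line; the function name and the default step of 1 are illustrative choices, not anything dovecot provides. RFC 3501 asks for a larger UIDVALIDITY whenever earlier UIDs fail to persist, so incrementing each mailbox's own value is the conservative option compared with setting them all to one fixed number.

```python
import re

# Matches "X-IMAP:" or "X-IMAPbase:", then UIDVALIDITY, then the
# last-used UID and anything after it (keywords).
_HEADER = re.compile(r"^(X-IMAP(?:base)?: +)(\d+)( +\d+.*)$")


def bump_uidvalidity(line, step=1):
    """Return the header line with UIDVALIDITY increased by `step`.

    Lines that are not X-IMAP/X-IMAPbase headers are returned unchanged.
    """
    body = line.rstrip("\r\n")
    ending = line[len(body):]  # preserve the original line ending
    match = _HEADER.match(body)
    if match is None:
        return line
    prefix, uidvalidity, rest = match.groups()
    return f"{prefix}{int(uidvalidity) + step}{rest}{ending}"


print(bump_uidvalidity("X-IMAPbase: 1076423160 0000059291 Junk $Label1"))
```

A real script would also need to make sure nothing else (the MTA, dovecot itself) writes to the mbox while it is being edited, which this sketch does not attempt.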
I can't really think of why UIDs would have changed though. I think v1.1's and v1.2's mbox handling code is pretty much the same.
I haven't had much time to study exactly how all the inboxes were changed but I did diff a couple from right before and right after the upgrade and there were *lots* of diffs in just the X-UID values. At a quick look, it sure seemed like a reordering had been done to remove the holes in the numbering in some kind of compaction-like operation.
Thanks!!!
--Rob
Timo Sirainen wrote:
One easy solution would be to change UIDVALIDITY (the large number in X-IMAP: or X-IMAPbase: header) of each mailbox. Then the client will redownload all mails.
This is what I ended up doing (just inc'ing the current UIDVALIDITY by 1) and that seems to have worked for our IMAP users. However, this is more problematic for the pop users since it looks like that causes every message in the inbox to appear to be new (the new %v yields all new UIDLs so all the messages look like ones the client hasn't seen). I suppose it serves them right for using pop... ;-)
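The POP3 side effect follows from the pop3_uidl_format of %v.%u shown in the dovecot -n output, where %v is the mailbox UIDVALIDITY and %u is the message UID. A small sketch (the helper name is made up, and the UIDs are arbitrary) shows why a changed UIDVALIDITY leaves no UIDL in common with what the client saw before:

```python
def pop3_uidl(uidvalidity, uid):
    """Build a UIDL the way the format string "%v.%u" describes it."""
    return f"{uidvalidity}.{uid}"


uids = [59291, 59400, 59665]

# UIDLs a POP3 client would have remembered before the change...
before = {pop3_uidl(1076423160, uid) for uid in uids}
# ...and the ones it is offered after UIDVALIDITY is incremented by 1.
after = {pop3_uidl(1076423161, uid) for uid in uids}

# No overlap, so every message looks unseen to the client.
print(before & after)
```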
I can't really think of why UIDs would have changed though. I think v1.1's and v1.2's mbox handling code is pretty much the same.
I think I may have identified the problem. I have a test inbox that is very repeatably munged by dovecot 1.2.4 the first time it is accessed. The thing I noticed about it is that it has:
X-IMAPbase: 1076423160 0000059291 Junk $Label1 $Label3 $Label5 NonJunk $Forwarded $MDNSent $Label2 $Label4
However, the last message (with the largest X-UID) is:
X-UID: 59665
So, this UID 59665 is larger than last used UID on the X-IMAPbase line! I have to assume this is a bad thing, right? As a test, I changed the X-IMAPbase: line and set the last used UID properly and that was all it took to prevent dovecot from doing the reordering.
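To find other inboxes in the same state before dovecot touches them, a check along these lines might help. It is a naive sketch: the function name is hypothetical, it takes the second number of the first X-IMAP/X-IMAPbase line as the last used UID, and it scans every line starting with "X-UID:", so a message body that happens to contain such a line would be counted too.

```python
import re

_IMAPBASE = re.compile(r"^X-IMAP(?:base)?: +(\d+) +(\d+)")
_XUID = re.compile(r"^X-UID: +(\d+)")


def check_last_uid(lines):
    """Return (last_used_uid, max_x_uid) for an iterable of mbox lines.

    last_used_uid is None if no X-IMAP/X-IMAPbase header was found.
    """
    last_used = None
    max_uid = 0
    for line in lines:
        match = _IMAPBASE.match(line)
        if match and last_used is None:
            last_used = int(match.group(2))  # second number: last used UID
            continue
        match = _XUID.match(line)
        if match:
            max_uid = max(max_uid, int(match.group(1)))
    return last_used, max_uid


sample = [
    "X-IMAPbase: 1076423160 0000059291 Junk $Label1 $Label3\n",
    "X-UID: 59291\n",
    "X-UID: 59665\n",
]
last_used, max_uid = check_last_uid(sample)
if last_used is not None and max_uid > last_used:
    print("inconsistent: X-UID", max_uid, "exceeds last used UID", last_used)
```

For a real mailbox the lines could come from something like open(path, errors="replace"), with the path being whatever INBOX location applies.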
But, how did this happen? I know it was like this on several inboxes (maybe even most of them) and we had been running dovecot 1.1.3 previously for quite a while. So, was this a bug in 1.1.3? And, perhaps more importantly for others who may hit this same problem, is there some way that 1.2.x can recognize this condition and compensate for it without doing the really nasty reordering?
Thanks!
--Rob
On Aug 22, 2009, at 9:18 PM, Rob Henderson wrote:
X-IMAPbase: 1076423160 0000059291 Junk $Label1 $Label3 $Label5 NonJunk $Forwarded $MDNSent $Label2 $Label4
However, the last message (with the largest X-UID) is:
X-UID: 59665
So, this UID 59665 is larger than last used UID on the X-IMAPbase
line! I have to assume this is a bad thing, right?
Right.
But, how did this happen? I know it was like this on several inboxes (maybe even most of them) and we had been running dovecot 1.1.3 previously for quite a while. So, was this a bug in 1.1.3?
I think that's pretty likely. I'd have thought you had more problems
with mbox in 1.1.3, since looking at the NEWS file I see a lot of mbox
fixes since then..
And, perhaps more importantly for others who may hit this same problem, is there some way that 1.2.x can recognize this condition and compensate for it without doing the really nasty reordering?
Not really. I think it's anyway only a problem for people who were
running old 1.1 versions for a long time :) You probably would have
had the same problem when upgrading to a newer v1.1.