On Wed, 2008-06-25 at 12:00 -0400, David Halik wrote:
I just reproduced the environment and the index corrupted immediately across NFS because of the endian issue.
Jun 25 11:53:34 host IMAP(user): : Rebuilding index file /dovecot-index/index/user/.INBOX/dovecot.index: CPU architecture changed
Jun 25 11:53:35 host IMAP(user): : Corrupted index cache file /dovecot-index/index/user/.INBOX/dovecot.index.cache: field header points outside file
I'll check later if I can reproduce this.
This was starting from a clean index, first opening pine on the NFS Solaris 9 sparc machine, and then at the same time opening pine on my Fedora 9 i386 workstation.
Why does it matter where you run Pine? Does it directly execute Dovecot on the local machine instead of connecting via TCP?
I'm going to try the idea of splitting the indexes into two different architectures, but I'm worried that this will not be feasible when we try to scale to our 80,000 users.
I'd suggest not running Dovecot on different architectures. For example, if you're on a non-x86 machine, make it connect via TCP to an x86 Dovecot server.
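To illustrate why mixed byte orders can produce a "points outside file" error, here is a minimal Python sketch. It does not use Dovecot's real index or cache layout; the 4-byte offset field and the helper names write_offset and read_offset are assumptions made up for the example. It only shows what happens when an integer written on a big-endian machine (such as sparc) is read back on a little-endian one (such as i386).

```python
import struct

# Hypothetical 4-byte "header offset" field; NOT Dovecot's actual file format.
def write_offset(offset, byteorder):
    """Serialize an unsigned 32-bit offset in the given byte order."""
    fmt = ">I" if byteorder == "big" else "<I"
    return struct.pack(fmt, offset)

def read_offset(raw, byteorder):
    """Deserialize an unsigned 32-bit offset in the given byte order."""
    fmt = ">I" if byteorder == "big" else "<I"
    return struct.unpack(fmt, raw)[0]

# A big-endian host stores the offset 64.
raw = write_offset(64, "big")

# The same host reads it back correctly.
same_arch = read_offset(raw, "big")

# A little-endian host sharing the file over NFS reads the bytes reversed.
misread = read_offset(raw, "little")

print(same_arch)  # 64
print(misread)    # 1073741824, far beyond the end of a small file
```

A reader that trusts the misread value would conclude that the header points about 1 GB into a file that is only a few kilobytes long, which is the kind of inconsistency the log message reports.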
By the way, I don't think this is related to the corruption, but we also have tons of these in the logs:
Jun 25 11:52:32 host IMAP(user): : Created dotlock file's timestamp is different than current time (1214409234 vs 1214409152): /dovecot-index/control/user/.INBOX/dovecot-uidlist
Jun 25 11:52:32 host IMAP(user): : Created dotlock file's timestamp is different than current time (1214409235 vs 1214409152): /dovecot-index/control/user/.INBOX/dovecot-uidlist
Dovecot really needs the clocks to be synchronized between the NFS clients and the server. If the clock difference is more than 1 second, you'll get problems.
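A rough way to check the skew is sketched below in Python. It is not part of Dovecot; the function names skew_from_log and measure_skew are made up for the example. The first function extracts the two timestamps from a dotlock warning like the ones quoted above. The second creates a temporary file in a directory and compares its modification time with the local clock, on the assumption that for an NFS mount the file's timestamp comes from the server.

```python
import os
import re
import tempfile
import time

# Matches the "(<file timestamp> vs <current time>)" part of the warning.
SKEW_RE = re.compile(r"\((\d+) vs (\d+)\)")

def skew_from_log(line):
    """Return file timestamp minus current time in seconds, or None."""
    match = SKEW_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)) - int(match.group(2))

def measure_skew(directory):
    """Estimate (file mtime - local clock) for a new file in directory.

    Assumes the mtime of a freshly created file on an NFS mount
    reflects the server's clock.
    """
    before = time.time()
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.close(fd)
        mtime = os.stat(path).st_mtime
    finally:
        os.unlink(path)
    after = time.time()
    # Compare against the midpoint of the local timestamps.
    return mtime - (before + after) / 2

line = ("Created dotlock file's timestamp is different than current time "
        "(1214409234 vs 1214409152): /dovecot-index/control/user/.INBOX/dovecot-uidlist")
print(skew_from_log(line))  # 82
```

For the two warnings quoted above this gives 82 and 83 seconds, well over the 1 second limit. Running measure_skew against the NFS-mounted index directory on each client should show which machine's clock is off.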