On Mon, 2009-02-23 at 16:27 +0100, Ulrich Zehl wrote:
On the client side, it's Linux 2.6.23.16. All attribute-cache-related values are at their defaults, as far as I can tell. The entry in fstab reads:
nfs-server:/srv/storage /srv/storage nfs rw,nfsvers=3,hard,intr,nosuid,noexec,nodev,noatime 0 0
Setting actimeo=0 probably fixes this, but also probably increases the load a lot. actimeo=1 might work OK and reduce how often these problems happen, but not eliminate them completely.
Dovecot's NFS settings should avoid this problem though. You could see if upgrading your kernel helps. Some kernels have somewhat broken NFS code.
I did
# mount -o remount,actimeo=0 /srv/storage
To both servers?
So allina modified dovecot-uidlist and soon afterwards laura probably was using a cached dovecot-uidlist and corrupted it.
Since the corrupted files are available for a little while (in the example, it was about 15 minutes), would it help if I repeatedly checked all dovecot-uidlists and saved those found to be corrupted to a special directory, so that we can see what the corruption actually is?
I suppose looking at a couple of those could verify if it's really just NFS caching related corruption or something else.
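
A rough sketch of such a checker in Python could look like the following. It is only a heuristic and rests on assumptions not confirmed in this thread: that the first line of a dovecot-uidlist is a header, that every later line starts with a numeric UID, and that UIDs must be strictly increasing. The function names and the quarantine directory are made up for illustration. Note also that a checker running on an NFS client may itself see cached data, so running it on the NFS server is probably more reliable.

```python
import os
import shutil
import time
from pathlib import Path


def uidlist_problems(path):
    """Return a list of problems found in one uidlist file (empty if none)."""
    lines = Path(path).read_bytes().splitlines()
    if not lines:
        return ["file is empty"]
    problems = []
    prev_uid = 0
    # Assumption: line 1 is a header, every later line begins with a UID.
    for lineno, line in enumerate(lines[1:], start=2):
        field = line.split(b" ", 1)[0]
        if not field.isdigit():
            problems.append(f"line {lineno}: no numeric UID")
            continue
        uid = int(field)
        if uid <= prev_uid:
            problems.append(f"line {lineno}: UID {uid} not above {prev_uid}")
        prev_uid = max(prev_uid, uid)
    return problems


def save_corrupted(mail_root, quarantine_dir):
    """Copy every suspicious dovecot-uidlist under mail_root to quarantine_dir."""
    quarantine = Path(quarantine_dir)
    quarantine.mkdir(parents=True, exist_ok=True)
    saved = []
    for path in Path(mail_root).rglob("dovecot-uidlist"):
        try:
            problems = uidlist_problems(path)
        except OSError:
            continue  # file vanished or is unreadable; try again next pass
        if problems:
            # Timestamp plus flattened path keeps repeated snapshots apart.
            stamp = time.strftime("%Y%m%dT%H%M%S")
            target = quarantine / f"{stamp}{str(path).replace(os.sep, '_')}"
            shutil.copy2(path, target)
            saved.append((path, target, problems))
    return saved


def watch(mail_root, quarantine_dir, interval=60):
    """Run save_corrupted forever, sleeping `interval` seconds between passes."""
    while True:
        for path, target, problems in save_corrupted(mail_root, quarantine_dir):
            print(f"{path} -> {target}: {'; '.join(problems)}")
        time.sleep(interval)
```

Something like watch("/srv/storage", "/var/tmp/uidlist-corrupt") would then collect copies for later inspection. The quarantine path is just an example and should live outside the mail storage.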