Hi Claudio,

I ran a test with an NFS mount using nfsvers=4.1 and CentOS 7 as the NFS client (our NetApp already has NFS 4.1 enabled), but the problem is still present.
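
In case it is useful for repeating the test, here is a minimal Python sketch for confirming which NFS version a client actually negotiated, rather than trusting the mount command line. It assumes the usual /proc/mounts layout (device, mount point, filesystem type, options) and that the kernel reports the version as "vers=". The function name nfs_mounts and the sample line are hypothetical, made up for illustration.

```python
def nfs_mounts(mounts_text):
    """Return {mount_point: {option: value}} for NFS entries.

    Expects text in /proc/mounts format: device, mount point,
    filesystem type and a comma-separated option list per line.
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Keep only NFS mounts; the type is "nfs" for v3 and usually "nfs4" for v4.x.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            key, _, value = item.partition("=")
            options[key] = value  # flags without "=" get an empty value
        result[fields[1]] = options
    return result


# Hypothetical sample line, not taken from a real server.
SAMPLE = (
    "netapp:/vol/vmail /vmail2 nfs4 "
    "rw,noatime,vers=4.1,rsize=65536,wsize=65536,hard,proto=tcp 0 0\n"
    "/dev/sda1 / ext4 rw,relatime 0 0\n"
)

for mount_point, options in nfs_mounts(SAMPLE).items():
    print(mount_point, "vers=" + options.get("vers", "unknown"))
```

On a real client the same function can be fed the contents of /proc/mounts instead of the sample string.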

Moreover, I would rather not switch to NFS 4 because it is stateful. NFS v3 is stateless, so for example during maintenance or an upgrade of the NFS server the clients have no problems: a reboot of the NetApp is transparent.

I don't think the problem is related to NetApp: I see the same error in a customer setup based on Google Cloud (Ubuntu as the Dovecot and NFS client, and a Google Cloud NFS volume as storage).

In my case I'm using LDA for local delivery of emails, so I hope that switching to LMTP will resolve the issue, but I'm not sure, since other users said that they are already using LMTP.

I don't know why it works on old Linux distros while recent distros have the issue ...
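
To compare kernels or delivery methods with numbers instead of impressions, a small script can count the "Broken file ... dovecot-uidlist" errors per mailbox. This is only a sketch: it assumes each error sits on a single log line in the shape quoted later in this thread ("Error: Broken file <path>/dovecot-uidlist line N"). The function name and the sample log lines, including their syslog prefix and paths, are hypothetical.

```python
import re
from collections import Counter

# Assumed message shape, based on the error quoted in this thread:
#   Error: Broken file /path/Maildir/dovecot-uidlist line 88: Invalid data:
BROKEN_RE = re.compile(r"Error: Broken file (\S+/dovecot-uidlist) line (\d+)")


def count_broken_uidlists(log_lines):
    """Count 'Broken file' errors per dovecot-uidlist path."""
    counts = Counter()
    for line in log_lines:
        match = BROKEN_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical log lines for illustration only.
SAMPLE_LOG = [
    "Jan 13 12:00:01 node1 dovecot: imap(a): Error: Broken file "
    "/vmail2/aa/user_a/Maildir/dovecot-uidlist line 88: Invalid data: ",
    "Jan 13 12:05:09 node1 dovecot: imap(b): Error: Broken file "
    "/vmail2/bb/user_b/Maildir/dovecot-uidlist line 12: Invalid data: ",
    "Jan 13 12:07:30 node1 dovecot: imap(a): Error: Broken file "
    "/vmail2/aa/user_a/Maildir/dovecot-uidlist line 90: Invalid data: ",
    "Jan 13 12:08:00 node1 dovecot: imap-login: Login: user=<a>",
]

for path, hits in count_broken_uidlists(SAMPLE_LOG).most_common():
    print(hits, path)
```

Running it on one day of logs per node, once on the old kernel and once on the new one, would give a like-for-like count to compare.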

On 19/01/21 20:21, Claudio Cuqui wrote:
It's a long shot... but I would try using nfsvers=4.1 in the NFS mount options (instead of nfsvers=3), if your NetApp supports it, together with a newer kernel: 4.14-stable or 4.19-stable (if possible). The reason is a nasty bug found in the Linux NFS client in older kernels...

https://about.gitlab.com/blog/2018/11/14/how-we-spent-two-weeks-hunting-an-nfs-bug/

Hope this helps...

Regards,

Claudio


On Wed, 13 Jan 2021 at 12:18, Maciej Milaszewski <maciej.milaszewski@iq.pl> wrote:
Hi
I have been trying to resolve my problem with Dovecot for a few days and I
have no idea what is wrong....

My environment is: Dovecot director + 5 Dovecot guests

dovecot-2.2.36.4 from source
Linux 3.16.0-11-amd64
storage via nfs (NetApp)

Everything works fine, but when I update the OS from Debian 8 (kernel 3.16.x) to
Debian 9 (kernel 4.9.x), I sometimes get random errors in the logs:
Broken dovecot-uidlist

Example:
Error: Broken file
/vmail2/po/pollygraf.xxx_pg_pollygraf/Maildir/dovecot-uidlist line 88:
Invalid data:

(for random users; sometimes 10 errors per day per node, sometimes more)

The file itself looks OK.

But if I change the kernel back to 3.16.x, the "Broken file
dovecot-uidlist" problem does not occur.
If I switch to 4.9 or 5.x, the problem occurs.

I have storage via NFS with these options:
rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
I tested both with and without "nocto"; nothing changes ......

NFS-related Dovecot settings on the nodes:
mmap_disable = yes
mail_fsync = always

I believe the configuration is correct, and I wonder why the problem occurs
with other kernels:
3.x.x - ok
4.x - not ok

I checked, and the users who had the problem did not connect to another node at
that time.

I have no idea why the problem exists on kernel 4.x but not on 3.x.


-- 
Alessio Cecchi
Postmaster @ http://www.qboxmail.it
https://www.linkedin.com/in/alessice