On 1/28/2021 11:14 AM, Maciej Milaszewski wrote:
Hi. As a test I created a new director running 2.3.13 and a node running 2.3.13. I mount the storage via NFS with the same options:
rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
I created a simple MTA and changed the MX to be the same as director1.
With kernel 5.8.0-0.bpo.2-amd64 the problem exists. With kernel 3.x it does not.
When the problem occurs, I check Maildir/dovecot-uidlist:
3 V1424432537 N16208 G92c4ee0d93aa1260c629000009c4ba82
16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@$
A block of zeros in a file opened for append is a classic NFSv3 race. Your mount options allow 120 seconds of attribute caching (actimeo=120). One of these attributes is the file size, which is also the end of file marker for append. If the file is changed by another client, the append mode writes will land on the wrong offset, possibly overwriting or punching holes.
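The mechanism described above can be imitated on a local filesystem. The following is only a rough sketch, not a reproduction of the NFS client: it assumes that a writer positions itself at a remembered file size that is no longer the real end of file. The file name, the sizes and the helper `append_with_stale_size` are made up for illustration.

```python
import os
import tempfile


def append_with_stale_size(path: str, cached_size: int, payload: bytes) -> None:
    """Write payload at a remembered end-of-file offset, not the real one."""
    with open(path, "r+b") as f:
        f.seek(cached_size)  # offset taken from a stale "attribute cache"
        f.write(payload)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dovecot-uidlist")

    # "Client A" sees a 200-byte file and remembers that size.
    with open(path, "wb") as f:
        f.write(b"x" * 200)
    cached_size = os.path.getsize(path)

    # "Client B" rewrites the file with shorter content.
    with open(path, "wb") as f:
        f.write(b"3 V1 N2 G0\n")
    real_size = os.path.getsize(path)

    # "Client A" appends using the size it remembered.
    append_with_stale_size(path, cached_size, b"2 :new-message\n")

    with open(path, "rb") as f:
        data = f.read()

# The gap between the real end of file and the stale offset reads back as zeros.
nul_count = data.count(b"\x00")
print(f"real size {real_size}, stale offset {cached_size}, NUL bytes {nul_count}")
```

On a local filesystem the gap is simply a hole that reads back as NUL bytes. What the NFS client actually does with cached attributes depends on the kernel version and mount options, so treat this purely as an illustration of why a wrong offset produces zeros.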
If you use the "noac" mount option, this will reduce the window of vulnerability, but it will not eliminate it. It's also possible there is some issue in attribute caching in the 5.8 kernel. Do you have other options between 3.16 and 5.8?
The best fix is to use a more robust NFS dialect such as v4.2.
Tom.
When the problem does not occur:
16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
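To tell a damaged dovecot-uidlist from a healthy one without opening it in an editor, one can scan it for runs of NUL bytes. This is a minimal sketch. The function name `find_nul_runs` and the sample byte strings are assumptions for illustration. In practice the data would be read from the file with open(path, "rb").

```python
from typing import List, Tuple


def find_nul_runs(data: bytes, min_len: int = 1) -> List[Tuple[int, int]]:
    """Return (offset, length) for every run of NUL bytes of at least min_len."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            start = i
            while i < n and data[i] == 0:
                i += 1
            if i - start >= min_len:
                runs.append((start, i - start))
        else:
            i += 1
    return runs


# One entry copied from the listing above, once clean and once followed by zeros.
healthy = b"16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520\n"
damaged = healthy + b"\x00" * 16 + b"\n"

healthy_runs = find_nul_runs(healthy)
damaged_runs = find_nul_runs(damaged)
print("healthy:", healthy_runs)
print("damaged:", damaged_runs)
```

Running something like this over the affected Maildirs from cron would at least show how often and on which nodes the zero blocks appear.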
On 23.01.2021 00:59, Alessio Cecchi wrote:
Hi,
after some tests I noticed a difference in the dovecot-uidlist line format depending on whether the message is read from the "old kernel" or the "new kernel":
81184 G1611334252.M95445P32580.mail05.myserver.com :1611334252.M95445P32580.mail05.myserver.com,S=38689,W=39290
81185 G1611336004.M47750P3921.mail01.myserver.com :1611336004.M47750P3921.mail01.myserver.com,S=15917,W=16212
81186 G1611338535.M542784P10852.mail03.myserver.com :1611338535.M542784P10852.mail03.myserver.com,S=12651,W=12855
81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964
81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096
81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107
81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230
81194 G1611356752.M573233P27082.mail01.myserver.com :1611356752.M573233P27082.mail01.myserver.com,S=1210,W=1238
81195 G1611356991.M905681P30704.mail01.myserver.com :1611356991.M905681P30704.mail01.myserver.com,S=1220,W=1249
81197 :1611357210.M42178P1962.mail01.myserver.com,S=1220,W=1250
81199 :1611357560.M26894P7157.mail01.myserver.com,S=1233,W=1264
With the "old kernel" (where everything works fine) the UID numbers are consecutive and each line has one extra field that starts with "G1611...".
With the "new kernel" (where the error occurs) the UID number always skips one and the "G1611..." field is missing.
Maciej, do you also see this behavior?
Why does Dovecot create a different uidlist line format with a different kernel?
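The two symptoms described above (skipped UIDs and a missing "G..." field) can be checked mechanically. The sketch below assumes each entry has the shape "<uid> [extension fields] :<filename>", which is what the listing above looks like. The names `UidEntry`, `parse_entry` and `summarize` are made up for illustration and are not part of Dovecot.

```python
from typing import List, NamedTuple, Optional, Tuple


class UidEntry(NamedTuple):
    uid: int
    guid: Optional[str]
    filename: str


def parse_entry(line: str) -> UidEntry:
    """Split one uidlist entry into UID, optional G field and filename."""
    head, sep, filename = line.strip().partition(" :")
    if not sep:
        raise ValueError(f"no ' :' separator in line: {line!r}")
    fields = head.split()
    uid = int(fields[0])
    # Assumption: an extension field starting with "G" carries the GUID.
    guid = next((f[1:] for f in fields[1:] if f.startswith("G")), None)
    return UidEntry(uid, guid, filename)


def summarize(lines: List[str]) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Return (UID gaps, UIDs without a G field) for the given entries."""
    entries = [parse_entry(l) for l in lines if l.strip()]
    gaps = [(a.uid, b.uid) for a, b in zip(entries, entries[1:])
            if b.uid != a.uid + 1]
    missing_guid = [e.uid for e in entries if e.guid is None]
    return gaps, missing_guid


# A few entries copied from the listing above.
sample = [
    "81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964",
    "81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096",
    "81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107",
    "81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230",
    "81194 G1611356752.M573233P27082.mail01.myserver.com :1611356752.M573233P27082.mail01.myserver.com,S=1210,W=1238",
]

gaps, missing_guid = summarize(sample)
print("UID gaps:", gaps)
print("UIDs without G field:", missing_guid)
```

The header line of a real dovecot-uidlist (the one starting with "3 V...") has no " :" separator and would need to be skipped before feeding lines to this helper.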
On 22/01/21 17:50, Maciej Milaszewski wrote:
Hi, I am using POP/IMAP and LMTP via the director, and the user always goes back to the same dovecot node:
Current: 10.0.100.22 (expires 2021-01-22 17:42:44)
Hashed: 10.0.100.22
Initial config: 10.0.100.22
I have 6 dovecot backends, with indexes on a local SSD disk:
mail_location = maildir:~/Maildir:INDEX=/var/dovecot_indexes%h
A user never logs in to two different nodes at the same time.
I upgraded Debian from 8 to 9 (and then to 10) and tested with kernels 4.x and 5.x, and the problem exists. If I switch the kernel to 3.16.x, the problem does not exist. I tested like this:
problem exists:
  dovecot1-5 with 4.x
  dovecot1-4 with 3.19.x, dovecot5 with 4.x
  dovecot1-5 with 5.x
  dovecot1-4 with 4.x, dovecot5 with 5.x
problem does not exist:
  dovecot1-5 with 3.19.x
  dovecot1-5 with 3.19.x + kernel-care
I use NetApp with these mount options:
rw,sec=sys,noexec,noatime,tcp,soft,rsize=32768,wsize=32768,intr,nordirplus,nfsvers=3,actimeo=120
I tried both with nocto and without nocto.
The big guys from NetApp say "nfs 4.x need auth via kerberos ...."
On 22.01.2021 16:08, Alessio Cecchi wrote:
Hi Maciej,
I'm using LDA to deliver email to the mailbox (Maildir), and I think (hope) that switching to LMTP via the director will fix my problem, but I don't know why it works with the old kernel and not with the recent one.
Are you using POP/IMAP and LMTP via director so any update to dovecot indexes is done from the same server?
On 19/01/21 16:22, Maciej Milaszewski wrote:
Hi, I use LMTP. And you?
On 19.01.2021 10:45, Alessio Cecchi wrote:
Hi Maciej,
I had the same issue when I switched the dovecot backends from CentOS 6 to CentOS 7.
My configuration is also similar to yours: Dovecot Director and Dovecot backends that share Maildir via NFS on NetApp.
For local delivery of emails are you using LDA or LMTP? I'm using LDA.
Let me know.
Thanks
On 13/01/21 15:56, Maciej Milaszewski wrote:
> Hi
> I have been trying to resolve my problem with dovecot for a few days and I
> don't have any idea....
>
> My environment is: dovecot director + 5 dovecot guests
>
> dovecot-2.2.36.4 from source
> Linux 3.16.0-11-amd64
> storage via nfs (NetApp)
>
> All works fine, but when I update the OS from debian 8 (kernel 3.16.x) to
> debian 9 (kernel 4.9.x) I sometimes get, at random, in the logs:
> Broken dovecot-uidlist
>
> example:
> Error: Broken file
> /vmail2/po/pollygraf.xxx_pg_pollygraf/Maildir/dovecot-uidlist line 88:
> Invalid data:
>
> (for random users - sometimes 10 errors a day per node, sometimes more)
>
> The file looks ok
>
> But if I change the kernel to 3.16.x, the problem with "Broken file
> dovecot-uidlist" does not exist
> if I turn to 4.9 or 5.x - the problem exists
>
> I have storage via nfs with options:
> rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
> I tested with "nocto" or without "nocto" - nothing changes ......
>
> nfs options in node:
> mmap_disable = yes
> mail_fsync = always
>
> I bet the configuration is correct and I wonder why the problem occurs
> with other kernels
> 3.x.x - ok
> 4.x - not ok
>
> I checked, and the user who has the problem did not connect to another node
> at that time
>
> I have no idea why the problem exists on kernel 4.x but not on 3.x
>
Alessio Cecchi Postmaster @http://www.qboxmail.it https://www.linkedin.com/in/alessice