dovecot and broken uidlist

Maciej Milaszewski maciej.milaszewski at iq.pl
Fri Jan 29 14:16:30 EET 2021


Hi,
The NetApp FAS8200 probably does not support NFS 4.2, and NFS 4.1 does not
support authentication via exports (only Kerberos).


On 28.01.2021 19:45, Tom Talpey wrote:
> On 1/28/2021 11:14 AM, Maciej Milaszewski wrote:
>> Hi,
>> As a test I created a new director running 2.3.13 and a node running
>> 2.3.13, and mounted the storage via NFS with the same options:
>>
>> rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
>>
>>
>> I created a simple MTA and changed the MX to the same as for director1.
>>
>> With kernel 5.8.0-0.bpo.2-amd64 the problem exists.
>> With kernel 3.x it does not.
>>
>> When the problem occurs, I check Maildir/dovecot-uidlist:
>>
>> 3 V1424432537 N16208 G92c4ee0d93aa1260c629000009c4ba82
>> 16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
>> 16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
>> 16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
>> 16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
>> 16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
>> 16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
>> 16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
>> 16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
>> 16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
>> ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@$
>>
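A quick way to find affected files is to scan each dovecot-uidlist for runs of NUL bytes like the one shown above. The following is only a minimal sketch: the function names are made up for illustration, and it reads the whole file into memory, which is assumed to be acceptable for files of this size.

```python
import sys


def find_nul_runs(data: bytes):
    """Return a list of (offset, length) for each run of consecutive NUL bytes."""
    runs = []
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            start = i
            while i < n and data[i] == 0:
                i += 1
            runs.append((start, i - start))
        else:
            i += 1
    return runs


def scan_file(path: str):
    """Read a file in binary mode and report its NUL runs."""
    with open(path, "rb") as fh:
        return find_nul_runs(fh.read())


if __name__ == "__main__":
    # Usage (hypothetical script name): python scan_nul.py /path/to/Maildir/dovecot-uidlist
    for path in sys.argv[1:]:
        for offset, length in scan_file(path):
            print(f"{path}: {length} NUL bytes at offset {offset}")
```

A file that reports no runs is not proven healthy; the check only catches this particular symptom.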
>
> A block of zeros in a file opened for append is a classic NFSv3 race.
> Your mount options allow 120 seconds of attribute caching (actimeo=120).
> One of these attributes is the file size, which is also the end of file
> marker for append. If the file is changed by another client, the append
> mode writes will land on the wrong offset, possibly overwriting or
> punching holes.
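The effect Tom describes can be imitated on a local filesystem. This is only a simulation under stated assumptions: no NFS is involved, the "cached" file size is an ordinary variable, and the write is placed explicitly with os.pwrite (Unix only) at the offset the first client believes is end of file. The helper names are hypothetical.

```python
import os
import tempfile


def write_at_cached_eof(path: str, cached_size: int, record: bytes) -> None:
    """Write `record` at the offset a client *believes* is end of file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, record, cached_size)
    finally:
        os.close(fd)


def simulate(initial: bytes, rewritten: bytes, record: bytes) -> bytes:
    """Client A caches the size, client B rewrites the file, then A 'appends'."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "uidlist-demo")
        with open(path, "wb") as fh:
            fh.write(initial)
        cached_size = os.path.getsize(path)  # client A's stale view of EOF
        with open(path, "wb") as fh:  # client B replaces the content
            fh.write(rewritten)
        write_at_cached_eof(path, cached_size, record)
        with open(path, "rb") as fh:
            return fh.read()


if __name__ == "__main__":
    # Stale size larger than the real file: the gap is filled with zeros.
    print(simulate(b"A" * 10, b"B" * 4, b"C\n"))
    # Stale size smaller than the real file: existing data is overwritten.
    print(simulate(b"A" * 4, b"B" * 10, b"CC"))
```

The first call shows how a block of zeros can appear before the new record; the second shows the overwrite case.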
>
> If you use the "noac" mount option, this will reduce the window of
> vulnerability, but it will not eliminate it. It's also possible there
> is some issue in attribute caching in the 5.8 kernel. Do you have
> other options between 3.16 and 5.8?
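To compare mount option strings between nodes, a small parser can help. The sketch below is an illustration only: the function names are hypothetical, and it simply splits the comma-separated string, reports options that appear more than once, and states which attribute caching setting is in effect.

```python
def parse_mount_options(option_string: str):
    """Split an NFS option string into (options dict, list of duplicated names)."""
    options = {}
    duplicates = []
    for item in option_string.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, value = item.partition("=")
        if name in options:
            duplicates.append(name)
        # Flags without a value (rw, tcp, hard, ...) are stored as True.
        options[name] = value if value else True
    return options, duplicates


def attribute_cache_summary(options) -> str:
    """Describe the attribute caching setting found in the parsed options."""
    if "noac" in options:
        return "attribute caching disabled (noac)"
    if "actimeo" in options:
        return f"attributes cached for {options['actimeo']} s (actimeo)"
    return "attribute caching left at client defaults"


if __name__ == "__main__":
    reported = ("rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,"
                "intr,nordirplus,nfsvers=3,tcp,actimeo=120")
    opts, dups = parse_mount_options(reported)
    print(attribute_cache_summary(opts))
    print("duplicated options:", dups)
```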
>
> The best fix is to use a more robust NFS dialect such as v4.2.
>
> Tom.
>
>> When the problem does not occur:
>>
>> 16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
>> 16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
>> 16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
>> 16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
>> 16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
>> 16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
>> 16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
>> 16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
>> 16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
>>
>> On 23.01.2021 00:59, Alessio Cecchi wrote:
>>>
>>> Hi,
>>>
>>> after some tests I noticed a difference in the dovecot-uidlist line
>>> format when a message is read from the "old kernel" and the "new kernel":
>>>
>>> 81184 G1611334252.M95445P32580.mail05.myserver.com :1611334252.M95445P32580.mail05.myserver.com,S=38689,W=39290
>>> 81185 G1611336004.M47750P3921.mail01.myserver.com :1611336004.M47750P3921.mail01.myserver.com,S=15917,W=16212
>>> 81186 G1611338535.M542784P10852.mail03.myserver.com :1611338535.M542784P10852.mail03.myserver.com,S=12651,W=12855
>>> 81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964
>>> 81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096
>>> 81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107
>>> 81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230
>>> 81194 G1611356752.M573233P27082.mail01.myserver.com :1611356752.M573233P27082.mail01.myserver.com,S=1210,W=1238
>>> 81195 G1611356991.M905681P30704.mail01.myserver.com :1611356991.M905681P30704.mail01.myserver.com,S=1220,W=1249
>>> 81197 :1611357210.M42178P1962.mail01.myserver.com,S=1220,W=1250
>>> 81199 :1611357560.M26894P7157.mail01.myserver.com,S=1233,W=1264
>>>
>>> With the "old kernel" (where everything works fine) the UID numbers are
>>> consecutive and each line has one more field that starts with "G1611...".
>>>
>>> With the "new kernel" (where the error occurs) the UID numbers always skip
>>> a number and the "G1611..." field is missing.
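The two symptoms described here (a missing "G..." field and skipped UIDs) can be checked mechanically. The sketch below assumes each entry has the shape `<uid> [<extension> ...] :<filename>`, as in the listing above with the mail-wrapped lines joined back together, and treats every extension as a one-letter key followed by its value. The function names are hypothetical and the header line of the file is not handled.

```python
def parse_uidlist_entry(line: str):
    """Parse '<uid> [<ext> ...] :<filename>' into (uid, extensions, filename)."""
    head, sep, filename = line.partition(" :")
    fields = head.split()
    if not sep or not fields:
        raise ValueError(f"not a uidlist entry: {line!r}")
    uid = int(fields[0])
    # Each extension is assumed to be a one-letter key followed by its value.
    extensions = {field[0]: field[1:] for field in fields[1:]}
    return uid, extensions, filename.strip()


def summarize(lines):
    """Return (UIDs without a G field, pairs of neighbouring UIDs that are not consecutive)."""
    entries = [parse_uidlist_entry(line) for line in lines if line.strip()]
    missing_guid = [uid for uid, ext, _ in entries if "G" not in ext]
    gaps = [(a[0], b[0]) for a, b in zip(entries, entries[1:]) if b[0] != a[0] + 1]
    return missing_guid, gaps


if __name__ == "__main__":
    sample = [
        "81189 G1611354389.M984432P14754.mail06.myserver.com "
        ":1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096",
        "81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107",
        "81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230",
    ]
    print(summarize(sample))
```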
>>>
>>> Maciej, do you also see this behavior?
>>>
>>> Why does Dovecot create a different uidlist line format with a different kernel?
>>>
>>> Il 22/01/21 17:50, Maciej Milaszewski ha scritto:
>>>> Hi,
>>>> I am using POP/IMAP and LMTP via the director, and a user always goes
>>>> back to the same Dovecot node:
>>>>
>>>> Current: 10.0.100.22 (expires 2021-01-22 17:42:44)
>>>> Hashed: 10.0.100.22
>>>> Initial config: 10.0.100.22
>>>>
>>>> I have 6 Dovecot backends, with indexes on a local SSD disk:
>>>> mail_location = maildir:~/Maildir:INDEX=/var/dovecot_indexes%h
>>>>
>>>> A user never logs in to two different nodes at the same time.
>>>>
>>>> I upgraded Debian from 8 to 9 (and to 10) and tested with kernels 4.x
>>>> and 5.x, and the problem exists.
>>>> If I change the kernel to 3.16.x the problem does not exist.
>>>> I tested as follows:
>>>>
>>>> problem exists:
>>>> dovecot1-5 with 4.x
>>>> and
>>>> dovecot1-4 - with 3.19.x
>>>> dovecot5 - with 4.x
>>>> and
>>>> dovecot1-5 - with 5.x
>>>> and
>>>> dovecot1-4 - with 4.x
>>>> dovecot5 - with 5.x
>>>>
>>>> problem does not exist:
>>>> dovecot1-5 - with 3.19.x
>>>>
>>>> problem does not exist:
>>>> dovecot1-5 - with 3.19.x+kernel-care
>>>>
>>>> I use NetApp with these mount options:
>>>> rw,sec=sys,noexec,noatime,tcp,soft,rsize=32768,wsize=32768,intr,nordirplus,nfsvers=3,actimeo=120
>>>>
>>>> I tried with and without nocto.
>>>>
>>>> The big guys from NetApp say "nfs 4.x need auth via kerberos ....".
>>>>
>>>>
>>>>
>>>> On 22.01.2021 16:08, Alessio Cecchi wrote:
>>>>> Hi Maciej,
>>>>>
>>>>> I'm using LDA to deliver email to the mailbox (Maildir) and I
>>>>> think (hope) that switching to LMTP via the director will fix my
>>>>> problem, but I don't know why it works with the old kernel and not
>>>>> with the recent one.
>>>>>
>>>>> Are you using POP/IMAP and LMTP via the director, so that any update to
>>>>> the Dovecot indexes is done from the same server?
>>>>>
>>>>> Il 19/01/21 16:22, Maciej Milaszewski ha scritto:
>>>>>> Hi
>>>>>> I use LMTP, and you?
>>>>>>
>>>>>> On 19.01.2021 10:45, Alessio Cecchi wrote:
>>>>>>> Hi Maciej,
>>>>>>>
>>>>>>> I had the same issue when I switched my Dovecot backends from
>>>>>>> CentOS 6 to CentOS 7.
>>>>>>>
>>>>>>> My configuration is also similar to yours: Dovecot Director and
>>>>>>> Dovecot backends that share Maildir via NFS on NetApp.
>>>>>>>
>>>>>>> For local delivery of emails, are you using LDA or LMTP? I'm
>>>>>>> using LDA.
>>>>>>>
>>>>>>> Let me know.
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Il 13/01/21 15:56, Maciej Milaszewski ha scritto:
>>>>>>>> Hi
>>>>>>>> I have been trying to resolve my problem with Dovecot for a few
>>>>>>>> days and I have no idea what causes it.
>>>>>>>>
>>>>>>>> My environment is: a Dovecot director + 5 Dovecot guests
>>>>>>>>
>>>>>>>> dovecot-2.2.36.4 from source
>>>>>>>> Linux 3.16.0-11-amd64
>>>>>>>> storage via nfs (NetApp)
>>>>>>>>
>>>>>>>> Everything works fine, but since I upgraded the OS from Debian 8
>>>>>>>> (kernel 3.16.x) to Debian 9 (kernel 4.9.x) I sometimes get this
>>>>>>>> error in the logs at random:
>>>>>>>> Broken dovecot-uidlist
>>>>>>>>
>>>>>>>> Example:
>>>>>>>> Error: Broken file
>>>>>>>> /vmail2/po/pollygraf.xxx_pg_pollygraf/Maildir/dovecot-uidlist
>>>>>>>> line 88:
>>>>>>>> Invalid data:
>>>>>>>>
>>>>>>>> (for random users; sometimes 10 errors a day per node, sometimes
>>>>>>>> more)
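To see which mailboxes are hit and how often, the log can be tallied per uidlist path. This is a rough sketch that assumes the message appears on one log line in the form shown in the example above ("Broken file <path>/dovecot-uidlist line <n>: <reason>"). The regular expression and the function name are illustrative only and may need adjusting to the real log format.

```python
import re
import sys
from collections import Counter

# Assumed message shape, taken from the example above (not from Dovecot's source).
BROKEN_RE = re.compile(
    r"Broken file (?P<path>\S+/dovecot-uidlist) line (?P<line>\d+): (?P<reason>.*)"
)


def tally_broken_uidlists(log_lines):
    """Count 'Broken file ... dovecot-uidlist' errors per uidlist path."""
    counts = Counter()
    for line in log_lines:
        match = BROKEN_RE.search(line)
        if match:
            counts[match.group("path")] += 1
    return counts


if __name__ == "__main__":
    # Usage (hypothetical script name): python tally_broken.py < mail.log
    for path, count in tally_broken_uidlists(sys.stdin).most_common():
        print(f"{count:6d}  {path}")
```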
>>>>>>>>
>>>>>>>> The file looks OK.
>>>>>>>>
>>>>>>>> But if I change the kernel to 3.16.x, the "Broken file
>>>>>>>> dovecot-uidlist" problem does not occur;
>>>>>>>> if I switch to 4.9 or 5.x, the problem occurs.
>>>>>>>>
>>>>>>>> The storage is mounted via NFS with these options:
>>>>>>>> rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
>>>>>>>>
>>>>>>>> I tested with and without "nocto"; nothing changes.
>>>>>>>>
>>>>>>>> NFS-related options on the node:
>>>>>>>> mmap_disable = yes
>>>>>>>> mail_fsync = always
>>>>>>>>
>>>>>>>> I believe the configuration is correct, and I wonder why the problem
>>>>>>>> occurs only with some kernels:
>>>>>>>> 3.x.x - ok
>>>>>>>> 4.x - not ok
>>>>>>>>
>>>>>>>> I checked, and the users who had the problem did not connect to
>>>>>>>> another node at that time.
>>>>>>>>
>>>>>>>> I have no idea why the problem exists on kernel 4.x but not on 3.x.
>>>>>>>>
>>>>>>>>
>>>>>>> -- 
>>>>>>> Alessio Cecchi
>>>>>>> Postmaster @http://www.qboxmail.it
>>>>>>> https://www.linkedin.com/in/alessice
>>


