Hi,
I have been trying to resolve a problem with Dovecot for a few days and I am out of ideas.
My environment: a Dovecot director plus 5 Dovecot guest nodes, dovecot-2.2.36.4 built from source, Linux 3.16.0-11-amd64, storage via NFS (NetApp).
Everything works fine, but after I upgraded the OS from Debian 8 (kernel 3.16.x) to Debian 9 (kernel 4.9.x), I sometimes get random "Broken dovecot-uidlist" errors in the logs.
Example: Error: Broken file /vmail2/po/pollygraf.xxx_pg_pollygraf/Maildir/dovecot-uidlist line 88: Invalid data:
(This happens for random users: sometimes 10 errors a day per node, sometimes more.)
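To see how often the error fires and for which mailboxes, the log lines can be tallied per uidlist path. This is only a minimal Python sketch, assuming the log contains the error text in the form quoted above. The function name count_broken_uidlists and the sample list are made up for illustration, and the syslog prefix in front of "Error:" is not shown in this thread, so the pattern only anchors on the known part.

```python
import re
from collections import Counter

# Matches the error text quoted in the thread; whatever prefix the logger
# puts in front of "Error:" is unknown here and is simply skipped.
BROKEN_RE = re.compile(
    r"Error: Broken file (?P<path>\S+/dovecot-uidlist) "
    r"line (?P<line>\d+): Invalid data:"
)


def count_broken_uidlists(log_lines):
    """Return a Counter mapping uidlist path -> number of 'Broken file' errors."""
    counts = Counter()
    for text in log_lines:
        match = BROKEN_RE.search(text)
        if match:
            counts[match.group("path")] += 1
    return counts


# Illustrative input: the one error line from the thread plus a filler line.
sample = [
    "Error: Broken file /vmail2/po/pollygraf.xxx_pg_pollygraf/Maildir/"
    "dovecot-uidlist line 88: Invalid data:",
    "some unrelated log line",
]

for path, hits in count_broken_uidlists(sample).items():
    print(hits, path)
```

Feeding it the lines of each node's log separately would give the per-node counts mentioned above.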
The file itself looks OK.
If I switch back to kernel 3.16.x, the "Broken file dovecot-uidlist" problem does not occur; with 4.9 or 5.x it does.
The storage is mounted via NFS with these options: rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120 I tested with and without "nocto"; it makes no difference.
NFS-related Dovecot settings on the nodes: mmap_disable = yes, mail_fsync = always
I believe the configuration is correct, and I wonder why the problem depends on the kernel: 3.x.x is OK, 4.x is not.
I checked, and the affected users were not connected to another node at the time.
I have no idea why the problem exists on kernel 4.x but not on 3.x.
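The mount option string quoted above is long enough that it is easy to misread. The following is a small Python sketch that splits it into flags and key=value pairs and reports repeated tokens. The helper name parse_mount_options is made up for illustration; it only does string handling and does not query the kernel or validate options against nfs(5).

```python
from collections import Counter


def parse_mount_options(option_string):
    """Split a comma-separated option string into (flags, key_values, duplicates)."""
    tokens = [t.strip() for t in option_string.split(",") if t.strip()]
    # Tokens that appear more than once in the string.
    duplicates = sorted(name for name, n in Counter(tokens).items() if n > 1)
    flags, key_values = [], {}
    for token in tokens:
        if "=" in token:
            key, value = token.split("=", 1)
            key_values[key] = value
        elif token not in flags:
            flags.append(token)
    return flags, key_values, duplicates


# The option string exactly as posted in the thread.
OPTIONS = (
    "rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,"
    "intr,nordirplus,nfsvers=3,tcp,actimeo=120"
)

flags, key_values, duplicates = parse_mount_options(OPTIONS)
print("duplicates:", duplicates)
print("nfsvers:", key_values.get("nfsvers"))
print("actimeo:", key_values.get("actimeo"))
```

On the posted string this reports tcp as listed twice and shows the attribute cache timeout (actimeo) of 120 seconds at a glance.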
Hi, does anyone have any idea?
(In reply to the message from Maciej Milaszewski maciej.milaszewski@iq.pl of 13 January 2021, 15:56:18 CET.)
Hi Maciej,
I had the same issue when I switched my Dovecot backends from CentOS 6 to CentOS 7.
My configuration is similar to yours: a Dovecot director and Dovecot backends that share Maildir via NFS on NetApp.
Are you using LDA or LMTP for local delivery of emails? I'm using LDA.
Let me know.
Thanks
-- Alessio Cecchi Postmaster @ http://www.qboxmail.it https://www.linkedin.com/in/alessice
Hi, I use LMTP. And you?
(In reply to Alessio Cecchi's message of 19 January 2021, 10:45.)
-- Maciej Miłaszewski, Senior System Administrator, IQ PL Sp. z o.o.
Customer service: e-mail: bok@iq.pl tel.: +48 58 326 09 90 - 94 fax: +48 58 326 09 99
Help desk: https://www.iq.pl/pomoc Information on personal data processing: https://www.iq.pl/kontakt
IQ PL Sp. z o.o., registered office in Gdańsk (80-298), ul. Geodetów 16, KRS 0000007725, registry court: District Court in Gdańsk, 7th Division of the National Court Register (KRS), share capital: 140,000 PLN, NIP 5832736211, REGON 192478853
Hi Maciej,
I'm using LDA to deliver email into the mailbox (Maildir), and I think (hope) that switching to LMTP via the director will fix my problem, but I don't know why it works with the old kernel and not with the recent one.
Are you using POP/IMAP and LMTP via the director, so that any update to the Dovecot indexes is done from the same server?
(In reply to Maciej Milaszewski's message of 19 January 2021, 16:22.)
Hi, I am using POP/IMAP and LMTP via the director, and each user goes back to the same Dovecot node:
Current: 10.0.100.22 (expires 2021-01-22 17:42:44)
Hashed: 10.0.100.22
Initial config: 10.0.100.22
I have 6 Dovecot backends, with the indexes on a local SSD disk: mail_location = maildir:~/Maildir:INDEX=/var/dovecot_indexes%h
A user never logs in to two different nodes at the same time.
I upgraded Debian from 8 to 9 (and then to 10) and tested kernels 4.x and 5.x: the problem exists. If I switch the kernel to 3.16.x, the problem does not exist. I tested these combinations:
Problem exists:
- dovecot1-5 with 4.x
- dovecot1-4 with 3.19.x, dovecot5 with 4.x
- dovecot1-5 with 5.x
- dovecot1-4 with 4.x, dovecot5 with 5.x
Problem does not exist:
- dovecot1-5 with 3.19.x
- dovecot1-5 with 3.19.x + kernel-care
I use NetApp with these mount options: rw,sec=sys,noexec,noatime,tcp,soft,rsize=32768,wsize=32768,intr,nordirplus,nfsvers=3,actimeo=120 I tried with and without nocto.
The big guys from NetApp say "NFS 4.x needs auth via Kerberos ...."
(In reply to Alessio Cecchi's message of 22 January 2021, 16:08.)
Hi,
After some tests I noticed a difference in the dovecot-uidlist line format depending on whether the message was read from the "old kernel" or the "new kernel":
81184 G1611334252.M95445P32580.mail05.myserver.com :1611334252.M95445P32580.mail05.myserver.com,S=38689,W=39290
81185 G1611336004.M47750P3921.mail01.myserver.com :1611336004.M47750P3921.mail01.myserver.com,S=15917,W=16212
81186 G1611338535.M542784P10852.mail03.myserver.com :1611338535.M542784P10852.mail03.myserver.com,S=12651,W=12855
81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964
81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096
81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107
81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230
81194 G1611356752.M573233P27082.mail01.myserver.com :1611356752.M573233P27082.mail01.myserver.com,S=1210,W=1238
81195 G1611356991.M905681P30704.mail01.myserver.com :1611356991.M905681P30704.mail01.myserver.com,S=1220,W=1249
81197 :1611357210.M42178P1962.mail01.myserver.com,S=1220,W=1250
81199 :1611357560.M26894P7157.mail01.myserver.com,S=1233,W=1264
With the "old kernel" (where everything works fine) the UID numbers are consecutive and each line has one more field, starting with "G1611...".
With the "new kernel" (where the errors appear) the UID numbers always skip a number and the "G1611..." field is missing.
Maciej, do you also see this behavior?
Why does Dovecot create a different uidlist line format with a different kernel?
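The comparison above can be made mechanical. Below is a rough Python sketch that parses uidlist entry lines, lists the UIDs that have no "G" field, and lists the places where the UID jumps by more than one. It assumes each entry has the shape "<uid> [<extension> ...] :<filename>" with one-letter extension keys, which is how the lines in this thread look; the function names are made up for illustration and the header line of the file is not handled.

```python
def parse_uidlist_entry(line):
    """Parse one entry of the assumed form '<uid> [<ext> ...] :<filename>'."""
    head, sep, filename = line.partition(" :")
    if not sep:
        raise ValueError(f"no ' :' filename separator in {line!r}")
    fields = head.split()
    uid = int(fields[0])
    # Each extension is a one-letter key followed by its value, e.g. 'G1611...'.
    extensions = {field[0]: field[1:] for field in fields[1:]}
    return uid, extensions, filename


def summarize(lines):
    """Return (uids without a G field, list of (prev_uid, next_uid) gaps)."""
    entries = [parse_uidlist_entry(text) for text in lines if text.strip()]
    missing_g = [uid for uid, ext, _ in entries if "G" not in ext]
    gaps = [
        (prev[0], cur[0])
        for prev, cur in zip(entries, entries[1:])
        if cur[0] != prev[0] + 1
    ]
    return missing_g, gaps


# A few of the entries quoted in the thread.
sample = [
    "81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964",
    "81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096",
    "81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107",
    "81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230",
    "81194 G1611356752.M573233P27082.mail01.myserver.com :1611356752.M573233P27082.mail01.myserver.com,S=1210,W=1238",
]

missing_g, gaps = summarize(sample)
print("no G field:", missing_g)
print("UID gaps:", gaps)
```

Running the same summary on a uidlist from an "old kernel" node and from a "new kernel" node would show whether the two symptoms always go together.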
(In reply to Maciej Milaszewski's message of 22 January 2021, 17:50.)
Hi, as a test I created a new director with 2.3.13 and a node with 2.3.13, and mounted the storage via NFS with the same options:
rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
I set up a simple MTA and changed the MX to be the same as for director1.
With kernel 5.8.0-0.bpo.2-amd64 the problem exists; with kernel 3.x it does not.
When the problem occurs, Maildir/dovecot-uidlist looks like this:
3 V1424432537 N16208 G92c4ee0d93aa1260c629000009c4ba82
16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@$
When the problem does not occur:
16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
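The ^@ characters in the broken file are NUL bytes as displayed by the viewer. To find such runs without opening every uidlist by hand, something like the following Python sketch could be used. It works on a bytes object; the function name find_nul_runs and the sample data are made up for illustration, and the sample is a shortened stand-in for the file shown above, not the real file.

```python
def find_nul_runs(data, min_length=1):
    """Return (offset, length) for each run of NUL bytes in a bytes object."""
    runs = []
    start = None
    for index, byte in enumerate(data):
        if byte == 0:
            if start is None:
                start = index          # a new run begins here
        elif start is not None:
            if index - start >= min_length:
                runs.append((start, index - start))
            start = None
    # A run that reaches the end of the data has no terminating byte.
    if start is not None and len(data) - start >= min_length:
        runs.append((start, len(data) - start))
    return runs


# Stand-in data: one valid entry followed by 16 NUL bytes and a newline.
prefix = b"16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520\n"
sample = prefix + b"\x00" * 16 + b"\n"

for offset, length in find_nul_runs(sample):
    print(f"NUL run at offset {offset}, {length} bytes")
```

For a real file the bytes would come from open(path, "rb").read(), where path is whichever dovecot-uidlist is being checked.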
(In reply to Alessio Cecchi's message of 23 January 2021, 00:59.)
(In reply to Maciej Milaszewski's message of 28 January 2021, 11:14 AM.)
A block of zeros in a file opened for append is a classic NFSv3 race. Your mount options allow 120 seconds of attribute caching (actimeo=120). One of these attributes is the file size, which is also the end of file marker for append. If the file is changed by another client, the append mode writes will land on the wrong offset, possibly overwriting or punching holes.
If you use the "noac" mount option, this will reduce the window of vulnerability, but it will not eliminate it. It's also possible there is some issue in attribute caching in the 5.8 kernel. Do you have other options between 3.16 and 5.8?
The best fix is to use a more robust NFS dialect such as v4.2.
Tom.
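The stale-size effect described above can be illustrated with a toy model in Python. This is only a sketch under strong assumptions: ToyServer and ToyClient are invented names, the "client" appends at a file size it cached once and never revalidates, and nothing here reproduces the behaviour of a real NFS client or of any particular kernel. It just shows how a write at an outdated end-of-file offset can either overwrite data or leave a zero-filled gap.

```python
class ToyServer:
    """Toy file server holding a single file as a bytearray."""

    def __init__(self, content=b""):
        self.data = bytearray(content)

    def write_at(self, offset, payload):
        # A write beyond the current end zero-fills the gap in between.
        if offset > len(self.data):
            self.data.extend(b"\x00" * (offset - len(self.data)))
        self.data[offset:offset + len(payload)] = payload


class ToyClient:
    """Toy client that appends at its cached file size, never revalidating."""

    def __init__(self, server):
        self.server = server
        self.cached_size = len(server.data)  # "attribute cache" filled once

    def append(self, payload):
        self.server.write_at(self.cached_size, payload)
        self.cached_size += len(payload)


def overwrite_case():
    """Another client grows the file; the stale append lands inside its data."""
    server = ToyServer(b"AAAA\n")
    a, b = ToyClient(server), ToyClient(server)
    b.append(b"BBBB\n")
    a.append(b"CC\n")  # a still believes the file is 5 bytes long
    return bytes(server.data)


def hole_case():
    """The file becomes shorter behind the client's back (assumed scenario)."""
    server = ToyServer(b"AAAA\nBBBB\n")
    a = ToyClient(server)
    server.data = bytearray(b"AAAA\n")  # rewritten shorter by someone else
    a.append(b"CC\n")  # a still believes the file is 10 bytes long
    return bytes(server.data)


print(overwrite_case())
print(hole_case())
```

The second case produces a run of NUL bytes in the middle of the file, which is the same kind of damage as the ^@ block in the uidlist shown earlier in the thread.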
Hi, the NetApp FAS8200 probably does not support NFS 4.2, and NFS 4.1 does not support authentication via exports (only Kerberos).
On 28.01.2021 19:45, Tom Talpey wrote:
On 1/28/2021 11:14 AM, Maciej Milaszewski wrote:
Hi For test I crete a new director with 2.3.13 and node 2.3.13 I mount storage via nfs with this same options:
rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,intr,nordirplus,nfsvers=3,tcp,actimeo=120
I create a simple MTA and change MX to thi same like director1
In kernel 5.8.0-0.bpo.2-amd64 problem exists In kernel 3.x - not exists
In problem exists I check Maildir/dovecot-uidlist
3 V1424432537 N16208 G92c4ee0d93aa1260c629000009c4ba82 16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620 16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604 16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726 16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817 16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855 16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296 16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865 16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072 16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520 ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@$
A block of zeros in a file opened for append is a classic NFSv3 race. Your mount options allow 120 seconds of attribute caching (actimeo=120). One of these attributes is the file size, which is also the end of file marker for append. If the file is changed by another client, the append mode writes will land on the wrong offset, possibly overwriting or punching holes.
If you use the "noac" mount option, this will reduce the window of vulnerability, but it will not eliminate it. It's also possible there is some issue in attribute caching in the 5.8 kernel. Do you have other options between 3.16 and 5.8?
The best fix is to use a more robust NFS dialect such as v4.2.
Tom.
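The race described above can be reproduced in miniature on a local file. The following is only a sketch under stated assumptions: it does not use NFS at all, the function names are hypothetical, and the "cached size" is just a Python variable standing in for the file size a client would keep in its attribute cache. It shows how writing at a stale end-of-file offset, after the file has been replaced by a shorter one, leaves a run of zero bytes before the new entry.

```python
import os
import tempfile


def append_at_cached_size(path, cached_size, payload):
    """Write payload at the offset the writer *believes* is end of file."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # os.pwrite is available on Unix; it writes at an explicit offset.
        os.pwrite(fd, payload, cached_size)
    finally:
        os.close(fd)


def simulate_stale_append(payload=b"16212 :new-entry\n"):
    """Return the file contents after an append based on a stale size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "uidlist-demo")

        # "Client A" sees a 100-byte file and remembers its size.
        with open(path, "wb") as f:
            f.write(b"A" * 100)
        cached_size = os.path.getsize(path)

        # "Client B" rewrites the file with shorter content (60 bytes).
        with open(path, "wb") as f:
            f.write(b"B" * 60)

        # Client A appends using the stale size: bytes 60..99 become a hole.
        append_at_cached_size(path, cached_size, payload)

        with open(path, "rb") as f:
            return f.read()


if __name__ == "__main__":
    data = simulate_stale_append()
    print(data.count(b"\x00"), "zero bytes between old data and new entry")
```

On a real NFS mount the stale value would come from the client's attribute cache rather than from a variable, which is why a long actimeo widens the window.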
When the problem does not occur:
16144 :1611352119.M505834P25597.dovecot2,S=18282,W=18620
16145 :1611352123.M269121P19872.dovecot2,S=18266,W=18604
16146 :1611762747.M502108P9747.dovecot7,S=6595,W=6726
16150 :1611835594.M756718P9986.dovecot7,S=62439,W=63817
16163 :1611828091.M231204P5202.dovecot7,S=19348,W=19855
16208 :1611849420.M137743P24417.dovecot7,S=12064,W=12296
16209 :1611828091.M144806P5202.dovecot7,S=2806,W=2865
16210 :1611837438.M678475P12027.dovecot7,S=17713,W=18072
16211 :1611757939.M493064P7136.dovecot7,S=30783,W=31520
On 23.01.2021 00:59, Alessio Cecchi wrote:
Hi,
after some tests I noticed a difference in the dovecot-uidlist line format when a message is read from the "old kernel" and from the "new kernel":
81184 G1611334252.M95445P32580.mail05.myserver.com :1611334252.M95445P32580.mail05.myserver.com,S=38689,W=39290
81185 G1611336004.M47750P3921.mail01.myserver.com :1611336004.M47750P3921.mail01.myserver.com,S=15917,W=16212
81186 G1611338535.M542784P10852.mail03.myserver.com :1611338535.M542784P10852.mail03.myserver.com,S=12651,W=12855
81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964
81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096
81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107
81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230
81194 G1611356752.M573233P27082.mail01.myserver.com :1611356752.M573233P27082.mail01.myserver.com,S=1210,W=1238
81195 G1611356991.M905681P30704.mail01.myserver.com :1611356991.M905681P30704.mail01.myserver.com,S=1220,W=1249
81197 :1611357210.M42178P1962.mail01.myserver.com,S=1220,W=1250
81199 :1611357560.M26894P7157.mail01.myserver.com,S=1233,W=1264
With the "old kernel" (where everything works fine) the UID numbers are incremental and each line has one more field, starting with "G1611...".
With the "new kernel" (where the error occurs) the UID numbers always skip a number and the "G1611..." field is missing.
Maciej, do you also have this behavior?
Why does Dovecot create a different uidlist line format with a different kernel?
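To compare the two line shapes mechanically, the entries can be split into UID, optional extension fields and filename. The sketch below assumes the entry layout visible in the listing above, that is a UID, zero or more single-letter-prefixed fields such as G..., then " :" and the filename. The class and function names are hypothetical, and this is not Dovecot's own parser.

```python
from typing import NamedTuple


class UidlistEntry(NamedTuple):
    uid: int
    extensions: dict  # e.g. {"G": "1611341375.M164702P13505.mail01..."}
    filename: str


def parse_entry(line: str) -> UidlistEntry:
    """Split one uidlist entry into uid, extension fields and filename."""
    head, sep, filename = line.rstrip("\n").partition(" :")
    if not sep:
        raise ValueError(f"no ' :' separator in {line!r}")
    fields = head.split()
    # Each extension field is keyed by its first character.
    extensions = {field[0]: field[1:] for field in fields[1:]}
    return UidlistEntry(int(fields[0]), extensions, filename)


def uid_gaps(entries):
    """Return the UIDs skipped between consecutive entries."""
    gaps = []
    for prev, cur in zip(entries, entries[1:]):
        gaps.extend(range(prev.uid + 1, cur.uid))
    return gaps


if __name__ == "__main__":
    # Four of the lines quoted above.
    lines = [
        "81187 G1611341375.M164702P13505.mail01.myserver.com :1611341375.M164702P13505.mail01.myserver.com,S=8795,W=8964",
        "81189 G1611354389.M984432P14754.mail06.myserver.com :1611354389.M984432P14754.mail06.myserver.com,S=3038,W=3096",
        "81191 :1611355746.M365669P10402.mail03.myserver.com,S=3049,W=3107",
        "81193 :1611356442.M611719P20778.mail01.myserver.com,S=1203,W=1230",
    ]
    entries = [parse_entry(line) for line in lines]
    for entry in entries:
        print(entry.uid, "has G field" if "G" in entry.extensions else "no G field")
    print("skipped UIDs:", uid_gaps(entries))
```

Grouping the entries by the hostname embedded in the filename would show whether the missing G field and the skipped UIDs line up with the nodes running the newer kernel.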
On 22/01/21 17:50, Maciej Milaszewski wrote:
Hi, I am using POP/IMAP and LMTP via director, and a user always goes back to the same Dovecot node:
Current: 10.0.100.22 (expires 2021-01-22 17:42:44)
Hashed: 10.0.100.22
Initial config: 10.0.100.22
I have 6 Dovecot backends, with indexes on a local SSD disk: mail_location = maildir:~/Maildir:INDEX=/var/dovecot_indexes%h
A user never logs in to two different nodes at the same time.
I updated Debian from 8 to 9 (and to 10) and tested with kernels 4.x and 5.x, and the problem exists. If I change the kernel to 3.16.x the problem does not exist. I tested like this:
problem exists:
dovecot1-5 - with 4.x
dovecot1-4 - with 3.19.x, dovecot5 - with 4.x
dovecot1-5 - with 5.x
dovecot1-4 - with 4.x, dovecot5 - with 5.x
not exists: dovecot1-5 - with 3.19.x
not exists: dovecot1-5 - with 3.19.x+kernel-care
I use NetApp with these mount options: rw,sec=sys,noexec,noatime,tcp,soft,rsize=32768,wsize=32768,intr,nordirplus,nfsvers=3,actimeo=120
I tried with nocto and without nocto.
The big guys from NetApp say "nfs 4.x need auth via kerberos ...."
On 22.01.2021 16:08, Alessio Cecchi wrote:
Hi Maciej,
I'm using LDA to deliver email to the mailbox (Maildir), and I think (hope) that switching to LMTP via director will fix my problem, but I don't know why it works with the old kernel and not with the recent one.
Are you using POP/IMAP and LMTP via director, so that any update to the Dovecot indexes is done from the same server?
On 19/01/21 16:22, Maciej Milaszewski wrote:
Hi, I use LMTP, and you?
On 19.01.2021 10:45, Alessio Cecchi wrote:
> Hi Maciej,
>
> I had the same issue when I switched the Dovecot backends from CentOS 6 to CentOS 7.
>
> My configuration is also similar to yours: Dovecot director and Dovecot backends that share Maildir via NFS on NetApp.
>
> For local delivery of emails, are you using LDA or LMTP? I'm using LDA.
>
> Let me know.
>
> Thanks
Well, then "noac" may be your best solution. Or tracking down the possible issue with the 5.8 kernel client.
Either way, you should consider that NFSv3 will always be vulnerable to this. Remember, that protocol was standardized just over 25 years ago.
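As a quick way to review a mount option string against the points raised here, something like the following could be used. It is a rough sketch with hypothetical function names. It only inspects the option string (it does not query the kernel), and it relies on two facts from the Linux nfs(5) man page: actimeo sets all four attribute cache timeouts to the same value, and noac disables attribute caching.

```python
def parse_mount_options(options: str) -> dict:
    """Turn 'a,b=1,c' into {'a': True, 'b': '1', 'c': True}."""
    parsed = {}
    for opt in options.split(","):
        key, _, value = opt.partition("=")
        parsed[key] = value if value else True
    return parsed


def attribute_cache_notes(options: str) -> list:
    """Return human-readable notes about caching-related options."""
    names = [opt.partition("=")[0] for opt in options.split(",")]
    opts = parse_mount_options(options)
    notes = []

    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        notes.append("duplicate options: " + ", ".join(duplicates))

    if "noac" in opts:
        notes.append("attribute caching disabled (noac)")
    elif "actimeo" in opts:
        notes.append(f"attributes may be cached for up to {opts['actimeo']} s (actimeo)")

    if opts.get("nfsvers") == "3":
        notes.append("NFSv3 mount")
    return notes


if __name__ == "__main__":
    # Option string quoted earlier in the thread.
    thread_options = (
        "rw,sec=sys,noexec,noatime,tcp,hard,rsize=65536,wsize=65536,"
        "intr,nordirplus,nfsvers=3,tcp,actimeo=120"
    )
    for note in attribute_cache_notes(thread_options):
        print(note)
```

Note that the two option strings posted in this thread differ (hard with 64k rsize/wsize in one, soft with 32k in the other), so it is worth checking which one is actually in effect on each node.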
On 1/29/2021 7:16 AM, Maciej Milaszewski wrote:
Hi, the NetApp FAS8200 probably does not support NFS 4.2, and NFS 4.1 does not support auth via exports (only Kerberos).
Hi
We saw the same thing with EMC Unity and LMTP. With NFS v3 the situation is somewhat tolerable.
The first time we switched to NFS v4.1 the servers crashed, and later we found out that we needed to turn off delegation and switch to "lock_method = dotlock". It would help some people if this were added to the wiki page!
With those settings the performance was roughly the same, but the mail.err log file was bigger, so we switched back to NFS v3.
We do have a much smaller system with NFS v4.2 and no broken uidlist errors, only "Error: dotlock /.../dovecot-uidlist.lock was immediately deleted under us". Unfortunately EMC Unity doesn't support NFS v4.2 because it is too new.
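For readers unfamiliar with the idea behind dotlock-style locking: the lock is simply a separate "<file>.lock" file that is created exclusively and removed when the work is done. The sketch below illustrates only that general idea. It is not Dovecot's implementation, the names are hypothetical, and it leaves out things a real implementation needs, such as stale lock handling.

```python
import os
import tempfile
import time
from contextlib import contextmanager


@contextmanager
def dotlock(path, timeout=5.0, poll=0.05):
    """Hold '<path>.lock' for the duration of the with-block."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_CREAT | O_EXCL fails if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
            break
        except FileExistsError:
            if time.monotonic() >= deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    os.close(fd)
    try:
        yield lock_path
    finally:
        # If another process removed the lock in the meantime, this raises
        # FileNotFoundError, comparable to a lock "deleted under us".
        os.unlink(lock_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "dovecot-uidlist")
        with dotlock(target) as lock_path:
            print("holding", os.path.basename(lock_path))
        print("lock released:", not os.path.exists(target + ".lock"))
```

Because the lock is an ordinary file, it depends on the same directory and attribute caching behaviour as the data it protects, which may explain why lock-related errors still show up in the logs on NFS.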
On 1/22/21 5:08 PM, Alessio Cecchi wrote:
Hi Maciej,
I'm using LDA to deliver email to the mailbox (Maildir), and I think (hope) that switching to LMTP via director will fix my problem, but I don't know why it works with the old kernel and not with the recent one.
Are you using POP/IMAP and LMTP via director, so that any update to the Dovecot indexes is done from the same server?
-- Best regards, Adrian Minta
It's a long shot... but I would try nfsvers=4.1 in the NFS mount options (instead of nfsvers=3), if your NetApp supports it, together with a newer kernel: 4.14-stable or 4.19-stable (if possible). The reason is a nasty bug found in the Linux NFS client in older kernels...
https://about.gitlab.com/blog/2018/11/14/how-we-spent-two-weeks-hunting-an-n...
Hope this helps...
Regards,
Claudio
Hi Claudio,
I ran a test with an NFS mount using nfsvers=4.1 and CentOS 7 as the NFS client (our NetApp already has NFS 4.1 enabled), but the problem is still present.
Moreover, I don't like switching to NFS 4 because it is stateful. NFS v3 is stateless, so, for example, during maintenance or an upgrade of the NFS server the clients have no problems; a reboot of the NetApp is transparent.
I don't think the problem is related to NetApp: I see the same error in a customer setup based on Google Cloud (Ubuntu as the Dovecot host and NFS client, and a Google Cloud NFS volume as storage).
In my case I'm using LDA for local delivery of emails, so I hope that switching to LMTP will resolve the issue, but I'm not sure, since other users said that they are already using LMTP.
I don't know why it works on old Linux distros while recent distros have the issue ...
Hi, try changing the kernel to an older one and test again.
-- Maciej Miłaszewski, Senior System Administrator, IQ PL Sp. z o.o.
Customer Service Office: e-mail: bok@iq.pl tel.: +48 58 326 09 90 - 94 fax: +48 58 326 09 99
Help desk: https://www.iq.pl/pomoc Information on the processing of personal data: https://www.iq.pl/kontakt
IQ PL Sp. z o.o. with its registered office in Gdańsk (80-298), ul. Geodetów 16, KRS 0000007725, registry court: District Court in Gdańsk, VII KRS Division, share capital: 140,000 PLN, NIP 5832736211, REGON 192478853
participants (6)
- Adrian Minta
- Alessio Cecchi
- Claudio Cuqui
- Maciej Milaszewski
- Maciej Milaszewski IQ PL
- Tom Talpey