On 10 October 2018 at 19:12 William Taylor <william.taylor@sonic.com> wrote:


On Wed, Oct 10, 2018 at 09:37:46AM +0300, Aki Tuomi wrote:

On 09.10.2018 22:16, William Taylor wrote:
We have been seeing index corruption since (we believe) we upgraded
our IMAP servers from SL6 to CentOS 7. Mail/Indexes are stored
on NetApps mounted via NFS. We have 2 LVS servers running surealived in
dr/wlc, 2 directors and 6 backend IMAP/POP servers.
Most of the core dumps I've looked at for different users are like
"Backtrace 2" with some variations on folder path.
This latest crash (Backtrace 1) is different from others I've seen.
It is also leaving 0-byte files in the user's .Drafts/tmp folder.
# ls -s /var/spool/mail/15/00/user1/.Drafts/tmp | awk '{print $1}' | sort | uniq -c
9692 0
1 218600
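
(A note on the count above: ls -s reports allocated blocks rather than byte sizes. The following is only a minimal sketch of an alternative that groups regular files by exact byte size, assuming GNU find as shipped with CentOS 7. The function name count_by_size is hypothetical and not part of the original report.)

```shell
# Hypothetical helper: count regular files in a directory, grouped by size in bytes.
count_by_size() {
    # -maxdepth 1 restricts the scan to the directory itself,
    # -type f skips subdirectories and other non-regular files,
    # -printf '%s\n' (a GNU find extension) prints each file's size in bytes.
    find "$1" -maxdepth 1 -type f -printf '%s\n' | sort -n | uniq -c
}
```

For example, `count_by_size /var/spool/mail/15/00/user1/.Drafts/tmp` would print one line per distinct size, with the number of files of that size in the first column.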
I believe the number of cores here is different from the number of tmp
files because we only started collecting core dumps after we moved the
user to our debug server.
# ls -la /home/u/user1/core.* |wc -l
8437
Any help/insight would be greatly appreciated.
Thanks,
William
OS Info:
CentOS Linux release 7.5.1804 (Core)
3.10.0-862.14.4.el7.x86_64
NFS:
# mount -t nfs |grep mail/15
172.16.255.14:/vol/vol1/mail/15 on /var/spool/mail/15 type nfs (rw,nosuid,nodev,relatime,vers=3,rsize=32768,wsize=32768,namlen=255,hard,nordirplus,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=172.16.255.14,mountvers=3,mountport=4046,mountproto=udp,local_lock=none,addr=172.16.255.14)
Dovecot Info:
dovecot -n
# 2.1.17: /etc/dovecot/dovecot.conf

Hi!

Thank you for your report. However, 2.1.17 is a VERY old version of
Dovecot and this problem is very likely fixed in a more recent version.

Aki

I realize it is an older release.

Are you saying that there is a bug in this version that affects RHEL 7.5
but not RHEL 6, or that we should just use the newest version and see
whether the problem goes away?

We have very limited interest in figuring out problems with (very) old Dovecot versions. At minimum you need to show this problem with 2.2.36 or 2.3.2.1.

One thing you should make sure of is that the same user is not being accessed from two different servers concurrently.
---
Aki Tuomi