[Dovecot] ZFS Index corruption and Connection reset by peer
Hello,
I'm currently using dovecot 1.2.11 on FreeBSD 8.0 with ZFS filesystems.
So far, so good, it works quite nicely, but I have a couple glitches.
Each user has his own zfs partition, mounted on /home/<user> (easier to set per user quotas) and mail is stored in their home.
From day one, when people checked their mail via IMAP, a lot of index corruption occurred:
dovecot: IMAP(<user>@domain.org): Corrupted transaction log file /home/<user>/Mail/Maildir/INBOX/dovecot.index.log seq 13: record size too small (type=0x0, offset=5560, size=0) (sync_offset=5652)
dovecot: IMAP(<user>@domain.org): fscking index file /home/<user>/Mail/Maildir/INBOX/dovecot.index
dovecot: IMAP(<user>@domain.org): /home/<user>/Mail/Maildir/INBOX/dovecot.index log position went backwards (13,132 < 13,5560)
dovecot: IMAP(<user>@domain.org): Transaction log file /home/<user>/Mail/Maildir/INBOX/dovecot.index.log: marked corrupted
etc, etc...
After some digging, I "solved" this problem with mmap_disable = yes in dovecot.conf. Index corruption doesn't seem to occur anymore.
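For reference, the workaround described above amounts to a single setting. This is only a sketch of the relevant fragment of dovecot.conf for Dovecot 1.2.x; the comment is an explanatory addition, and the rest of the configuration is assumed to be unchanged.

```
# dovecot.conf (Dovecot 1.2.x)
# Do not use mmap() for index files; plain read()/write() is used instead.
mmap_disable = yes
```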
Is this normal? I thought this problem occurred only on NFS filesystems and possibly on old versions of ZFS. Hasn't this been fixed?
Is there an option in ZFS that would allow mmap calls without corruption? Does it have something to do with compression?
The options of the ZFS filesystem are:
zhome/username  type                  filesystem             -
zhome/username  creation              Thu Mar 18 11:10 2010  -
zhome/username  used                  750M                   -
zhome/username  available             61.0G                  -
zhome/username  referenced            750M                   -
zhome/username  compressratio         1.20x                  -
zhome/username  mounted               yes                    -
zhome/username  quota                 none                   default
zhome/username  reservation           none                   default
zhome/username  recordsize            128K                   default
zhome/username  mountpoint            /usr/home/username     inherited from zhome
zhome/username  sharenfs              off                    default
zhome/username  checksum              on                     default
zhome/username  compression           lzjb                   inherited from zhome
zhome/username  atime                 off                    inherited from zhome
zhome/username  devices               on                     default
zhome/username  exec                  on                     default
zhome/username  setuid                on                     default
zhome/username  readonly              off                    default
zhome/username  jailed                off                    default
zhome/username  snapdir               hidden                 default
zhome/username  aclmode               groupmask              default
zhome/username  aclinherit            restricted             default
zhome/username  canmount              on                     default
zhome/username  shareiscsi            off                    default
zhome/username  xattr                 off                    temporary
zhome/username  copies                1                      default
zhome/username  version               3                      -
zhome/username  utf8only              off                    -
zhome/username  normalization         none                   -
zhome/username  casesensitivity       sensitive              -
zhome/username  vscan                 off                    default
zhome/username  nbmand                off                    default
zhome/username  sharesmb              off                    default
zhome/username  refquota              none                   default
zhome/username  refreservation        none                   default
zhome/username  primarycache          all                    default
zhome/username  secondarycache        all                    default
zhome/username  usedbysnapshots       0                      -
zhome/username  usedbydataset         750M                   -
zhome/username  usedbychildren        0                      -
zhome/username  usedbyrefreservation  0                      -
The other problem, which I have been unable to solve so far, is that a lot of entries like this show up in my logs:
dovecot: imap-login: net_disconnect() failed: Connection reset by peer
It doesn't seem to have any adverse effect, except that it's filling up the logs. It seems to appear when a client closes the IMAP connection? Is there a way to suppress this error?
Regards,
P.C.
On ti, 2010-06-08 at 14:20 +0200, Philippe Chevalier wrote:
dovecot: IMAP(<user>@domain.org): Corrupted transaction log file /home/<user>/Mail/Maildir/INBOX/dovecot.index.log seq 13: record size too small (type=0x0, offset=5560, size=0) (sync_offset=5652)
..
After some digging, I "solved" this problem with mmap_disable = yes in dovecot.conf. Index corruption doesn't seem to occur anymore.
Is this normal? I thought this problem occurred only on NFS filesystems and possibly on old versions of ZFS. Hasn't this been fixed?
Apparently it doesn't work perfectly..
Is there an option in ZFS that would allow mmap calls without corruption? Does it have something to do with compression?
I've no idea about ZFS.
Other problem, that I have been unable to solve so far, is that a lot of entries show up in my logs about :
dovecot: imap-login: net_disconnect() failed: Connection reset by peer
This means close() failed with:
[ECONNRESET] The underlying object was a stream socket that was shut down by the peer before all pending data was delivered.
This is the first time I've heard of this happening.. I see this shows up the first time in FreeBSD 6.3 man pages. Hmm. I don't like it. I guess I could work around it, but I think I'll first go complain about it to FreeBSD people.
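As a rough illustration of the kind of workaround mentioned above, a caller could treat ECONNRESET from close() as an ordinary disconnect rather than an error worth logging. This is a minimal sketch and not the actual Dovecot change; the helper names `close_errno_is_peer_reset` and `close_ignoring_peer_reset` are hypothetical.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper (not Dovecot code): returns 1 when an errno value
 * reported by close() only says that the peer reset the connection. */
static int close_errno_is_peer_reset(int err)
{
    return err == ECONNRESET;
}

/* Hypothetical helper: close fd and treat ECONNRESET as success.
 * Returns 0 on success or peer reset, -1 on any other error
 * (errno is left as set by close()). */
static int close_ignoring_peer_reset(int fd)
{
    if (close(fd) == 0)
        return 0;
    if (close_errno_is_peer_reset(errno))
        return 0; /* the client went away first; nothing useful to report */
    return -1;
}
```

A caller would then log only when `close_ignoring_peer_reset()` returns -1, so real problems such as EBADF still show up in the logs.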
On 06/08/2010 02:41 PM, Timo Sirainen wrote:
On ti, 2010-06-08 at 14:20 +0200, Philippe Chevalier wrote:
dovecot: IMAP(<user>@domain.org): Corrupted transaction log file /home/<user>/Mail/Maildir/INBOX/dovecot.index.log seq 13: record size too small (type=0x0, offset=5560, size=0) (sync_offset=5652)
..
After some digging, I "solved" this problem with mmap_disable = yes in dovecot.conf. Index corruption doesn't seem to occur anymore.
Is this normal? I thought this problem occurred only on NFS filesystems and possibly on old versions of ZFS. Hasn't this been fixed?
Apparently it doesn't work perfectly..
Is there an option in ZFS that would allow mmap calls without corruption? Does it have something to do with compression?
I've no idea about ZFS.
Other problem, that I have been unable to solve so far, is that a lot of entries show up in my logs about :
dovecot: imap-login: net_disconnect() failed: Connection reset by peer
This means close() failed with:
[ECONNRESET] The underlying object was a stream socket that was shut down by the peer before all pending data was delivered.
This is the first time I've heard of this happening.. I see this shows up the first time in FreeBSD 6.3 man pages. Hmm. I don't like it. I guess I could work around it, but I think I'll first go complain about it to FreeBSD people.
I get the same error messages on FreeBSD 7.2 (many of them):
Jun 08 15:01:24 IMAP(xxxxxxxx): Error: close(client out) failed: Connection reset by peer
On Jun 8, 2010, at 9:10 AM, Frank Bonnet wrote:
On 06/08/2010 02:41 PM, Timo Sirainen wrote:
On ti, 2010-06-08 at 14:20 +0200, Philippe Chevalier wrote:
dovecot: IMAP(<user>@domain.org): Corrupted transaction log file /home/<user>/Mail/Maildir/INBOX/dovecot.index.log seq 13: record size too small (type=0x0, offset=5560, size=0) (sync_offset=5652)
..
After some digging, I "solved" this problem with mmap_disable = yes in dovecot.conf. Index corruption doesn't seem to occur anymore.
Is this normal? I thought this problem occurred only on NFS filesystems and possibly on old versions of ZFS. Hasn't this been fixed?
Apparently it doesn't work perfectly..
I quit using mmap_disable around 7.1-STABLE and haven't had that bug since then. I'm running 8.0-R with Maildirs in a compressed ZFS dataset right now with no problems. That's pretty odd...I'm pretty sure it was in the implementation and had nothing to do with the ZFS version but I assume your datasets and pools are all updated to the latest version?
Is there an option in ZFS that would allow mmap calls without corruption? Does it have something to do with compression?
I've no idea about ZFS.
You should check it out, it's rad!
Other problem, that I have been unable to solve so far, is that a lot of entries show up in my logs about :
dovecot: imap-login: net_disconnect() failed: Connection reset by peer
This means close() failed with:
[ECONNRESET] The underlying object was a stream socket that was shut down by the peer before all pending data was delivered.
This is the first time I've heard of this happening.. I see this shows up the first time in FreeBSD 6.3 man pages. Hmm. I don't like it. I guess I could work around it, but I think I'll first go complain about it to FreeBSD people.
I get the same error messages on FreeBSD 7.2 (many of them):
Jun 08 15:01:24 IMAP(xxxxxxxx): Error: close(client out) failed: Connection reset by peer
I've seen this a FEW times, like 3 in the last six months. It seems to have gone away after updating to 1.2, though maybe I just haven't triggered it again.
On Wed, Jun 16, 2010 at 12:23:16PM -0400, Dillon Kass wrote:
I quit using mmap_disable around 7.1-STABLE and haven't had that bug since then. I'm running 8.0-R with Maildirs in a compressed ZFS dataset right now with no problems. That's pretty odd...I'm pretty sure it was in the implementation and had nothing to do with the ZFS version but I assume your datasets and pools are all updated to the latest version?
# uname -v
FreeBSD 8.0-STABLE #2: Wed May 12 21:13:40 CEST 2010
# zfs upgrade
This system is currently running ZFS filesystem version 3.
All filesystems are formatted with the current version.
# zpool upgrade
This system is currently running ZFS pool version 14.
All pools are formatted using this version.
I can't really be more up to date than this...
Only thing is that Maildirs are all on different datasets, since every user has his own set.
mmap_disable made the problem completely go away.
Jun 08 15:01:24 IMAP(xxxxxxxx): Error: close(client out) failed: Connection reset by peer
I've seen this a FEW times, like 3 in the last six months. It seems to have gone away after updating to 1.2, though maybe I just haven't triggered it again.
I have one around every 5 minutes.
Jun 17 13:28:33 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
Jun 17 13:38:33 xxxx last message repeated 3 times
Jun 17 13:39:42 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
Jun 17 13:55:33 xxxx last message repeated 2 times
Jun 17 14:19:42 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
Jun 17 14:25:42 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
Jun 17 14:42:33 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
Jun 17 14:48:33 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
I guess it occurs when users are polling their mailbox, and maybe only with specific clients?
I have no clue.
P.C.
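To put a number on how often the message really occurs, syslog's "last message repeated N times" lines have to be counted as well. The following is a minimal sketch of such a tally, assuming the log lines are already held in memory as strings; the function name `count_logged` and its interface are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Count how many times `needle` was logged in syslog-style `lines`.
 * A "last message repeated N times" line adds N when the most recent
 * ordinary line matched `needle`. */
static int count_logged(const char *const *lines, size_t n, const char *needle)
{
    int total = 0;
    int prev_matched = 0;

    for (size_t i = 0; i < n; i++) {
        const char *rep = strstr(lines[i], "last message repeated ");
        int times = 0;

        if (rep != NULL &&
            sscanf(rep, "last message repeated %d times", &times) == 1) {
            if (prev_matched)
                total += times;
            /* prev_matched is left alone: repeats refer to the same message */
        } else {
            prev_matched = strstr(lines[i], needle) != NULL;
            if (prev_matched)
                total++;
        }
    }
    return total;
}
```

Applied to the excerpt above with the needle "net_disconnect() failed", this gives 11 events between 13:28 and 14:48.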
On Thu, 2010-06-17 at 14:55 +0200, Philippe Chevalier wrote:
Jun 08 15:01:24 IMAP(xxxxxxxx): Error: close(client out) failed: Connection reset by peer
I've seen this a FEW times, like 3 in the last six months. It seems to have gone away after updating to 1.2, though maybe I just haven't triggered it again.
I have one around every 5 minutes.
Jun 17 13:28:33 xxxx dovecot: imap-login: net_disconnect() failed: Connection reset by peer
Jun 17 13:38:33 xxxx last message repeated 3 times
Here are fixes:
http://hg.dovecot.org/dovecot-2.0/rev/c24ee1ebb159
http://hg.dovecot.org/dovecot-2.0/rev/b2ffb6846973
participants (5)
- Dillon Kass
- Frank Bonnet
- Philippe Chevalier
- Philippe Chevalier
- Timo Sirainen