[Dovecot] NFS issues
Adam McDougall
mcdouga9 at egr.msu.edu
Thu Mar 25 17:32:45 EET 2010
On 03/25/10 06:58, Brian Candler wrote:
> On Tue, Mar 23, 2010 at 03:19:49PM +0200, Timo Sirainen wrote:
>>> I have done some small-scale testing and it looks fine.
>>
>> Stress testing by running imaptest against the same user's same mailbox from 2+ different servers (i.e. two NFS clients reading/writing the same mailbox files) should quickly show what kind of errors you could get. http://imapwiki.org/ImapTest
>
> OK, I've now set this up:
>
> ImapTest ---> dovecot (same host) -----> NFS server
>         `---> dovecot (diff host) ----'
>
> * 172.16.23.104: dovecot 1.2.11 and ImapTest-latest. FreeBSD 7.2.
> * 172.16.23.101: dovecot 1.2.11 only. FreeBSD 7.2.
> * 172.16.23.103: NFS server. Ubuntu Karmic.
>
> All three hosts are ntpd synced.
>
> The following was needed on the FreeBSD boxes to get fcntl locking working:
>
> nfs_client_enable="YES"
> rpc_lockd_enable="YES"
> rpc_statd_enable="YES"
>
> (imapd worked without these, but maillog showed errors about failing to
> obtain locks, "operation not supported")
>
> Test results
> ------------
>
> * Pointing a single instance of imaptest at a single host, or two instances
> of imaptest at the same host (with clients=5 to avoid hitting the 15 client
> limit) was fine. ImapTest reported no errors, and nothing out of the ordinary
> in maillog.
>
> $ egrep -v "Login:|Disconnected:|Aborted login" /var/log/maillog
>
> * Things went badly wrong with two instances of imaptest pointing at
> different dovecot hosts. I had seen this sort of thing when I was previously
> using dot locking, and was hoping it would be fixed by switching to
> fcntl, but unfortunately not.
>
> ImapTest reported errors including:
>
> Error: brian at dev.example.com[8]: SELECT failed: 8.3 NO [SERVERBUG] Internal error occurred. Refer to server log for more information. [2010-03-25 10:22:23]
> - 6 stalled for 16 secs in command: 11 EXPUNGE
>
> All sorts of errors reported in maillog, including:
>
> Mar 25 10:22:23 freebsd-dev dovecot: IMAP(brian at dev.example.com): fscking index file /mail/0/6/37/30/brian%dev.example.com/dovecot.index
> Mar 25 10:22:23 freebsd-dev dovecot: IMAP(brian at dev.example.com): Transaction log /mail/0/6/37/30/brian%dev.example.com/dovecot.index.log: duplicate transaction log sequence (10)
> Mar 25 10:22:23 freebsd-dev dovecot: IMAP(brian at dev.example.com): Our dotlock file /mail/0/6/37/30/brian%dev.example.com/dovecot-uidlist.lock was overridden (locked 0 secs ago, touched 0 secs ago)
> Mar 25 10:22:23 freebsd-dev dovecot: IMAP(brian at dev.example.com): fscking index file /mail/0/6/37/30/brian%dev.example.com/dovecot.index
> Mar 25 10:22:23 freebsd-dev dovecot: IMAP(brian at dev.example.com): Transaction log /mail/0/6/37/30/brian%dev.example.com/dovecot.index.log: duplicate transaction log sequence (11)
> Mar 25 10:22:27 freebsd-dev dovecot: IMAP(brian at dev.example.com): /mail/0/6/37/30/brian%dev.example.com/dovecot.index reset, view is now inconsistent
> Mar 25 10:22:46 freebsd-dev dovecot: IMAP(brian at dev.example.com): Panic: file mail-transaction-log-view.c: line 108 (mail_transaction_log_view_set): assertion failed: (min_file_seq <= max_file_seq)
> Mar 25 10:22:48 freebsd-dev dovecot: IMAP(brian at dev.example.com): rename(/mail/0/6/37/30/brian%dev.example.com/dovecot-uidlist.tmp, /mail/0/6/37/30/brian%dev.example.com/dovecot-uidlist) failed: No such file or directory
> Mar 25 10:22:48 freebsd-dev dovecot: IMAP(brian at dev.example.com): unlink(/mail/0/6/37/30/brian%dev.example.com/dovecot-uidlist.tmp) failed: No such file or directory
>
> Mar 25 10:22:36 wipe-dev dovecot: IMAP(brian at dev.example.com): ftruncate(/mail/0/6/37/30/brian%dev.example.com/dovecot-uidlist.lock) failed: Stale NFS file handle
>
> (Logs from a single test run are attached)
>
> Interestingly, these messages imply that dovecot is still using dotlocking
> in some circumstances, even though I've definitely set fcntl locking.
>
> $ grep ^lock /usr/local/etc/dovecot.conf
> lock_method = fcntl
>
> $ egrep '^mail_nfs|^mmap' /usr/local/etc/dovecot.conf
> mmap_disable = yes
> mail_nfs_storage = yes
> mail_nfs_index = yes
>
> All this suggests I should use some sort of 'sticky' load balancing in front
> so that all client conns from one IP hit the same frontend box. However,
> that contradicts the experience Adam McDougall has had with a similar setup:
>
> http://dovecot.org/list/dovecot/2010-March/047815.html
>
> It's possible that switching the Linux NFS server to a Netapp will help
> (which is what it will be deployed onto eventually anyway)
>
> Adam: did you do any tuning of FreeBSD client NFS settings? And have you
> tried using ImapTest, or just real IMAP users?
>
> I see there are a few tunables:
>
> $ grep nfs /etc/defaults/rc.conf
> netfs_types="nfs:NFS nfs4:NFS4 smbfs:SMB portalfs:PORTAL nwfs:NWFS" # Net filesystems.
> nfs_client_enable="NO" # This host is an NFS client (or NO).
> nfs_access_cache="60" # Client cache timeout in seconds
> nfs_server_enable="NO" # This host is an NFS server (or NO).
> nfs_server_flags="-u -t -n 4" # Flags to nfsd (if enabled).
> nfs_reserved_port_only="NO" # Provide NFS only on secure port (or NO).
> nfs_bufpackets="" # bufspace (in packets) for client
>
> I have tried rerunning with
> sysctl vfs.nfs.access_cache_timeout=0
> but saw the same problems.
>
> Maybe the load pattern from 'real' IMAP clients is such that these problems
> generally don't show in practice? (i.e. it would be unusual for a single
> IMAP client to make simultaneous changes to the same folder via different
> TCP connections)
>
> Regards,
>
> Brian.
I use:

rc.conf:
  nfs_client_enable="YES" # This host is an NFS client (or NO).
  rpc_lockd_enable="YES"
  rpc_statd_enable="YES"

/etc/fstab:
  nfsserver:/vol/mail /egr/mail nfs rw,bg,tcp,nosuid 0 0

dovecot.conf: (some other things that helped in general, not necessarily
locking related)
  login_max_processes_count: 512
  max_mail_processes: 1024
  mail_max_userip_connections: 25
  mail_location: maildir:%h/Maildir:CONTROL=%h/Maildir/dovecot/private/control:INDEX=%h/Maildir/dovecot/private/indexes
  mmap_disable: yes
  mail_nfs_storage: yes
  mail_nfs_index: yes
  mail_process_size: 1024
  mail_log_max_lines_per_sec: 0
  auth default:
    worker_max_request_count: 500
  # internal note: lock_method is "always dotlock for maildir" according to dovecot author
  #lock_method = fcntl
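
A quick way to confirm that the lockd/statd settings give working fcntl
locks on the mount is to take a lock on a scratch file there. The following
is a minimal Python sketch, not anything from dovecot itself; the function
name probe_fcntl_lock is made up for illustration. It only shows whether this
one client can obtain a lock at all (the "operation not supported" case). It
does not prove that two NFS clients actually exclude each other.

```python
import fcntl
import os
import tempfile


def probe_fcntl_lock(directory):
    """Take and release an exclusive fcntl lock on a scratch file.

    Returns (True, "") if the lock was obtained, or (False, reason) if the
    lock call failed. The scratch file is removed in either case.
    """
    fd, path = tempfile.mkstemp(prefix="lockprobe.", dir=directory)
    try:
        try:
            # Non-blocking exclusive POSIX lock covering the whole file.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            reason = os.strerror(exc.errno) if exc.errno else str(exc)
            return False, reason
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, ""
    finally:
        os.close(fd)
        os.unlink(path)
```

For example, probe_fcntl_lock("/egr/mail") would check the mount from the
fstab line above, assuming the calling user can create files there.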

I have played with the access cache, but nothing came of it, so I leave it
untuned.

I have not tried imaptest against my servers; I just let them run with real
clients. As long as I am not messing around with the back-end files in bad
ways, I don't really see the errors you turned up. I saw plenty of them in
earlier versions of dovecot, before there was code to flush the FreeBSD
access cache well enough. I believe I remember Timo saying that timestamps
on the NetApp NFS server are much finer-grained than on some other NFS
servers, which could be helping me out.
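
To compare runs (for example the Linux NFS server against the NetApp), it may
help to tally the maillog errors by type rather than reading them by eye. This
is a rough Python sketch under some assumptions: the match strings are copied
from the log lines quoted in this thread, the noise filter mirrors the
egrep -v command above, and the category names and the tally function are
labels invented here, not dovecot terminology.

```python
import re
from collections import Counter

# Match strings come from the maillog excerpts quoted in this thread;
# the category names on the left are made up for this sketch.
PATTERNS = [
    ("index_fsck", re.compile(r"fscking index file")),
    ("dup_log_seq", re.compile(r"duplicate transaction log sequence")),
    ("dotlock_overridden", re.compile(r"dotlock file .* was overridden")),
    ("index_reset", re.compile(r"reset, view is now inconsistent")),
    ("panic", re.compile(r"Panic:")),
    ("stale_nfs", re.compile(r"Stale NFS file handle")),
    ("enoent", re.compile(r"failed: No such file or directory")),
]

# Same routine lines that the egrep -v command filters out.
NOISE = re.compile(r"Login:|Disconnected:|Aborted login")


def tally(lines):
    """Count log lines per error category, skipping blanks and login noise."""
    counts = Counter()
    for line in lines:
        if not line.strip() or NOISE.search(line):
            continue
        for name, pattern in PATTERNS:
            if pattern.search(line):
                counts[name] += 1
                break
        else:
            # Anything unrecognised is kept visible instead of dropped.
            counts["other"] += 1
    return counts
```

Usage would be something like tally(open("/var/log/maillog")), run once per
test and compared side by side.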