[Dovecot] Unlock non existent locks
Hello Timo,
I'm running
a single instance of dovecot-2.1.15 on a single host running 8.3-RELEASE-p3 FreeBSD amd64 mailboxes (Maildir), control files and indexes are on NFS (v3,tcp)
mail_nfs_storage = yes
lock_method = fcntl
[didn't touch the following]
# Mail index files also exist in NFS. Setting this to yes requires
# mmap_disable=yes and fsync_disable=no.
mail_nfs_index = yes
served by an Isilon s200 node (OneFS 6.5.5.22) procmail delivers in the same location through postfix-2.8.7
The filer shows *a lot* of such messages :
2013-08-02T14:12:29+02:00 <0.5> XXXX-10(id10) /boot/kernel.amd64/kernel: [lkf_delegate.c:2752](pid 46390="kt: dwt3")(tid=101282) dev_local_lkf_unlock(): no lock entry present to unlock for resource: 1:19d5:fdbe ;client: 0xa51cc3f444107
Corresponding file may be a message, a maildir, an index, ...
I can experience the same message with a simple fcntl C program which tries to unlock an NFS file without prior locking of it.
However, the problem occurs only on a FreeBSD client (I tried the old nfs client and the new (mount_newnfs), not on a 2.6.32-358.11.1.el6.x86_64 CentOS release 6.4 (Final) Linux release.
So my guess is that dovecot has some safety mechanism which tries to unlock locked files, maybe after some timeout and that, in the case of a an already unlocked file, the FreeBSD client sends the unlock RPC request to the server anyway whereas the linux client does not, noticing there isn't anything to unlock.
Can you help me explaining such a behavior ? Are those "unlock a file with no prior lock" made on purpose or is it a bug ? Would it be an application or a RPC bug ?
Can you think of another reason ?
Thanks
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On Fri, 2013-08-02 at 14:30 +0200, Thomas Hummel wrote:
Hello Timo,
I'm running
a single instance of dovecot-2.1.15 on a single host running 8.3-RELEASE-p3 FreeBSD amd64 mailboxes (Maildir), control files and indexes are on NFS (v3,tcp)
mail_nfs_storage = yes lock_method = fcntl [didn't touch the following] # Mail index files also exist in NFS. Setting this to yes requires # mmap_disable=yes and fsync_disable=no. mail_nfs_index = yes
Since you have only one Dovecot accessing the NFS, you don't need either mail_nfs_storage=yes or mail_nfs_index=yes. My guess is that by setting those to "no", you'll also solve this:
2013-08-02T14:12:29+02:00 <0.5> XXXX-10(id10) /boot/kernel.amd64/kernel: [lkf_delegate.c:2752](pid 46390="kt: dwt3")(tid=101282) dev_local_lkf_unlock(): no lock entry present to unlock for resource: 1:19d5:fdbe ;client: 0xa51cc3f444107
On Fri, Aug 02, 2013 at 03:38:47PM +0300, Timo Sirainen wrote:
Since you have only one Dovecot accessing the NFS, you don't need either mail_nfs_storage=yes or mail_nfs_index=yes. My guess is that by setting those to "no", you'll also solve this:
2013-08-02T14:12:29+02:00 <0.5> XXXX-10(id10) /boot/kernel.amd64/kernel: [lkf_delegate.c:2752](pid 46390="kt: dwt3")(tid=101282) dev_local_lkf_unlock(): no lock entry present to unlock for resource: 1:19d5:fdbe ;client: 0xa51cc3f444107
Thanks, I'll try that but that wouldn't be a good solution for when I'd want to use more dovecot servers anyway.
My tests using a simple fcntl() C program (unlock a non-locked file) is giving me a hard time figuring out what conditions cause the file server to log this message :
On some Isilon node/destination IP combinations, I'd have the message, on other I wouldn't.
Still, it seems I cannot reproduce the problem on any other hosts (mounting this server) than the one running the dovecot server. Oddly enough, my fcntl test causes this message even with dovecot stopped, statd and lockd restarted and the filesystem un-and-re-mounted while the same setup (up to date via FreeBSD update, i.e. same base, same nfs client) would not make the server log the message.
Anyway, the problem seems harmless. But is it legit that dovecot try to unlock non (or no more) locked files as it seems ?
Thanks for your help.
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 5.8.2013, at 19.34, Thomas Hummel <hummel@pasteur.fr> wrote:
On Fri, Aug 02, 2013 at 03:38:47PM +0300, Timo Sirainen wrote:
Since you have only one Dovecot accessing the NFS, you don't need either mail_nfs_storage=yes or mail_nfs_index=yes. My guess is that by setting those to "no", you'll also solve this:
2013-08-02T14:12:29+02:00 <0.5> XXXX-10(id10) /boot/kernel.amd64/kernel: [lkf_delegate.c:2752](pid 46390="kt: dwt3")(tid=101282) dev_local_lkf_unlock(): no lock entry present to unlock for resource: 1:19d5:fdbe ;client: 0xa51cc3f444107
Thanks, I'll try that but that wouldn't be a good solution for when I'd want to use more dovecot servers anyway.
mail_nfs_*=yes wouldn't be a good solution even when you add more servers! Director is the only safe way to do it, and mail_nfs_*=yes isn't required with director.
Anyway, the problem seems harmless. But is it legit that dovecot try to unlock non (or no more) locked files as it seems ?
The NFS workarounds code is doing some ugly stuff. I thought it would have, but looking at the code it doesn't seem so. But still easier to debug if you first see if the problem is with the NFS workarounds or the lib-index code. With lib-index you could also use lock_method=dotlock to see if that works better (although performance will be slightly worse also then).
On Mon, Aug 05, 2013 at 07:38:59PM +0300, Timo Sirainen wrote:
The NFS workarounds code is doing some ugly stuff. I thought it would have, but looking at the code it doesn't seem so. But still easier to debug if you first see if the problem is with the NFS workarounds or the lib-index code. With lib-index you could also use lock_method=dotlock to see if that works better (although performance will be slightly worse also then).
I just tested mail_nfs_storage and mail_nfs_index both set to no : log messages are still here. They only stop when I stop dovecot.
Note :
The nfs mount is made through a private network (192.168.3.x). My simple fcntl unlock test, when the isilon mount is made through the same private network behave as dovecot (i.e. make the node log warning about non-existent ressource to unlock), but not when the mount is made through the "public" network (157.99.x.x). Even when dovecot is down and statd/lockd restarted.
I thought about some reverse name lookups but I don't think this is the problem.
Something very strange is happening, not matter what nfs workarounds dovecot is making. But it seems to involve this host and only this host...
-- Thomas Hummel | Institut Pasteur <hummel@pasteur.fr> | Groupe Exploitation et Infrastructure
On 6.8.2013, at 19.44, Thomas Hummel <hummel@pasteur.fr> wrote:
On Mon, Aug 05, 2013 at 07:38:59PM +0300, Timo Sirainen wrote:
The NFS workarounds code is doing some ugly stuff. I thought it would have, but looking at the code it doesn't seem so. But still easier to debug if you first see if the problem is with the NFS workarounds or the lib-index code. With lib-index you could also use lock_method=dotlock to see if that works better (although performance will be slightly worse also then).
I just tested mail_nfs_storage and mail_nfs_index both set to no : log messages are still here. They only stop when I stop dovecot.
What about with lock_method=dotlock? After that there should be no NFS locking in Dovecot at all.
Anyway, I don't think Dovecot is trying to unlock already unlocked files. The locking APIs that Dovecot uses makes that rather difficult to do.
participants (2)
-
Thomas Hummel
-
Timo Sirainen