[Dovecot] NFS

Miquel van Smoorenburg miquels at cistron.nl
Wed Apr 26 14:26:31 EEST 2006


On Tue, 2006-04-25 at 17:33 +0300, Timo Sirainen wrote:
> I've finally managed to run Dovecot without errors in two computers with
> maildir and indexes stored in NFS. I added a page to wiki about this:
> 
> http://wiki.dovecot.org/NFS
> 
> Suggestions how to keep attribute cache enabled but to allow Dovecot to
> specifically request not to use a cached value (when it's important)
> would be welcome.

Timo, you use Debian - if you have liblockfile-dev installed, the
manpage of lockfile-create(3) has a section called "REMOTE FILE SYSTEMS
AND THE KERNEL ATTRIBUTE CACHE".

See the part on chmod() there on how to invalidate the attribute cache
for one file on Linux.

Manpage is attached.

Mike.
-------------- next part --------------
LOCKFILE_CREATE(3)	   Linux Programmer's Manual	    LOCKFILE_CREATE(3)



NAME
       lockfile_create, lockfile_remove, lockfile_touch, lockfile_check - man-
       age lockfiles

SYNOPSIS
       #include <lockfile.h>

       cc [ flag ... ] file ... -llockfile [ library ]

       int lockfile_create( const char *lockfile, int retrycnt, int flags );
       int lockfile_remove( const char *lockfile );
       int lockfile_touch( const char *lockfile );
       int lockfile_check( const char *lockfile, int flags  );

DESCRIPTION
       The lockfile_create function creates a lockfile in an NFS safe way.

       If flags is set to L_PID then lockfile_create will not only  check  for
       an  existing  lockfile, but it will read the contents as well to see if
       it contains a process id in ASCII. If so, the lockfile is only valid if
       that process still exists.
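
       The L_PID check described above can be pictured roughly as follows.
       This is a minimal sketch using only POSIX calls, not the library's
       actual code; write_pid_file() and lock_owner_alive() are hypothetical
       helper names introduced for illustration.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Write our own process id into the lockfile in ASCII.
 * Hypothetical helper, not part of liblockfile. Returns 0 or -1. */
int write_pid_file(const char *path)
{
    FILE *fp = fopen(path, "w");

    if (fp == NULL)
        return -1;
    if (fprintf(fp, "%ld\n", (long)getpid()) < 0) {
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}

/* Return 1 if the lockfile names a process that still exists on this
 * host, 0 otherwise (unreadable file, no pid in it, or dead process). */
int lock_owner_alive(const char *path)
{
    FILE *fp = fopen(path, "r");
    long pid = 0;
    int n;

    if (fp == NULL)
        return 0;
    n = fscanf(fp, "%ld", &pid);
    fclose(fp);
    if (n != 1 || pid <= 0)
        return 0;

    /* Signal 0 sends nothing; it only performs the existence check. */
    if (kill((pid_t)pid, 0) == 0)
        return 1;
    /* EPERM: the process exists but belongs to another user. */
    return errno == EPERM;
}
```

       A check like this only makes sense when every process that can hold
       the lock runs on the same host.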

       If  the	lockfile is on a shared filesystem, it might have been created
       by a process on a remote host. Thus the process-id checking is  useless
       and  the  L_PID	flag should not be set. In this case, there is no good
       way to see if a lockfile is stale. Therefore if the lockfile  is  older
       than  5  minutes,  it  will  be removed. That is why the lockfile_touch
       function is provided: while holding the lock, it needs to be  refreshed
       regularly (every minute or so) by calling lockfile_touch().

       The  lockfile_check  function  checks  if  a  valid lockfile is already
       present without trying to create a new lockfile.

       Finally the lockfile_remove function removes the lockfile.
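
       The age-based staleness rule and the refresh can be sketched with
       plain POSIX calls. This is an illustration under assumptions, not
       the library's implementation: the helper names and the
       STALE_AFTER_SECONDS constant are hypothetical, and the sketch
       compares the file's modification time against a time supplied by
       the caller.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>
#include <utime.h>

/* The manpage says a lockfile older than 5 minutes is removed. */
#define STALE_AFTER_SECONDS (5 * 60)

/* Return 1 if the lockfile is older than the limit, 0 if it is fresh,
 * -1 if it cannot be stat()ed (for example because it does not exist). */
int lock_is_stale(const char *path, time_t now)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (now - st.st_mtime) > STALE_AFTER_SECONDS;
}

/* Refresh the lockfile by setting its timestamps to the current time. */
int lock_refresh(const char *path)
{
    return utime(path, NULL);
}

/* Remove the lockfile if it is stale. Returns 1 if it was removed,
 * 0 if it is still fresh, -1 on error. */
int lock_break_if_stale(const char *path, time_t now)
{
    int stale = lock_is_stale(path, now);

    if (stale != 1)
        return stale;
    fprintf(stderr, "removing stale lockfile %s\n", path);
    return unlink(path) == 0 ? 1 : -1;
}
```

       A lock holder would call something like lock_refresh() about once a
       minute, well inside the 5 minute limit.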


RETURN VALUES
       lockfile_create returns one of the following status codes:

	  #define L_SUCCESS   0    /* Lockfile created			   */
	  #define L_NAMELEN   1    /* Recipient name too long (> 13 chars) */
	  #define L_TMPLOCK   2    /* Error creating tmp lockfile	   */
	  #define L_TMPWRITE  3    /* Can't write pid into tmp lockfile   */
	  #define L_MAXTRYS   4    /* Failed after max. number of attempts */
	  #define L_ERROR     5    /* Unknown error; check errno	   */

       lockfile_check returns 0 if a valid lockfile is present. If no lockfile
       or no valid lockfile is present, -1 is returned.

       lockfile_touch  and  lockfile_remove return 0 on success. On failure -1
       is returned and errno is set appropriately. It is not an error to lock-
       file_remove() a non-existing lockfile.
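
       A caller will usually want to turn the status code into a message.
       The sketch below copies the constants from the list above so that
       it compiles on its own; real code would take them from
       <lockfile.h>. The function name lock_status_text() is hypothetical.

```c
/* Status codes as listed in this manpage. The guard keeps the sketch
 * compilable both with and without <lockfile.h>. */
#ifndef L_SUCCESS
#define L_SUCCESS   0
#define L_NAMELEN   1
#define L_TMPLOCK   2
#define L_TMPWRITE  3
#define L_MAXTRYS   4
#define L_ERROR     5
#endif

/* Map a lockfile_create() status code to a short description. */
const char *lock_status_text(int status)
{
    switch (status) {
    case L_SUCCESS:  return "lockfile created";
    case L_NAMELEN:  return "recipient name too long (> 13 chars)";
    case L_TMPLOCK:  return "error creating tmp lockfile";
    case L_TMPWRITE: return "can't write pid into tmp lockfile";
    case L_MAXTRYS:  return "failed after max. number of attempts";
    case L_ERROR:    return "unknown error; check errno";
    default:         return "unrecognised status code";
    }
}
```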


ALGORITHM
       The  algorithm that is used to create a lockfile in an atomic way, even
       over NFS, is as follows:

       1      A unique file is created. In printf format, the name of the file
              is .lk%05d%x%s. The first argument (%05d) is the current process
              id. The second argument (%x) consists of the 4 least significant
              bits of the value returned by time(2). The last argument is  the
              system hostname.


       2      Then the lockfile is created using link(2). The return value  of
	      link is ignored.


       3      Now  the	lockfile is stat()ed. If the stat fails, we go to step
	      6.


       4      The stat value of the lockfile is compared with that of the tem-
	      porary  file. If they are the same, we have the lock. The tempo-
	      rary file is deleted and a value of 0 (success) is  returned  to
	      the caller.


       5      A  check is made to see if the existing lockfile is a valid one.
	      If it isn't valid, the stale lockfile is deleted.


       6      Before retrying, we sleep for n seconds. n is initially  5  sec-
              onds,  but  after every retry 5 extra seconds are added, up to a
              maximum of 60 seconds (an incremental backoff). Then we go  back
              to step 2, at most retrycnt times in total.
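
       Steps 1 to 4 can be sketched as a single attempt in plain POSIX C.
       This is an illustration of the technique, not the library's source:
       try_link_lock() is a hypothetical name, the buffer sizes are
       arbitrary, and the stale-lock check and the retry loop with backoff
       (steps 5 and 6) are left out.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* One attempt at creating dir/name as a lockfile.
 * Returns 0 if we hold the lock, -1 otherwise. */
int try_link_lock(const char *dir, const char *name)
{
    char host[256], tmp[1024], lock[1024];
    struct stat st_tmp, st_lock;
    int fd, rc, got = -1;

    if (gethostname(host, sizeof host) != 0)
        return -1;
    host[sizeof host - 1] = '\0';

    /* Step 1: unique temporary file named from pid, the low 4 bits of
     * time() and the hostname. */
    if (snprintf(tmp, sizeof tmp, "%s/.lk%05d%x%s", dir, (int)getpid(),
                 (unsigned)(time(NULL) & 15), host) >= (int)sizeof tmp)
        return -1;
    if (snprintf(lock, sizeof lock, "%s/%s", dir, name) >= (int)sizeof lock)
        return -1;
    fd = open(tmp, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* Step 2: hard-link the temporary file to the lockfile name. The
     * result is ignored on purpose; the stat() comparison decides. */
    rc = link(tmp, lock);
    (void)rc;

    /* Steps 3 and 4: if both names refer to the same file (same device
     * and inode), the link was ours and we hold the lock. */
    if (stat(tmp, &st_tmp) == 0 && stat(lock, &st_lock) == 0 &&
        st_tmp.st_dev == st_lock.st_dev && st_tmp.st_ino == st_lock.st_ino)
        got = 0;

    unlink(tmp);
    return got;
}
```

       Comparing the stat results instead of trusting the return value of
       link() is the point of the algorithm; a commonly cited reason is
       that over NFS a reply can be lost and the retransmitted request may
       report failure even though the link was created.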



REMOTE FILE SYSTEMS AND THE KERNEL ATTRIBUTE CACHE
       These functions do not lock a file - they generate a lockfile.  However
       in a lot of cases, such	as  Unix  mailboxes,  all  concerned  programs
       accessing  the  mailboxes agree on the fact that the presence of <file-
       name>.lock means that <filename> is locked.

       If you are using lockfile_create to  create  a  lock  on  a  file  that
       resides	on  a  remote server, and you already have that file open, you
       need to flush the NFS attribute cache after locking. This is needed  to
       prevent the following scenario:

       o  open /var/mail/USERNAME
       o  attributes, such as size, inode, etc are now cached in the kernel!
       o  meanwhile, another remote system appends data to /var/mail/USERNAME
       o  grab lock using lockfile_create()
       o  seek to end of file
       o  write data

       Now  the  end of the file really isn't the end of the file - the kernel
       cached the attributes on open, and st_size is not the end of  the  file
       anymore.  So  after  locking  the  file, you need to tell the kernel to
       flush the NFS file attribute cache.

       The only portable way to do this is  the  POSIX	fcntl()  file  locking
       primitives - locking a file using fcntl() has the fortunate side-effect
       of invalidating the NFS file attribute cache of the kernel.

       lockfile_create() cannot do this for you for two reasons. One, it  just
       creates  a lockfile; it doesn't know which file you are actually trying
       to lock! Two, even if it could deduce the file you're locking from  the
       filename,  by  just  opening  and  closing  it, it would invalidate any
       existing POSIX locks the program might already have on that file  (yes,
       POSIX locking semantics are insane!).

       So basically what you need to do is something like this:

	 fd = open("/var/mail/USER", O_RDWR);
	 .. program code ..

	 lockfile_create("/var/mail/USER.lock", x, y);

	 /* Invalidate NFS attribute cache using POSIX locks */
	 if (lockf(fd, F_TLOCK, 0) == 0) lockf(fd, F_ULOCK, 0);

       You  have to be careful with this if you're putting this in an existing
       program that might already be using fcntl(), flock() or lockf() locking:
       you might invalidate existing locks.
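
       The same lock-then-unlock step can also be written with fcntl()
       directly. The helper below is a sketch with a hypothetical name,
       nfs_revalidate_by_lock(); it assumes fd was opened for writing,
       since a write lock needs that, and it locks the whole file rather
       than the region from the current offset.

```c
#include <fcntl.h>
#include <unistd.h>

/* Take and immediately release a whole-file write lock on fd, for the
 * side effect on the NFS attribute cache described above.
 * Returns 0 on success, -1 if the lock could not be taken or released.
 * Caution: releasing also drops any fcntl() locks this process already
 * holds on the same file. */
int nfs_revalidate_by_lock(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */

    if (fcntl(fd, F_SETLK, &fl) == -1)
        return -1;              /* somebody else holds a lock, or error */

    fl.l_type = F_UNLCK;
    return fcntl(fd, F_SETLK, &fl);
}
```

       It would be called right after lockfile_create() succeeds and
       before seeking to the end of the mailbox.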



       There  is  also	a non-portable way. A lot of NFS operations return the
       updated attributes - and the Linux kernel actually uses these to update
       the attribute cache. One of these operations is chmod(2).

       So stat()ing a file and then chmod()ing it to st.st_mode will not actu-
       ally change the file, nor will it interfere with any locks on the file,
       but  it will invalidate the attribute cache. The equivalent to use from
       a shell script would be

	 chmod u=u /var/mail/USER
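
       In C, the stat-then-chmod step might look like the sketch below.
       The function name nfs_revalidate_by_chmod() is hypothetical, and
       whether the call actually refreshes cached attributes depends on
       the client kernel, as noted above. The mode is masked to the
       permission bits so that the file type bits from st_mode are not
       passed to chmod(). The caller must own the file or be privileged.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Re-apply the file's current mode. The file is left unchanged; the
 * point is the attribute update that comes back with the reply.
 * Returns 0 on success, -1 on failure. */
int nfs_revalidate_by_chmod(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        perror(path);
        return -1;
    }
    return chmod(path, st.st_mode & 07777);
}
```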


PERMISSIONS
       If you are on a system that has a mail spool  directory	that  is  only
       writable  by a special group (usually "mail") you cannot create a lock-
       file directly in the mailspool directory without special permissions.

       lockfile_create and lockfile_remove  check  if  the  lockfile  ends  in
       $USERNAME.lock, and if the directory the lockfile is in is writable  by
       group "mail". If so, an external set group-id mail executable
       (dotlockfile(1)) is spawned to do the actual locking / unlocking.


FILES
       /usr/lib/liblockfile.so.1


AUTHOR
       Miquel van Smoorenburg <miquels at cistron.nl>


SEE ALSO
       dotlockfile(1), maillock(3), touchlock(3), mailunlock(3)



Linux Manpage			 04 June 2004		    LOCKFILE_CREATE(3)

