This patch fixed the subscriptions file problem, but there is still a similar issue with some of the index files, e.g. dovecot.index.cache. I suppose I could put the indices on local storage (I really wanted to keep everything about a user together in their account), but I have to wonder whether there are other places where the problem might arise again. Are draft messages manipulated in a similar manner?

Down the road I'd be moving to a newer server (my test on a Solaris 10 x86 has hit an assert snag), but right now that's not an option, nor is an O/S upgrade (there's a reason these servers have been up rather more than 2 years...).

Apparently none of this is an issue on Solaris 9 (sparc) Generic_118558-34 (as reported by Alex Moore), whereas my earlier version is Generic_117171-08, so the kernel fix lies somewhere between.

Thanks,
John Harper
-------------------------------------------------
Senior Systems Administrator
Information and Instructional Technology Services
University of Toronto Scarborough
harper@utsc.utoronto.ca

On Sat, Dec 16, 2006 at 03:31:25AM +0200, Timo Sirainen wrote:
On Fri, 2006-12-15 at 13:08 -0500, John Harper wrote:
I am using dovecot 1.0.rc15 (a similar problem occurred in rc10) on Solaris 9 (sparc). When working with a user whose home dir is on a local disk everything seems fine. But when that home is on an NFS-mounted disk things are very badly awry.
Both the indices and the subscriptions file are being destroyed and what is left behind are files with names of the form .nfs72C034 etc.
I've heard of this before. I think there's a Solaris kernel patch to fix this, but I'm not sure. If you find it, please add a note about it to http://wiki.dovecot.org/NFS
Anyway if I remember correctly, the problem went like this:
1. Dovecot creates a temp.1234 file, link()s it into subscriptions.lock file and unlink()s temp.1234.
2. The subscriptions are written to the lock file, and then Dovecot does rename(subscriptions.lock, subscriptions)
3. The file is close()d
By closing the file before renaming, the problem went away. I think most of these problems could be fixed with a simple patch, but I didn't want to do that change because it probably still breaks with other less obvious things, so it's better to be fully broken.
You could anyway try if this mostly-fixes it:
Index: src/lib/file-dotlock.c
===================================================================
RCS file: /var/lib/cvs/dovecot/src/lib/file-dotlock.c,v
retrieving revision 1.35.2.3
diff -u -r1.35.2.3 file-dotlock.c
--- src/lib/file-dotlock.c	8 Jun 2006 16:13:46 -0000	1.35.2.3
+++ src/lib/file-dotlock.c	16 Dec 2006 01:30:06 -0000
@@ -664,6 +664,12 @@
 		}
 	}
 
+	if (dotlock->fd != -1) {
+		if (close(dotlock->fd) < 0)
+			i_error("close(%s) failed: %m", dotlock->path);
+		dotlock->fd = -1;
+	}
+
 	if (rename(lock_path, dotlock->path) < 0) {
 		i_error("rename(%s, %s) failed: %m", lock_path, dotlock->path);
 		file_dotlock_free(dotlock);