Re: [Dovecot] Help! Uidlist files are gone and won't come back; imap keeps coredumping!
Timo -
The only behavior change that I notice is that when I exit the telnet session the truss also exits. Previously, the truss hung around until I did the "pkill dovecot", after exiting the telnet session. So, that's weird. But, hold on! Now I do seem to have a uidlist file, again!
% ls -l
total 308
drwx--S---   2 rvvk40   lowv     98304 Jun 19 07:44 cur/
-rw-------   1 rvvk40   lowv       260 Jun 19 07:47 dovecot-keywords
-rw-------   1 rvvk40   lowv     65752 Jun 19 07:47 dovecot-uidlist
-rw-------   1 rvvk40   lowv     29560 Jun 11 08:42 dovecot.index
drwx--S---   2 rvvk40   lowv     32768 Jun 19 07:43 new/
-rw-------   1 rvvk40   lowv      1188 May  1 07:01 subscriptions
drwx--S---   2 rvvk40   lowv     61440 Jun 19 07:42 tmp/
Mario
Timo Sirainen wrote:
> Try if the attached patch changes anything.
------------------------------------------------------------------------
Index: src/lib/file-dotlock.c
===================================================================
RCS file: /var/lib/cvs/dovecot/src/lib/file-dotlock.c,v
retrieving revision 1.38
diff -u -r1.38 file-dotlock.c
--- src/lib/file-dotlock.c	8 Jun 2006 16:42:40 -0000	1.38
+++ src/lib/file-dotlock.c	19 Jun 2006 12:55:51 -0000
@@ -664,6 +664,12 @@
 		}
 	}
 
+	if (dotlock->fd != -1) {
+		if (close(dotlock->fd) < 0)
+			i_error("close(%s) failed: %m", dotlock->path);
+		dotlock->fd = -1;
+	}
+
 	if (rename(lock_path, dotlock->path) < 0) {
 		i_error("rename(%s, %s) failed: %m", lock_path, dotlock->path);
 		file_dotlock_free(dotlock);
--
I don't need a name; my number's just fine. | Mario.Nigrovic@freescale.com
It's nobody else's -- just mine, all mine.  | 480-413-3578
Internal Use Only
On Mon, 2006-06-19 at 08:03 -0700, Mario Nigrovic-rvvk40 wrote:
> Timo -
> The only behavior change that I notice is that when I exit the telnet session the truss also exits. Previously, the truss hung around until I did the "pkill dovecot", after exiting the telnet session. So, that's weird. But, hold on! Now I do seem to have a uidlist file, again!
I'd say there's something really wrong with your NFS. What that patch did was close the dovecot-uidlist file before rename()ing it. It shouldn't have been necessary. There are probably other places in Dovecot's code that also rename a file while it is still open, which is probably why the problem wasn't fixed entirely.
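The ordering the patch enforces (close the descriptor first, rename second) can be sketched outside Dovecot as follows. This is only an illustration under stated assumptions: close_then_rename() is a hypothetical helper and not a Dovecot function, and fprintf()/strerror() stand in for Dovecot's i_error() and its %m format.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of Dovecot): close *fd if it is open, and
 * only then rename tmp_path over final_path.
 * Returns 0 on success, -1 if rename() fails. */
int close_then_rename(int *fd, const char *tmp_path, const char *final_path)
{
	assert(fd != NULL && tmp_path != NULL && final_path != NULL);

	if (*fd != -1) {
		if (close(*fd) < 0)
			fprintf(stderr, "close(%s) failed: %s\n",
				tmp_path, strerror(errno));
		*fd = -1; /* never reuse the descriptor, even if close() failed */
	}

	if (rename(tmp_path, final_path) < 0) {
		fprintf(stderr, "rename(%s, %s) failed: %s\n",
			tmp_path, final_path, strerror(errno));
		return -1;
	}
	return 0;
}
```

On a well-behaved filesystem the early close() should make no difference, which matches the remark above that it shouldn't have been necessary.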
What Solaris version are you using?
Timo Sirainen wrote:
> On Mon, 2006-06-19 at 08:03 -0700, Mario Nigrovic-rvvk40 wrote:
>> Timo -
>> The only behavior change that I notice is that when I exit the telnet session the truss also exits. Previously, the truss hung around until I did the "pkill dovecot", after exiting the telnet session. So, that's weird. But, hold on! Now I do seem to have a uidlist file, again!
> I'd say there's something really wrong with your NFS. What that patch did was close the dovecot-uidlist file before rename()ing it. It shouldn't have been necessary. There are probably other places in Dovecot code which also do that, so that's probably why it didn't get fixed entirely.
> What Solaris version are you using?
SunOS velocity 5.8 Generic_117350-06 sun4u sparc
The NFS server is NetApp Release 6.5.3P4
Here's another NFS weirdness detail:
I've got just a ton of .nfs files all over my .mail subdirectory. I think these are the byproduct of our NFS server's keeping snapshot files around, but I'm not sure. Typical content is like:
% head .nfsED2F8 .nfs80729 .nfs0E528
==> .nfsED2F8 <==
0 NonJunk

==> .nfs80729 <==
0 unknown-0
1 unknown-1
2 unknown-2
3 unknown-3
4 unknown-4
5 unknown-5
6 unknown-6
7 unknown-7
8 unknown-8
9 unknown-9

==> .nfs0E528 <==
1 1150484344 1887
1 1129102770.584_0.savant:2,S
2 1130279636.12280_0.savant:2,S
3 1130286944.21913_0.savant:2,S
4 1130289887.25264_0.savant:2,S
5 1130311569.19250_0.savant:2,S
6 1130380470.14692_0.savant:2,S
7 1130428927.9300_0.savant:2,S
8 1131476238.26857_0.savant:2,S
9 1131514339.14518_0.savant:2,RS
So, these look like uidlist files, right? Interesting.
Also, I've now seen two messages like the following. Are these usual?
dovecot: Jun 19 09:13:27 Error: IMAP(rvvk40): Maildir /home/mario/.mail/Maildir/.Freescale.ATIC.CMOS65 sync: UIDVALIDITY changed (1150719396 -> 1150728477)
Mario
On Mon, 2006-06-19 at 10:59 -0700, Mario Nigrovic-rvvk40 wrote:
> Here's another NFS weirdness detail:
> I've got just a ton of .nfs files all over my .mail subdirectory. I think these are the byproduct of our NFS server's keeping snapshot files around, but I'm not sure. Typical content is like:
Those are created when a file is deleted from the filesystem while a process still has it open. They should be deleted automatically when the process dies. All of this is done by the kernel.
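The situation described here, a file being deleted while a process still holds it open, can be reproduced with a few lines of C. This is a minimal sketch, and read_after_unlink() is a hypothetical name. On a local filesystem the name simply disappears while the data stays readable through the descriptor. On an NFS mount the client typically keeps the data reachable by renaming the file to a .nfsXXXX placeholder until the last descriptor is closed.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: write `text` to `path`, unlink the path while the
 * descriptor is still open, then read the data back through the descriptor.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_after_unlink(const char *path, const char *text,
			  char *buf, size_t buflen)
{
	ssize_t n = -1;
	size_t len = strlen(text);
	int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

	if (fd < 0)
		return -1;

	if (write(fd, text, len) == (ssize_t)len &&
	    unlink(path) == 0 &&          /* name is gone, data still open */
	    lseek(fd, 0, SEEK_SET) == 0)
		n = read(fd, buf, buflen);

	close(fd); /* last reference dropped; the storage can now be freed */
	return n;
}
```

If the process exits without closing the descriptor, the kernel closes it on the process's behalf, which is why the placeholder files are expected to vanish when the process dies.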
> Also, I've now seen two messages like the following. Are these usual?
> dovecot: Jun 19 09:13:27 Error: IMAP(rvvk40): Maildir /home/mario/.mail/Maildir/.Freescale.ATIC.CMOS65 sync: UIDVALIDITY changed (1150719396 -> 1150728477)
These mean that the dovecot-uidlist file was temporarily lost and then recreated by another Dovecot process while the first process still had the mailbox open.
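To make the check behind that error concrete, here is a rough sketch of comparing a remembered UIDVALIDITY against a freshly read uidlist header. It assumes the first line of dovecot-uidlist has the form "version uidvalidity next-uid", as the "1 1150484344 1887" line in the .nfs0E528 listing suggests. The struct and both function names are hypothetical and are not Dovecot's actual code.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed header layout: "<version> <uidvalidity> <next-uid>". */
struct uidlist_header {
	unsigned int version;
	unsigned long uid_validity;
	unsigned long next_uid;
};

/* Parse the first line of a uidlist file. Returns 0 on success, -1 if the
 * line does not contain three numeric fields. */
int parse_uidlist_header(const char *line, struct uidlist_header *hdr)
{
	assert(line != NULL && hdr != NULL);

	if (sscanf(line, "%u %lu %lu", &hdr->version,
		   &hdr->uid_validity, &hdr->next_uid) != 3)
		return -1;
	return 0;
}

/* Returns 1 if the header's UIDVALIDITY differs from the value this process
 * remembered, 0 if it matches, -1 if the header cannot be parsed. */
int uidvalidity_changed(unsigned long cached, const char *line)
{
	struct uidlist_header hdr;

	if (parse_uidlist_header(line, &hdr) < 0)
		return -1;
	return hdr.uid_validity != cached;
}
```

With the values from the log line, a process that remembered 1150719396 and then reads a header carrying 1150728477 would see a mismatch, which corresponds to the "UIDVALIDITY changed" error.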
Since you're using Solaris 8, it seems a bit weird that it would behave this badly with NFS, since Solaris was the one OS where NFS was supposed to work properly. In any case, Dovecot's broken behavior is caused by your NFS's broken behavior. I don't think you can get it working properly. If it was working correctly before, did you change something, such as updating the kernel?
Timo -
The problem manifested more or less as I upgraded from 0.99 to 1.0, but we also replaced the NFS server (formerly a Sun) with the NetApp around the same time. The NetApp has snapshots turned on, so it's doing something unusual to preserve files.
Mario