[Dovecot] 0.99.10.5 release candidate
Thought I'd still make one 0.99.10.x release now that mbox corruption problems are (hopefully) fixed.
Please test; if no problems are found, this will be the final 0.99.10.5. Changes since .4:
+ MySQL authentication, patch by Matthew Reimer
- mbox: APPEND reversed the given \Draft and \Deleted flags
- mbox: "LF not found" errors happened sometimes when X-IMAPbase
header was updated. Possibly corrupted mbox sometimes.
Thanks to Fabrice Bellet for finding this bug.
- Custom flags couldn't be unset
- Maildir: make sure ":2," is appended to filename when moving mails
from new/ to cur/.
Hi,
- Timo Sirainen <tss@iki.fi> (20040525 21:57):
Thought I'd still make one 0.99.10.x release now that mbox corruption problems are (hopefully) fixed.
I have yet to hear from my testers whether mailbox corruption still occurs.
- mbox: "LF not found" errors happened sometimes when X-IMAPbase header was updated. Possibly corrupted mbox sometimes. Thanks to Fabrice Bellet for finding this bug.
I have not seen these errors in the log file with this version (except perhaps the first time people checked their mail after the upgrade). Thank you!
I keep seeing the following message from time to time:
,----
| May 27 16:04:38 munster imap(xxxx): Our dotlock file /home/xxxx/Mail/mrtg.lock was modified (1085666677 vs 1085666678), assuming it wasn't overridden
`----
Could it still be an NFS-locking problem?
Dovecot uses: mbox_locks = dotlock
Procmail (on the same server, FreeBSD 5.2) uses: Locking strategies: dotlocking, lockf()
Procmail (on another server, Tru64) uses: Locking strategies: dotlocking, fcntl(), lockf(), flock()
I suppose I must use exactly the same locking facilities everywhere, but there seem to be differences between reading and writing an mbox in Dovecot.
-- olive
On 27.5.2004, at 17:11, Olivier Tharan wrote:
Thought I'd still make one 0.99.10.x release now that mbox corruption problems are (hopefully) fixed.
I have yet to hear from my testers whether mailbox corruption still occurs.
So far I've heard only positive reports :)
I keep seeing the following message from time to time:
,----
| May 27 16:04:38 munster imap(xxxx): Our dotlock file /home/xxxx/Mail/mrtg.lock was modified (1085666677 vs 1085666678), assuming it wasn't overridden
`----
Could it still be an NFS-locking problem?
Hmm.. just a one-second difference. Is it always one second?
Dovecot uses: mbox_locks = dotlock
If you don't use fcntl locking (and with NFS you probably won't), you'd have to enable mbox_read_dotlock; otherwise the mailbox isn't locked for reading at all, and that could cause problems. I think I'll make Dovecot complain about this if it's not done..
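In dovecot.conf that would look something like this (a sketch, assuming 0.99.x option syntax):

    # dotlock for writes, and also take a dotlock when reading over NFS
    mbox_locks = dotlock
    mbox_read_dotlock = yes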
- Timo Sirainen <tss@iki.fi> (20040527 17:52):
,----
| May 27 16:04:38 munster imap(xxxx): Our dotlock file /home/xxxx/Mail/mrtg.lock was modified (1085666677 vs 1085666678), assuming it wasn't overridden
`----
Could it still be an NFS-locking problem?
Hmm.. just a one-second difference. Is it always one second?
Oh yes it is. I have never taken the trouble to examine the figures more closely. Is it a problem or is it normal?
Dovecot uses: mbox_locks = dotlock
If you don't use fcntl locking (and with NFS you probably won't), you'd have to enable mbox_read_dotlock; otherwise the mailbox isn't locked for reading at all, and that could cause problems. I think I'll make Dovecot complain about this if it's not done..
mbox_read_dotlock is set, yes.
-- olive
On 27.5.2004, at 18:26, Olivier Tharan wrote:
- Timo Sirainen <tss@iki.fi> (20040527 17:52):
,----
| May 27 16:04:38 munster imap(xxxx): Our dotlock file /home/xxxx/Mail/mrtg.lock was modified (1085666677 vs 1085666678), assuming it wasn't overridden
`----
Could it still be an NFS-locking problem?
Hmm.. just a one-second difference. Is it always one second?
Oh yes it is. I have never taken the trouble to examine the figures more closely. Is it a problem or is it normal?
Would the attached patch help? If it does, it wasn't a real problem, but then I have to keep in mind that NFS implementations can do this too, to avoid real problems elsewhere..
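The idea, roughly, in C (a sketch with invented names, not the actual patch):

    #include <sys/stat.h>
    #include <time.h>

    /* Tolerate a one-second mtime change (some NFS servers round or
     * bump timestamps) before concluding the dotlock was overridden. */
    static int dotlock_is_still_ours(const char *lock_path, time_t our_mtime)
    {
        struct stat st;

        if (stat(lock_path, &st) < 0)
            return 0; /* lock file gone: definitely overridden */

        /* exact match, or at most one second of drift */
        return st.st_mtime >= our_mtime && st.st_mtime - our_mtime <= 1;
    }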
- Timo Sirainen <tss@iki.fi> (20040527 19:01):
Oh yes it is. I have never taken the trouble to examine the figures more closely. Is it a problem or is it normal?
Would the attached patch help? If it does, it wasn't a real problem, but then I have to keep in mind that NFS implementations can do this too, to avoid real problems elsewhere..
With 0.99.10.5 and this patch, things seem to have settled to normal and I do not have spurious log messages anymore.
I think I will soon put this server into production. Thanks!
-- olive
Olivier Tharan <olive@pasteur.fr> writes:
Oh yes it is. I have never taken the trouble to examine the figures more closely. Is it a problem or is it normal?
I'm running a LAN with two and a half dozen machines, and the time is dead on, within fractions of a second. I'm running NTP, one machine in broadcast mode (and with "upstream" servers, i.e. servers with a lower stratum; perhaps the Institut Pasteur has one of those stratum 2 servers for internal use?), the others in broadcastclient mode. Little traffic, everything in sync.
-- Matthias Andree
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95
Timo Sirainen <tss@iki.fi> writes:
If you don't use fcntl locking (and with NFS you probably won't), you'd
What does this mean? Of course, we'll use fcntl locking with NFS as well. At least on Linux and Solaris, this works.
-- Matthias Andree
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95
On Fri, 2004-05-28 at 10:21, Matthias Andree wrote:
Timo Sirainen <tss@iki.fi> writes:
If you don't use fcntl locking (and with NFS you probably won't), you'd
What does this mean? Of course, we'll use fcntl locking with NFS as well. At least on Linux and Solaris, this works.
I thought Linux and the BSDs didn't support fcntl locks as NFS clients? And I've heard they've always been more or less buggy..
I know FreeBSD currently does not (or at least 4.x doesn't, and I'm pretty sure neither does 5.x).
Timo Sirainen wrote:
On Fri, 2004-05-28 at 10:21, Matthias Andree wrote:
Timo Sirainen <tss@iki.fi> writes:
If you don't use fcntl locking (and with NFS you probably won't), you'd
What does this mean? Of course, we'll use fcntl locking with NFS as well. At least on Linux and Solaris, this works.
I thought Linux and the BSDs didn't support fcntl locks as NFS clients? And I've heard they've always been more or less buggy..
-- James L Moser james@powweb.com PowWeb Hosting http://www.powweb.com
/(bb|[^b]{2})/, that is the Question.
mysql>SELECT * FROM user WHERE clue > 0; Empty set (0.03 sec)
Health is merely the slowest possible rate at which one can die... Health nuts are going to feel stupid someday, lying in hospitals dying of nothing...
On Sat, May 29, 2004 at 03:28:29PM -0700, James Moser wrote:
I know FreeBSD currently does not (or at least 4.x doesn't, and I'm pretty sure neither does 5.x).
That's my experience with 4.x too; I've not tried 5.x with this but then I don't remember seeing any release notes saying this had changed either.
For POP3 access, I'm happy for no locks to be in place at all. If someone is daft enough to make two concurrent POP3 accesses to the same mailbox, then at worst what happens when they try to retrieve a message is they'll get
-ERR This message has been deleted by someone else!
or something like that. I can live with that.
I've seen too many mailsystems which have annoying mailbox locks; you disconnect and the mailbox remains unavailable for 30 minutes or more because it thinks you are using it.
Dotlocks are a pain when you have multiple frontends on an NFS server, because it's impossible to tell if they're stale (they may contain the information that they were created by server B pid P, but if you're on server A, you can't tell whether process P is still running on B or not).
If it were felt that locking were important, then I'd propose a simple lock server process: a client opens a TCP connection to this process, sends the name of the mailbox it wants to lock, and gets an ACK back. If the client dies then the TCP connection is dropped and the lock is released. I can see issues with numbers of filehandles/sockets on the lock server process itself, so you'd have to tweak kernel parameters on a busy system. Perhaps you could have a pool of lockserver processes listening on different ports, and use a hash of the directory name to work out which one to connect to?
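Roughly, the client end could look like this in C (wire protocol invented purely for illustration):

    #include <netinet/in.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* Send the mailbox path plus newline, wait for "ACK".  The lock is
     * held as long as the connection stays open; closing the returned
     * descriptor (or the process dying) releases it. */
    static int lockserver_acquire(const struct sockaddr_in *server,
                                  const char *mailbox)
    {
        char buf[3];
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0)
            return -1;
        if (connect(fd, (const struct sockaddr *)server,
                    sizeof(*server)) < 0 ||
            write(fd, mailbox, strlen(mailbox)) < 0 ||
            write(fd, "\n", 1) < 0 ||
            read(fd, buf, sizeof(buf)) != 3 ||
            memcmp(buf, "ACK", 3) != 0) {
            close(fd);
            return -1;
        }
        return fd; /* close(fd) later to release the lock */
    }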
Regards,
Brian.
On Sun, 2004-05-30 at 11:32, Brian Candler wrote:
For POP3 access, I'm happy for no locks to be in place at all. If someone is daft enough to make two concurrent POP3 accesses to the same mailbox, then at worst what happens when they try to retrieve a message is they'll get
-ERR This message has been deleted by someone else!
or something like that. I can live with that.
That's what Dovecot does now.
But that isn't the worst that can happen if you have no locks at all. You still want to lock the mailbox while it's being read or written; otherwise there would be either mailbox corruption (two writers at the same time) or a user could be sent corrupted mail (one reading while another is writing).
Dotlocks are a pain when you have multiple frontends on an NFS server, because it's impossible to tell if they're stale (they may contain the information that they were created by server B pid P, but if you're on server A, you can't tell whether process P is still running on B or not)
Dovecot does what mutt does: if the dotlock exists and neither the mailbox nor the dotlock has been modified in 30 seconds, it's overridden. Hmm, although that might be a bit too aggressive; just reading a large mailbox could take a long time, and that wouldn't modify it..
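Roughly (a sketch with invented names, not Dovecot's actual code):

    #include <sys/stat.h>
    #include <time.h>

    #define DOTLOCK_STALE_TIMEOUT 30 /* seconds */

    /* The dotlock may be overridden only if neither the lock file nor
     * the mailbox has been modified within the timeout. */
    static int dotlock_is_stale(const char *lock_path, const char *mbox_path)
    {
        struct stat lock_st, mbox_st;
        time_t now = time(NULL);

        if (stat(lock_path, &lock_st) < 0)
            return 0; /* lock is gone, nothing to override */
        if (now - lock_st.st_mtime <= DOTLOCK_STALE_TIMEOUT)
            return 0;
        if (stat(mbox_path, &mbox_st) == 0 &&
            now - mbox_st.st_mtime <= DOTLOCK_STALE_TIMEOUT)
            return 0;
        return 1;
    }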
If it were felt that locking were important, then I'd propose a simple lock server process: a client opens a TCP connection to this process, sends the name of the mailbox it wants to lock, and gets an ACK back. If the client dies then the TCP connection is dropped and the lock is released.
I think nfs.lockd does pretty much that, except the client OS must notify it if a process dies without releasing its lock. If the whole computer gets lost.. I guess the lock stays there for a long time?
I can see issues with numbers of filehandles/sockets on the lock server process itself, so you'd have to tweak kernel parameters on a busy system. Perhaps you could have a pool of lockserver processes listening on different ports, and use a hash of the directory name to work out which one to connect to?
Might be useful, but you'd probably want to have redundancy as well as load balancing. Might not be that easy to implement.
On Mon, May 31, 2004 at 07:24:41PM +0300, Timo Sirainen wrote:
-ERR This message has been deleted by someone else!
or something like that. I can live with that.
That's what Dovecot does now.
But that isn't the worst that can happen if you have no locks at all. You still want to lock the mailbox while it's being read or written; otherwise there would be either mailbox corruption (two writers at the same time) or a user could be sent corrupted mail (one reading while another is writing).
Oh yes, sorry. I now think "Maildir" exclusively, and forget that some people still use mbox :-)
Having seen mailservers where users leave 10-20MB of mail on the server, in a single mbox file, where the POP3 server has to read through the whole mbox every time they login just to count how many messages are in there... well, that's enough for me to give up on mbox.
I think nfs.lockd does pretty much that, except the client OS must notify it if a process dies without releasing its lock. If the whole computer gets lost.. I guess the lock stays there for a long time?
If it were over TCP, you could turn on keepalives I guess.
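Something like this (a minimal sketch; note the default keepalive idle time is typically around two hours unless the kernel timers are tuned down):

    #include <sys/socket.h>

    /* With keepalives on, a client host that silently disappears will
     * eventually have its connection - and thus its lock - torn down. */
    static int enable_keepalive(int fd)
    {
        int on = 1;
        return setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
    }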
Cheers,
Brian.
Brian wrote:
On Mon, May 31, 2004 at 07:24:41PM +0300, Timo Sirainen wrote:
-ERR This message has been deleted by someone else!
or something like that. I can live with that.
That's what Dovecot does now.
But that isn't the worst that can happen if you have no locks at all. You still want to lock the mailbox while it's being read or written; otherwise there would be either mailbox corruption (two writers at the same time) or a user could be sent corrupted mail (one reading while another is writing).
Oh yes, sorry. I now think "Maildir" exclusively, and forget that some people still use mbox :-)
Converting tens of thousands of users (in as fast and transparent a fashion as possible) is no easy feat.
Having seen mailservers where users leave 10-20MB of mail on the server, in a single mbox file, where the POP3 server has to read through the whole mbox every time they login just to count how many messages are in there... well, that's enough for me to give up on mbox.
Tell me about it. ;P
Note though that maildir still leaves you vulnerable to some extent with large individual mails (we allow for 10MB per message max here). The atomic tmp -> new move of course prevents the case of multiple (inbound, SMTP) writers. Alas, following a recent conversation with Timo on this ML, all that saves your butt and mail integrity when a large message being read by a slow client gets deleted by another client (Dovecot instance) is Saint BufferCache. ;)
Regards,
Christian
Christian Balzer Network/Systems Engineer NOC chibi@gol.com Global OnLine Japan/Fusion Network Services http://www.gol.com/
On Tue, Jun 01, 2004 at 01:31:09PM +0900, Christian Balzer wrote:
Oh yes, sorry. I now think "Maildir" exclusively, and forget that some people still use mbox :-)
Converting tens of thousands of users (in as fast and transparent a fashion as possible) is no easy feat.
It's not hard in principle.
courier-imap has a feature called "loginexec" (actually something I wrote and managed to persuade Sam to include).
Whenever you log in, if a file called "loginexec" exists within the Maildir and is executable, it is run. If it terminates with a zero exit code, it is deleted.
So in your case, within each Maildir you put a small loginexec script which performs the conversion from mbox to Maildir. The next time this user logs in, their mail gets converted, and then the loginexec file is deleted.
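The logic, sketched in C (illustrative only, not courier-imap's actual code):

    #include <stdlib.h>
    #include <sys/stat.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Run an executable "loginexec" found in the current Maildir and
     * delete it if it exits successfully, so it runs exactly once. */
    static void run_loginexec(void)
    {
        struct stat st;
        int status;

        if (stat("loginexec", &st) < 0 ||
            !S_ISREG(st.st_mode) || !(st.st_mode & S_IXUSR))
            return; /* no executable loginexec here */

        status = system("./loginexec");
        if (status != -1 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            unlink("loginexec"); /* success: never run it again */
    }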
What I was using loginexec for was to transfer mail from a remote POP3 server; I wrote a small C program which pulled mail from the old server (given the hostname, username and password) and dropped it into the local Maildir. I've successfully migrated hundreds of thousands of mailboxes in this way, where I don't even have shell access to the old POP3 server.
Note though that maildir still leaves you vulnerable to some extent with large individual mails (we allow for 10MB per message max here).
Sorry, I don't follow. There's no need for the pop3/imap server to *read* every message each time a user logs in. The size for LIST can be cached, so you'd only actually open the file when the client requests the message content or headers.
The atomic tmp -> new move of course prevents the case of multiple (inbound, SMTP) writers. Alas, following a recent conversation with Timo on this ML, all that saves your butt and mail integrity when a large message being read by a slow client gets deleted by another client (Dovecot instance) is Saint BufferCache. ;)
I don't understand that comment either. If process A has a file open, and process B deletes it, the file remains (in its entirety) on the filesystem until process A closes it. That's not buffer caching; that's the semantics of unlink().
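A minimal demonstration on a local filesystem:

    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[6] = "";
        int fd = open("demo.txt", O_RDWR | O_CREAT | O_TRUNC, 0600);

        if (fd < 0)
            return 1;
        write(fd, "hello", 5);
        unlink("demo.txt");     /* the name is gone... */
        lseek(fd, 0, SEEK_SET);
        read(fd, buf, 5);       /* ...but the data is still readable */
        printf("got back: %s\n", buf);
        close(fd);
        return 0;
    }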
Regards,
Brian.
On Tue, 2004-06-01 at 11:00, Brian Candler wrote:
The atomic tmp -> new move of course prevents the case of multiple (inbound, SMTP) writers. Alas, following a recent conversation with Timo on this ML, all that saves your butt and mail integrity when a large message being read by a slow client gets deleted by another client (Dovecot instance) is Saint BufferCache. ;)
I don't understand that comment either. If process A has a file open, and process B deletes it, the file remains (in its entirety) on the filesystem until process A closes it. That's not buffer caching; that's the semantics of unlink().
I think Christian was talking about NFS; it doesn't follow the "semantics of unlink()". Rather, if a file is deleted and you later try to read it, you'll get ESTALE.
On Tue, Jun 01, 2004 at 03:15:34PM +0300, Timo Sirainen wrote:
I don't understand that comment either. If process A has a file open, and process B deletes it, the file remains (in its entirety) on the filesystem until process A closes it. That's not buffer caching; that's the semantics of unlink().
I think Christian was talking about NFS; it doesn't follow the "semantics of unlink()". Rather, if a file is deleted and you later try to read it, you'll get ESTALE.
Bleurgh. Thanks, I stand corrected (and see one of the reasons why NFS is considered nasty).
In that case, if this happens while a message is being downloaded, all the POP3 server can do is drop the TCP connection, to prevent the client getting a partial message.
Cheers,
Brian.
Brian Candler <B.Candler@pobox.com> writes:
In that case, if this happens while a message is being downloaded, all the POP3 server can do is drop the TCP connection, to prevent the client getting a partial message.
I'll take bets as to what client recognizes EOF as "don't display/store this message" O:-)
-- Matthias Andree
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95
On Tue, 2004-06-01 at 18:10, Matthias Andree wrote:
Brian Candler <B.Candler@pobox.com> writes:
In that case, if this happens while a message is being downloaded, all the POP3 server can do is drop the TCP connection, to prevent the client getting a partial message.
I'll take bets as to what client recognizes EOF as "don't display/store this message" O:-)
Actually I think most POP3 clients would handle this correctly. Dying TCP connections are quite common with dialups.
Timo Sirainen <tss@iki.fi> writes:
Actually I think most POP3 clients would handle this correctly. Dying TCP connections are quite common with dialups.
Ah well. Might be I am just lucky, but I have never been plagued by dying connections on dialups. Slow maybe, but not dying.
-- Matthias Andree
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95
On 2.6.2004, at 00:15, Matthias Andree wrote:
Timo Sirainen <tss@iki.fi> writes:
Actually I think most POP3 clients would handle this correctly. Dying TCP connections are quite common with dialups.
Ah well. Might be I am just lucky, but I have never been plagued by dying connections on dialups. Slow maybe, but not dying.
Do you exit your POP3 client every time before disconnecting? ;)
Anyway, I think it's a common enough event that POP3 client authors have figured out it might happen and don't permanently store the partial message.
Timo Sirainen <tss@iki.fi> writes:
Do you exit your POP3 client every time before disconnecting? ;)
The only POP3 client I know that keeps the connection open is mutt.
-- Matthias Andree
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95
Timo Sirainen wrote:
On Sun, 2004-05-30 at 11:32, Brian Candler wrote:
I can see issues with numbers of filehandles/sockets on the lock server process itself, so you'd have to tweak kernel parameters on a busy system. Perhaps you could have a pool of lockserver processes listening on different ports, and use a hash of the directory name to work out which one to connect to?
Might be useful, but you'd probably want to have redundancy as well as load balancing. Might not be that easy to implement.
How about a distributed lock manager that sits on every node? It could communicate with the other nodes by multicast, although keeping the lock would require heartbeat packets instead of just keeping a TCP session open, thus causing more interrupts and context switches :-(
-- Tomi Hakala
On Mon, May 31, 2004 at 09:25:53PM +0300, Tomi Hakala wrote:
Timo Sirainen wrote:
On Sun, 2004-05-30 at 11:32, Brian Candler wrote:
I can see issues with numbers of filehandles/sockets on the lock server process itself, so you'd have to tweak kernel parameters on a busy system. Perhaps you could have a pool of lockserver processes listening on different ports, and use a hash of the directory name to work out which one to connect to?
Might be useful, but you'd probably want to have redundancy as well as load balancing. Might not be that easy to implement.
How about a distributed lock manager that sits on every node? It could communicate with the other nodes by multicast, although keeping the lock would require heartbeat packets instead of just keeping a TCP session open, thus causing more interrupts and context switches :-(
This could be done via Spread: www.spread.org. Spread is an extended virtual synchrony toolkit (roughly this means you have an all-or-nothing delivery guarantee). I maintain FreeBSD and Debian-style packages of Spread and find it very useful for distributed applications. It's not written in Dovecot's secure style, however.
J
-- Joshua Goodall "as modern as tomorrow afternoon" joshua@roughtrade.net - FW109
Timo Sirainen <tss@iki.fi> writes:
On Fri, 2004-05-28 at 10:21, Matthias Andree wrote:
Timo Sirainen <tss@iki.fi> writes:
If you don't use fcntl locking (and with NFS you probably won't), you'd
What does this mean? Of course, we'll use fcntl locking with NFS as well. At least on Linux and Solaris, this works.
I thought Linux and the BSDs didn't support fcntl locks as NFS clients? And I've heard they've always been more or less buggy..
FreeBSD 4 supported NFS locks only on the server side, not on the client side. FreeBSD 5 is supposed to support client-side locks as well, but I haven't tried.
Linux has been fine for a long time now (I believe that with ext2/ext3, NFS has been doing fine since 2.2.12 or so; for other file systems, XFS and reiserfs in particular, NFS was a problem until late into 2.4).
The point is that NFS locking requires fcntl() and will not work with flock().
-- Matthias Andree
Encrypted mail welcome: my GnuPG key ID is 0x052E7D95
According to Matthias Andree:
Timo Sirainen <tss@iki.fi> writes:
I thought Linux and the BSDs didn't support fcntl locks as NFS clients? And I've heard they've always been more or less buggy..
Linux has been fine for a long time now (I believe that with ext2/ext3, NFS has been doing fine since 2.2.12 or so, for other file systems, XFS and reiserfs in particular, NFS was a problem until late into 2.4).
Linux as a client has supported NFS locking for years. As a server, you need to use the kernel NFS server, not the user-level NFS server, to get working NFS locking. That has been stable since 2.4.something.
I've been running lots of linux nfs servers and clients in NFSv3 mode with locking enabled, all 2.4 kernels, for one or two years now. No problems.
But indeed, you need to use fcntl() locking on Linux to get NFS locking. lockf() is implemented separately and doesn't work over NFS. You can use flock(), since that is just an interface to fcntl(), but that is Linux-specific; on other platforms flock() might not be an interface to fcntl().
(BTW, it sucks that lockf() doesn't work over NFS - the POSIX fcntl() locking semantics are stupid. Closing any other fd in the same process that refers to the same file loses all the locks? WTF?)
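For completeness, the fcntl() variant looks roughly like this:

    #include <fcntl.h>
    #include <string.h>
    #include <unistd.h>

    /* Whole-file exclusive lock, the variant that works over NFS
     * (through lockd).  Mind the POSIX gotcha above: closing *any*
     * descriptor for this file in the same process drops the lock. */
    static int lock_whole_file(int fd)
    {
        struct flock fl;

        memset(&fl, 0, sizeof(fl));
        fl.l_type = F_WRLCK;   /* exclusive (write) lock */
        fl.l_whence = SEEK_SET;
        fl.l_start = 0;
        fl.l_len = 0;          /* 0 means "to end of file" */

        return fcntl(fd, F_SETLKW, &fl); /* block until acquired */
    }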
Mike.
participants (9)
- Brian Candler
- Christian Balzer
- James Moser
- Joshua Goodall
- Matthias Andree
- Miquel van Smoorenburg
- Olivier Tharan
- Timo Sirainen
- Tomi Hakala