[dovecot] Re: [bincimap] Re: Re: bincimap


Ian R. Justman

17 Feb 2003

12:14 p.m.

On Mon, 17 Feb 2003, Andreas Aardal Hanssen wrote:

...
...
It even recently included a POP server. What's the reasoning there? Someone wanted it so it'd be easy to run both POP3 and IMAP servers without having to configure them twice. I don't see any harm in it anyway, it took only a few hours to write, it's optional and doesn't take much space in the sources.

Here's where I would say - there are hundreds of working POP3 servers around, both in closed and open source, so adding a POP3 server to the Dovecot project just introduces more lines of code where bugs may appear ;).

Here is what -I- would say:

I use mbox on all my servers which I have absolutely no intention of changing despite the issues concerning file locking inherent in the whole mbox format. So please, --NO-- attempts to convince me to go to maildir.

My take is that if I am going to use an IMAP server, it would be VERY nice if a POP3 server also came bundled with it. Dovecot's having both IMAP and POP3 servers is great because they will likely use the same file-locking schemes (in fact, probably even share the same locking settings).

That way if I inadvertently start up a POP3 session while I am connected via IMAP, if they use identical locking mechanisms, I don't screw up my mailboxes should I inadvertently tell my system that I want to delete the messages from my mailbox.

Basically, if you open via IMAP using a server with a different locking scheme while your hung POP3 session is still running AND you modify your mailbox, you are screwed. Especially if that mailstore is NFS-shared, which is always a dicey proposition.

Additionally, the number of mbox POP3 servers which Do Not Suck(R) is rather low right now. Timo finally introduced an mbox IMAP server which definitely Does Not Suck, notably it allows for multiple clients accessing the same mbox mailstore, something previously only offered by Cyrus and perhaps maildir ("perhaps" only because I do not use maildir).

Not to mention, quite frankly, most mbox POP3 implementations suck pretty badly anyway. :P

--Ian.


Timo Sirainen

17 Feb

7:34 p.m.

On Mon, 2003-02-17 at 12:14, Ian R. Justman wrote:

...

Additionally, the number of mbox POP3 servers which Do Not Suck(R) is rather low right now. Timo finally introduced an mbox IMAP server which definitely Does Not Suck, notably it allows for multiple clients accessing the same mbox mailstore, something previously only offered by Cyrus and perhaps maildir ("perhaps" only because I do not use maildir).

I don't actually understand why UW imapd doesn't allow it. It needs to be able to deal with unexpected mbox changes anyway, so why allow only one IMAP session for it? Or maybe it deals with unexpected changes just by killing the IMAP connection? I'd have to try some day.

A few days ago I was also wondering how UW imapd even knows how to kill the older connection. It seems that it writes its PID to /tmp/.<device>.<inode>. When another IMAP process tries to access the same mbox, it sees the file, sends SIGUSR2 to the PID inside it and waits for a while to see if the lock file goes away. Strange.
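As a rough illustration of the mechanism described above, the sketch below derives a lock-file name from a mailbox's device and inode numbers and signals whatever PID is stored in that file. The hexadecimal formatting and the helper names lock_path_for() and signal_lock_holder() are assumptions made for illustration; this is not UW imapd's code, and the "wait for the lock file to go away" step is left out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Build "/tmp/.<device>.<inode>" for the given mailbox file.
   Returns 0 on success, -1 if stat() fails or the buffer is too small.
   The hex formatting is an assumption, not taken from UW imapd. */
int lock_path_for(const char *mbox, char *buf, size_t len)
{
    struct stat st;
    int n;

    assert(mbox != NULL && buf != NULL);
    if (stat(mbox, &st) < 0)
        return -1;
    n = snprintf(buf, len, "/tmp/.%lx.%lx",
                 (unsigned long)st.st_dev, (unsigned long)st.st_ino);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Read a PID from the lock file and send it SIGUSR2.
   Returns 0 if the signal was sent, -1 otherwise. */
int signal_lock_holder(const char *lock_path)
{
    FILE *f = fopen(lock_path, "r");
    long pid = 0;
    int ok;

    if (f == NULL)
        return -1;              /* no lock file: nobody to signal */
    ok = fscanf(f, "%ld", &pid) == 1 && pid > 0;
    fclose(f);
    if (!ok)
        return -1;              /* file did not contain a usable PID */
    return kill((pid_t)pid, SIGUSR2);
}
```

A second process would call lock_path_for() on the mbox it wants to open and, if the file exists, signal_lock_holder() on the result before retrying.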

Andreas Aardal Hanssen

8:07 p.m.

Hi, Ian. Cross-posting discussions like this is usually not appreciated by those who subscribe to both forums, so I'll limit it to the Dovecot list.

On Mon, 17 Feb 2003, Ian R. Justman wrote:

...

On Mon, 17 Feb 2003, Andreas Aardal Hanssen wrote:

My take is that if I am going to use an IMAP server, it would be VERY nice if a POP3 server also came bundled with it. Dovecot's having both IMAP and POP3 servers is great because they will likely use the same file-locking schemes (in fact, probably even share the same locking settings).

I agree with you.

But any shared mailbox format that makes assumptions on the type of locking used by other accessors, and does not provide a standard locking mechanism bundled with the format specification, is b0rken. But nobody ever claimed that mbox was anything else, did they? ;) mbox is yesterday's format, and the only reason people cling to it is for convenience and "if it works, don't fix it".

Today we can exploit concurrency and rapid access in ways that were unthinkable back when mbox was designed. It's only natural that the old and worn fall by the swords of the young and strong. :-)

...

Basically, if you open via IMAP using a server with a different locking

Additionally, the number of mbox POP3 servers which Do Not Suck(R) is rather low right now. Timo finally introduced an mbox IMAP server which

This I don't agree with. Most existing POP3 servers are quite ok, and one POP3 Maildir server that is excellent and bug free since 1998, qmail-pop3d, is quite a piece of art.

...

Not to mention, quite frankly, most mbox POP3 implementations suck pretty badly anyway. :P

This I do agree with. Don't confuse POP3 servers in general with mbox POP3 servers. The root, the source of the evil is the storage format mbox.

Now I will not claim that Maildir is infinitely much better, but it's almost, but not quite, much much better than mbox.

With Maildir there's no need to lock the depository when deleting or delivering mails (even on NFS), but you can't store (append) a message with a timestamp nor with flags without breaking consistency, and servers have to search for lost messages when an external client changes a flag.

With mbox you have to rewrite the entire mailbox after expunging message #1, and exclusive access is required when doing so. With no indexes, mboxes also need to be more or less parsed on every login - I understand that Dovecot has done some smart stuff here, but that's a workaround for one of the big headaches of a crappy storage format.

Nobody with their wits intact would come up with something as pathetic as the mbox storage format in 2003. That's _my_ personal opinion on this matter.

Now Binc IMAP does not support this format, but that's not because I don't want it to. Rather, I'd like to investigate neat ways to make it work just like Timo has with Dovecot. But rather than breathing fresh air into the wrinkled nostrils of one of the Internet age's uglier artifacts, I'd like to find a way to move all users away from mbox and into a new mailbox format. One box to rule them all. ;)

btw, Flames are happily accepted. :-)

Andy

-- Andreas Aardal Hanssen http://www.andreas.hanssen.name/gpg

seth vidal

8:16 p.m.

...

I agree with you.

But any shared mailbox format that makes assumptions on the type of locking used by other accessors, and does not provide a standard locking mechanism bundled with the format specification, is b0rken. But nobody ever claimed that mbox was anything else, did they? ;) mbox is yesterday's format, and the only reason people cling to it is for convenience and "if it works, don't fix it".

Actually, there are a number of reasons why I still have mbox format. Most of them are legacy. But keep in mind this is 10yrs of legacy in the same format. That kind of inertia takes effort to break free of and in some cases there are political situations which make breaking free just about impossible. But thanks for only attributing laziness to those of us who need to use mbox.

The reason I want an imap and pop server that can handle both mbox and maildir is so I can gradually migrate my users over.

I can take them in batches and move their mail spools to maildir w/o massively disrupting their normal activities.

200-400 users at 2-4 a day will take me 1/3 of a year.

Doing them all in one day will take me a lifetime of cleaning up problems and also trying to explain to those above me why I put all of our users through it.

...

Nobody with their wits intact would come up with something as pathetic as the mbox storage format in 2003. That's _my_ personal opinion on this matter.

We're not disputing that. But mbox is going to HAVE to be there for migration, otherwise you'll not be able to get the users moved over.

-sv

Timo Sirainen

8:39 p.m.

On Mon, 2003-02-17 at 20:07, Andreas Aardal Hanssen wrote:

...

Now I will not claim that Maildir is infinitely much better, but it's almost, but not quite, much much better than mbox.

The main difference is if there should be one file per message or one file per mailbox. Per-mailbox files are faster (fewer syscalls, less filesystem stress) as long as mails aren't being expunged from the middle of the file. I'm still personally using mbox, and I know I usually delete only mails that I've received recently, so only small parts of the file need to be rewritten.

So which one is faster depends mostly on the user. I'm not sure about the "better" argument. mbox needs more work to make it work well. UW imapd also supports another flat file format, "mbx", which should be more IMAP-friendly.

The UW imapd author also says that mbox is slow and that changing to mbx would give much higher performance, but I think that's more a comment on the UW imapd implementation than on the mbox format itself. There are some ugly slowing hacks that have to be done, but they're not _that_ slow if implemented well. And that slowness shows just in CPU usage, which is cheap in IMAP servers compared to I/O.

...

With Maildir there's no need to lock the depository when deleting or delivering mails (even on NFS), but you can't store (append) a message with a timestamp nor with flags without breaking consistency,

I don't understand this. What's the problem with setting timestamp or flags when appending a message? You create it in tmp/ with wanted name and timestamp, then rename() it.

...

and servers have to search for lost messages when an external client changes a flag.

Yes, this is annoying.

Andreas Aardal Hanssen

8:59 p.m.

On 17 Feb 2003, Timo Sirainen wrote:

...

On Mon, 2003-02-17 at 20:07, Andreas Aardal Hanssen wrote:

...
With Maildir there's no need to lock the depository when deleting or delivering mails (even on NFS), but you can't store (append) a message with a timestamp nor with flags without breaking consistency,

I don't understand this. What's the problem with setting timestamp or flags when appending a message? You create it in tmp/ with wanted name and timestamp, then rename() it.

You use the rename system call for this?

Andreas Aardal Hanssen

9:03 p.m.

On Mon, 17 Feb 2003, Andreas Aardal Hanssen wrote:

...

On 17 Feb 2003, Timo Sirainen wrote:

...
On Mon, 2003-02-17 at 20:07, Andreas Aardal Hanssen wrote:

From maildir-storage.c, line 325:
    while (rename(src, dest) < 0 && count < 2) {
            if (errno != EEXIST) {
rename can never return EEXIST in errno.

maildir-save.c, line 124.

I'd like to stress here that rename can't return EEXIST when moving files.

Andy

-- Andreas Aardal Hanssen http://www.andreas.hanssen.name/gpg

Timo Sirainen

9:39 p.m.

On Mon, 2003-02-17 at 20:59, Andreas Aardal Hanssen wrote:

...

...
From maildir-storage.c, line 325:
    while (rename(src, dest) < 0 && count < 2) {
            if (errno != EEXIST) {
rename can never return EEXIST in errno.

Sure it can:

   ENOTEMPTY or EEXIST
          newpath  is  a non-empty directory, i.e., contains entries other
          than "." and "..".

But looks like I've missed the ENOTEMPTY check there, adding. I'm renaming directories there, not files.
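The point about directories can be checked with a small experiment. The sketch below creates an empty directory and a non-empty one, tries to rename() the first onto the second, and reports the resulting errno. The function name and the scratch directory names are made up for illustration; the only claim relied on is the man page text quoted above, which allows either ENOTEMPTY or EEXIST.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create <base>/rn-<pid>-a (empty) and <base>/rn-<pid>-b (holding one
   file), then try to rename a onto b. Returns the errno set by rename(),
   0 if rename() unexpectedly succeeded, or -1 if the setup failed. */
int try_rename_onto_nonempty(const char *base)
{
    char a[512], b[512], f[600];
    FILE *fp;
    int err;

    assert(base != NULL);
    snprintf(a, sizeof a, "%s/rn-%ld-a", base, (long)getpid());
    snprintf(b, sizeof b, "%s/rn-%ld-b", base, (long)getpid());
    snprintf(f, sizeof f, "%s/entry", b);

    if (mkdir(a, 0700) < 0)
        return -1;
    if (mkdir(b, 0700) < 0) {
        rmdir(a);
        return -1;
    }
    fp = fopen(f, "w");
    if (fp == NULL) {
        rmdir(a);
        rmdir(b);
        return -1;
    }
    fclose(fp);

    err = rename(a, b) == 0 ? 0 : errno;

    /* Clean up whatever is left behind. */
    unlink(f);
    rmdir(a);
    rmdir(b);
    return err;
}
```

A caller that loops on rename() for directories would therefore want to treat both error codes the same way.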

...

    /* move the file into new/ directory - syncing will pick it
       up from there */
    if (rename(tmp_path, new_path) == 0)
            failed = FALSE;
Here you can lose emails if the new/ folder contains a message whose base name is equal.

In theory, yes. In practice, I'd say not. It can only happen with broken MUAs, are there any? User could of course deliberately break it, but is there some gain in it?

...

The only way to avoid this is:

use link and unlink, not rename

..and if it crashes (or loses NFS link) between those calls, you'll suddenly see two mails. I prefer atomic operations.

...

Anyway, Maildir has a strict consistency criteria which says that all messages that are linked from tmp/ into new/ _must_ use time(NULL) plus a number that is guaranteed not to lapse within one second, and which does not collide with other messages in new/. new/ is the single entry point into cur/, and messages in new/ can not be "older" than the messages in cur/.

It only has requirement that the file name is unique. There's other ways to do that than time(NULL). http://cr.yp.to/proto/maildir.html lists some.

...

leave messages that have a time_t part equal to or higher than time(NULL). .. The simple reasoning for this is that you can never guarantee that there is no message in cur/ that has the same base name, but perhaps different flags. It follows that when moving a message from new/ to cur/, it is required that the server only picks messages that are older than one second.

I'm not sure what you mean by this.. What's special in files that were just created? What changes after it's older than one second? Or are you changing the base name when moving it to cur/? (What makes you sure that the new base name still doesn't exist?)

I don't think you should rely on checking timestamps in any case. What if the maildir is accessed via NFS and some other computer with different time created the file?

But about the flags, looks like maildir spec says they could be used only in cur/ directory, so I guess my code is broken because it allows setting them already in new/.. Well, I'll fix it anyway later by moving mails directly from tmp/ into cur/ when implementing UIDPLUS extension.

Timo Sirainen

9:48 p.m.

On Mon, 2003-02-17 at 21:39, Timo Sirainen wrote:

...

...
    /* move the file into new/ directory - syncing will pick it
       up from there */
    if (rename(tmp_path, new_path) == 0)
            failed = FALSE;
Here you can lose emails if the new/ folder contains a message whose base name is equal.
In theory, yes. In practice, I'd say not.

Courier and qmail-pop3d also use rename() instead of link()+unlink(). I'd say it's safe enough then.

Andreas Aardal Hanssen

10:21 p.m.

On 17 Feb 2003, Timo Sirainen wrote:

...

On Mon, 2003-02-17 at 21:39, Timo Sirainen wrote:

...
...
    /* move the file into new/ directory - syncing will pick it
       up from there */
    if (rename(tmp_path, new_path) == 0)
            failed = FALSE;
Here you can lose emails if the new/ folder contains a message whose base name is equal.
In theory, yes. In practice, I'd say not.

Courier and qmail-pop3d also use rename() instead of link()+unlink(). I'd say it's safe enough then.

Quoting a well known friend of ours, Mark Crispin - Two wrongs doesn't make one right.

qmail-pop3d doesn't move messages from tmp/ to new/. It only moves messages from new/ to cur/. Messages _can_ get lost there, and it seems like Bernstein accepts this.

If a message is lost in this operation, it basically means that a broken server/client has moved a message from new to cur earlier, without taking the one second into consideration.

So where is the bug - qmail-pop3d or Dovecot? I'd say both. First and foremost the server that placed the original message in cur/ in the first place, in breach of Maildir, and qmail-pop3d for renaming that message instead of linking it.

Andy

-- Andreas Aardal Hanssen http://www.andreas.hanssen.name/gpg

Andreas Aardal Hanssen

10:07 p.m.

On 17 Feb 2003, Timo Sirainen wrote:

...

On Mon, 2003-02-17 at 20:59, Andreas Aardal Hanssen wrote:

But looks like I've missed the ENOTEMPTY check there, adding. I'm renaming directories there, not files.

My bad.

...

...
    /* move the file into new/ directory - syncing will pick it
       up from there */
    if (rename(tmp_path, new_path) == 0)
            failed = FALSE;
Here you can lose emails if the new/ folder contains a message whose base name is equal.
In theory, yes. In practice, I'd say not. It can only happen with broken MUAs, are there any? User could of course deliberately break it, but is there some gain in it?

It seems the qmail community disagrees with you here - I followed the discussion on the qmail mailing list. I guess the Dovecot community will be knocking on your door when important emails start disappearing unprovoked, and Dovecot is the service that deleted them.

The bottom line is that you can not rely on other Maildir clients following your server's conventions. They will only oblige to the Maildir standard, and using Dovecot, emails will disappear. But there's no sweat - it's a simple fix.

...

...
The only way to avoid this is:

use link and unlink, not rename

..and if it crashes (or loses NFS link) between those calls, you'll suddenly see two mails. I prefer atomic operations.

Well, then Dovecot has to have a way to communicate this to the sysadmin as an error condition. It's infinitely better than deleting innocent (blog) users' emails.

...

...
Anyway, Maildir has a strict consistency criteria which says that all messages that are linked from tmp/ into new/ _must_ use time(NULL) plus a number that is guaranteed not to lapse within one second, and which does not collide with other messages in new/. new/ is the single entry point into cur/, and messages in new/ can not be "older" than the messages in cur/.

It only has requirement that the file name is unique. There's other ways to do that than time(NULL). http://cr.yp.to/proto/maildir.html lists some.

"A unique name has three pieces, separated by dots. On the left is the result of time() or the second counter from gettimeofday(). On the right is the result of gethostname(). (To deal with invalid host names, replace / with \057 and : with \072.) In the middle is a delivery identifier, discussed below."

Quite clear.
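The three-piece name from the quoted passage can be sketched as follows. The escaping of / and : follows the quote; the "<pid>_<seq>" delivery identifier and both function names are assumptions chosen for illustration, and the spec lists other possible identifiers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

/* Copy host into out, replacing '/' with "\057" and ':' with "\072".
   Returns 0 on success, -1 if out is too small. */
int escape_hostname(const char *host, char *out, size_t len)
{
    size_t o = 0;

    assert(host != NULL && out != NULL);
    if (len == 0)
        return -1;
    for (; *host != '\0'; host++) {
        const char *rep = NULL;

        if (*host == '/')
            rep = "\\057";
        else if (*host == ':')
            rep = "\\072";
        if (rep != NULL) {
            if (o + 4 >= len)
                return -1;
            memcpy(out + o, rep, 4);
            o += 4;
        } else {
            if (o + 1 >= len)
                return -1;
            out[o++] = *host;
        }
    }
    out[o] = '\0';
    return 0;
}

/* Build "<time>.<pid>_<seq>.<host>". The "<pid>_<seq>" middle part is
   one possible delivery identifier, assumed here for illustration. */
int maildir_unique_name(char *buf, size_t len, unsigned seq)
{
    char host[256], esc[1024];
    int n;

    assert(buf != NULL);
    if (gethostname(host, sizeof host) < 0)
        return -1;
    host[sizeof host - 1] = '\0';   /* gethostname may not terminate */
    if (escape_hostname(host, esc, sizeof esc) < 0)
        return -1;
    n = snprintf(buf, len, "%ld.%ld_%u.%s",
                 (long)time(NULL), (long)getpid(), seq, esc);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Because ':' is escaped in the host part, a ':' in a file name can only be the start of the flags suffix.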

...

...
flags. It follows that when moving a message from new/ to cur/, it is required that the server only picks messages that are older than one second.

I'm not sure what you mean by this.. What's special in files that were just created? What changes after it's older than one second? Or are you changing the base name when moving it to cur/? (What makes you sure that the new base name still doesn't exist?)

I create a message in tmp and link this to new, then unlink the original.
I create a new message in tmp. I'm about to link this message to new/, when suddenly
You move my original message from new/ to cur/.
I now complete my link from tmp/ to new/.
You move my new message from new/ to cur.

Boom. My original message is gone. What's the point of only moving messages from new/ older than one second?

All messages delivered to new/ in the same second will get a new middle-part name, sometimes with a _0, _1 ... _n next to the pid. When you move the messages from new/ to cur/ after one second, all new messages delivered to new/ will have a different time part. No collisions. But if you move the messages _before_ one second has lapsed, then a message in cur/ may be overwritten when you either

move the message from new/ to cur/
change the flags of two messages whose base is equal.

Using link avoids the loss of emails, and allows the sysadmin to clean up the server's buggy behavior. rename means bye-bye.
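A minimal sketch of the link-then-unlink move argued for here, assuming POSIX link() and unlink(). The function name move_no_clobber() and the stderr message are illustrative only and are not taken from Dovecot, Binc IMAP, or qmail.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Move src to dest without ever replacing an existing dest.
   link() fails with EEXIST if dest is already there, so a colliding
   message is kept instead of being overwritten as rename() would do.
   Returns 0 on success, -1 with errno set on failure. */
int move_no_clobber(const char *src, const char *dest)
{
    assert(src != NULL && dest != NULL);
    if (link(src, dest) < 0)
        return -1;      /* dest exists or other error; src is untouched */

    /* A crash at this point leaves two names for the same inode. */
    if (unlink(src) < 0) {
        int saved = errno;

        /* dest is in place; report the leftover name to the admin. */
        fprintf(stderr, "leftover link: %s\n", src);
        errno = saved;
        return -1;
    }
    return 0;
}
```

The trade-off raised in the thread still applies: the move is no longer atomic, so a reader can briefly see both names.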

...

I don't think you should rely on checking timestamps in any case. What if the maildir is accessed via NFS and some other computer with different time created the file?

Maildir relies on timestamps - like it or not.

...

But about the flags, looks like maildir spec says they could be used only in cur/ directory, so I guess my code is broken because it allows setting them already in new/.. Well, I'll fix it anyway later by moving mails directly from tmp/ into cur/ when implementing UIDPLUS extension.

You will lose emails if you move messages from tmp to cur directly. Simply because you have no idea whether or not messages in cur have the same base as your message in tmp.

Andy

-- Andreas Aardal Hanssen http://www.andreas.hanssen.name/gpg

Timo Sirainen

11:17 p.m.

On Mon, 2003-02-17 at 22:07, Andreas Aardal Hanssen wrote:

...

It seems the qmail community disagrees with you here - I followed the discussion on the qmail mailing list.

Do you have URL or something? I didn't find it.

...

...
...

use link and unlink, not rename

..and if it crashes (or loses NFS link) between those calls, you'll suddenly see two mails. I prefer atomic operations.

Well, then Dovecot has to have a way to communicate this to the sysadmin as an error condition. It's infinitely better than deleting innocent (blog) users' emails.

Reporting lost NFS link is possible, but what if the whole system crashes in the middle?

...

...
It only has requirement that the file name is unique. There's other ways to do that than time(NULL). http://cr.yp.to/proto/maildir.html lists some.

"A unique name has three pieces, separated by dots. On the left is the result of time() or the second counter from gettimeofday().

Yeah, I should actually read things instead of skipping to interesting part :)

...

All messages delivered to new/ in the same second will get a new middle-part name, sometimes with a _0, _1 ... _n next to the pid. When you move the messages from new/ to cur/ after one second, all new messages delivered to new/ will have a different time part. No collisions. But if you move the messages _before_ one second has lapsed, then a message in cur/ may be overwritten when you either

Waiting just has several annoying problems. Either you really wait for one second (ugh), ignore the file (complicates or breaks some new mail checks) or handle it just as if it were in cur/ already (complicates again, you'd have to check if the missing file was actually in new/).

...

Using link avoids the loss of emails, and allows the sysadmin to clean up the server's buggy behavior. rename means bye-bye.

When moving from tmp/ to new/, I agree that I should have used link() (fixed in CVS) since there's no harm if the unlink() in tmp didn't occur.

Moving from new/ to cur/ or renaming within cur/ is more tricky though. MUAs could see both of the mails at the same time which could lead to weird problems.

...

...
I don't think you should rely on checking timestamps in any case. What if the maildir is accessed via NFS and some other computer with different time created the file?

Maildir relies on timestamps - like it or not.

Only system-wide timestamps since the filename contains hostname part. Nothing should break by itself if two systems have completely different times.

Andreas Aardal Hanssen

11:31 p.m.

On 17 Feb 2003, Timo Sirainen wrote:

...

On Mon, 2003-02-17 at 22:07, Andreas Aardal Hanssen wrote:

...
It seems the qmail community disagrees with you here - I followed the discussion on the qmail mailing list.

Do you have URL or something? I didn't find it.

Starts here:

http://marc.theaimsgroup.com/?l=qmail&m=104250383122015&w=2

Sam of Courier-IMAP claimed that OpenBSD's random pid assignment broke qmail and Courier-IMAP. It certainly broke Courier-IMAP.

...

...
...
...

use link and unlink, not rename

..and if it crashes (or loses NFS link) between those calls, you'll suddenly see two mails. I prefer atomic operations.

Well, then Dovecot has to have a way to communicate this to the sysadmin as an error condition. It's infinitely better than deleting innocent (blog) users' emails.

Reporting lost NFS link is possible, but what if the whole system crashes in the middle?

Then you have either one A link, two links A and B, or one link B. If you have A and B, the Maildir clients should detect this and report it.

...

...
All messages delivered to new/ in the same second will get a new middle-part name, sometimes with a _0, _1 ... _n next to the pid. When you move the messages from new/ to cur/ after one second, all new messages delivered to new/ will have a different time part. No collisions. But if you move the messages _before_ one second has lapsed, then a message in cur/ may be overwritten when you either

Waiting just has several annoying problems. Either you really wait for one second (ugh), ignore the file (complicates or breaks some new mail checks) or handle it just as if it were in cur/ already (complicates again, you'd have to check if the missing file was actually in new/).

Yes! :-) Waiting one second is quite intolerable. "Ignoring" the message until it's one second old is the only way that I've found that works. That means that you need to make some precautions with APPEND to currently selected mailbox, but I'm not sure how this works in Dovecot. :-/
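The "ignore it until it is one second old" rule can be sketched as a check on the time part of the base name. The function name older_than_now() is hypothetical and this is not Binc IMAP's code; it only illustrates leaving alone any message whose time part is equal to or higher than the current time.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Returns 1 if the time part of a maildir base name is strictly older
   than `now`, 0 if it is equal or newer (leave the message in new/ for
   now), and -1 if the name does not start with a number and a dot. */
int older_than_now(const char *basename, time_t now)
{
    char *end;
    long stamp;

    assert(basename != NULL);
    stamp = strtol(basename, &end, 10);
    if (end == basename || *end != '.' || stamp < 0)
        return -1;
    return (time_t)stamp < now ? 1 : 0;
}
```

A scan of new/ would call this with time(NULL) and skip every entry for which it does not return 1.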

...

...
Using link avoids the loss of emails, and allows the sysadmin to clean up the server's buggy behavior. rename means bye-bye.

When moving from tmp/ to new/, I agree that I should have used link() (fixed in CVS) since there's no harm if the unlink() in tmp didn't occur.

Moving from new/ to cur/ or renaming within cur/ is more tricky though. MUAs could see both of the mails at the same time which could lead to weird problems.

Yes - if both mails are present, the client needs to 1) do something smart or 2) ignore the problem and let the sysadmin do something smart.

...

...
...
I don't think you should rely on checking timestamps in any case. What if the maildir is accessed via NFS and some other computer with different time created the file?

Maildir relies on timestamps - like it or not.

Only system-wide timestamps since the filename contains hostname part. Nothing should break by itself if two systems have completely different times.

True.. but I wish there was a better mailbox storage format out there without little buggers like this messing up the simple design.

Andy

-- Andreas Aardal Hanssen http://www.andreas.hanssen.name/gpg

Timo Sirainen

18 Feb

12:29 a.m.

On Mon, 2003-02-17 at 23:31, Andreas Aardal Hanssen wrote:

...

...
Do you have URL or something? I didn't find it.

Starts here:

http://marc.theaimsgroup.com/?l=qmail&m=104250383122015&w=2

Sam of Courier-IMAP claimed that OpenBSD's random pid assignment broke qmail and Courier-IMAP. It certainly broke Courier-IMAP.

Well, that helped some, except most of the suggested "fixes" didn't really help.

I can think of one clean solution for this: Make sure the base filename is unique by using inode and/or making sure that process trying to APPEND has existed at least for a second and will exist for at least the next second.

That still doesn't help if someone else screwed up and created multiple identical base names, but I'd really rather not use link+unlink. I think I will anyway make my maildir_sync() to check for basename conflicts and fix them.
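A basename-conflict check of the kind mentioned here could look roughly like the sketch below. It treats everything before the first ':' as the base name, which assumes that ':' only ever introduces the flags suffix. The function names are hypothetical and this is not the actual maildir_sync() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Length of the base name: everything before the first ':' (the start
   of the flags suffix), or the whole name if there is no ':'. */
size_t base_len(const char *name)
{
    return strcspn(name, ":");
}

/* Returns 1 if two file names share the same base name, else 0. */
int same_base(const char *a, const char *b)
{
    size_t la = base_len(a), lb = base_len(b);

    return la == lb && strncmp(a, b, la) == 0;
}

/* Returns the index of the first entry whose base name duplicates an
   earlier entry, or -1 if all base names are distinct. Quadratic, which
   is fine for a sketch but not for large mailboxes. */
int find_base_conflict(const char *const names[], size_t n)
{
    size_t i, j;

    assert(names != NULL || n == 0);
    for (i = 1; i < n; i++)
        for (j = 0; j < i; j++)
            if (same_base(names[i], names[j]))
                return (int)i;
    return -1;
}
```

A real implementation would more likely sort or hash the names gathered from readdir() and then rename one of the conflicting files to a fresh unique name.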

...

...
Moving from new/ to cur/ or renaming within cur/ is more tricky though. MUAs could see both of the mails at the same time which could lead to weird problems.

Yes - if both mails are present, the client needs to 1) do something smart or 2) ignore the problem and let the sysadmin do something smart.

I don't mean that as any failure condition. That could happen in everyday mail usage if there's multiple clients accessing the same mailbox. One client scanning the maildir at the same time as another client is updating mail flag. The same mail could show up twice with readdir() even if the other one is soon after unlink()ed.


participants (4)

Andreas Aardal Hanssen
Ian R. Justman
seth vidal
Timo Sirainen