[Dovecot] Maildir unreliability
Geo Carncross
geocar-dovecot at internetconnection.net
Tue Oct 26 02:35:36 EEST 2004
On Mon, 2004-10-25 at 16:32, Timo Sirainen wrote:
> On 25.10.2004, at 23:11, Geo Carncross wrote:
>
> > This is nonsense. The problem is that the behavior of readdir() is
> > confusing.
> >
> > Why should unlink() or rename() invalidate data that your C library
> > ALREADY READ from the directory?
>
> Why do you think it was already read? It wasn't. That's the problem. An
> existing renamed file may never be returned by one opendir() ..
> readdir() .. closedir() loop.
Because strace says so. If you use getdents() directly, the problem
magnifies significantly [see below].
> > if (stat(d->d_name, &sb) == -1)continue;
> >
> > After your check for the "." in the first character of the d->d_name
> > (about line 41) and all will be good. No amount of twiddling with
> > USE_UNLINK or FILES is going to affect it.
>
> Right. Because the stat() always fails so the whole thing does nothing.
> If you actually do the correct check:
>
> sprintf(path, PATH"/%s", d->d_name);
> if (stat(path, &sb) < 0)
> continue;
>
> Then it's just as broken as before, but works more slowly.
Bah. I typed it correctly on my end :)
My problem here is I got lucky three times in a row, and misread your
post.
> > Of course. For readdir() to be atomic, it would need to do a system call
> > for each directory entry. This is exactly why readdir() doesn't, so that
> > you do one syscall for every (say) 50 entries, and if you want validity,
> > you'll do a stat() yourself.
>
> I don't have a problem with readdir() returning a file that doesn't
> exist anymore. I have a problem of readdir() not returning an existing
> file. The exact opposite.
But it _doesn't_ exist. The opendir() gets a file descriptor; we haven't
reached the old data yet, but the new data isn't necessarily put ahead of
our current offset (it isn't actually put anyplace reachable...).
I think I understand better what the problem is:
If you don't attempt to detect it internally (and instead just print the
file number, with :.* stripped off), you'll see that:
./readdir | sort -bn | uniq -c | sort -nr
produces entries LESS THAN 11, which you're hoping it wouldn't.
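The same bookkeeping can be sketched in C instead of a shell pipeline. This assumes the test files are named with a leading number and an optional ":..." suffix, as the stripping above suggests. file_number() and tally() are hypothetical helpers, not part of the test program.

```c
#include <stddef.h>
#include <stdlib.h>

/* Return the numeric part of a name such as "7:2,S", ignoring any
 * ":..." suffix, or -1 if the name is not a number optionally
 * followed by such a suffix. */
static long file_number(const char *name)
{
    char *end;
    long n;

    if (name[0] < '0' || name[0] > '9')
        return -1;
    n = strtol(name, &end, 10);
    if (*end != '\0' && *end != ':')
        return -1;
    return n;
}

/* Record one sighting per name in seen[]; numbers outside
 * 0..max-1 and non-numeric names are ignored. */
static void tally(const char *const *names, size_t count,
                  int *seen, long max)
{
    size_t i;

    for (i = 0; i < count; i++) {
        long n = file_number(names[i]);
        if (n >= 0 && n < max)
            seen[n]++;
    }
}
```

Calling tally() once per scan and then looking for slots in seen[] that are lower than the number of scans gives the same answer as reading the uniq -c output.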
The only ways the operating system could do this are:
1. be notified of name changes ([DI]notify - what courier does)
2. change the semantics of directories in the kernel such that NEW NAMES
always appear at the end of the directory.
#2 would be awful hard, but #1 could be handled right here, although it
wouldn't be portable [then, neither would #2, but read on...]
> > Now: Maildir quite obviously wasn't designed with IMAP in mind. IMAP has
> > some (largely ridiculous) requirements that Maildir simply doesn't make
> > easy.
>
> UIDs mostly.
Agreed.
{{ although if UIDs were 64-bit, or better still, simply numeric, AND
didn't have that always-incrementing rule, anything from an mbox file
offset to an inode number would be satisfactory. }}
> > The largest problem (with Maildir) is this renaming of file identifiers
> > and moving things in and out of cur/. It's only necessary so programs
> > don't have to open() in order to read flags (after all, they JUST did a
> > readdir())...
>
> Out of cur/? open() to read flags? I don't understand.
Many flags can be answered with the contents of d_name. This fact may
make programs like mailx and pop3d very simple, but it makes generating
a mapping between Mark Crispin's UID and filenames very complex.
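To make the d_name point concrete: in the Maildir naming convention the flags are single letters after a ":2," marker at the end of the file name (for example S for seen, R for replied, T for trashed). A small sketch of answering a flag query from the name alone, with has_flag() being an illustrative helper rather than any existing API:

```c
#include <string.h>

/* Return 1 if a Maildir file name carries the given flag letter in
 * its ":2," info suffix (e.g. 'S' for seen), 0 otherwise. */
static int has_flag(const char *name, char flag)
{
    const char *info = strstr(name, ":2,");

    if (info == NULL || flag == '\0')
        return 0;       /* no flags stored in this name */
    return strchr(info + 3, flag) != NULL;
}
```

No open() or read() is needed, which is exactly what makes the simple clients simple; the cost is that every flag change is a rename().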
> > Since the names aren't going to change in cur/, you can get away with
> > just doing a stat() in there [[ after all, you just rename()'d it into
> > cur/ if you're working on new ]]
> >
> > Unfortunately, cur/ is often bigger than new/.
>
> Are you trying to say that files wouldn't be allowed to be renamed
> inside cur/ to change their flags?
No. I'm saying they don't have flags in new/. The real, legitimate
problem can't happen in new/ - only in cur/ because messages aren't
renamed within new/.
Although it wouldn't be compatible with djb-Maildir, it would certainly
avoid this problem if rename() were never called in cur/....
[[ surely there are other places to store flags... ]]
--
Geo Carncross <geocar at internetconnection.net>
Internet Connection Reliable Web Hosting
http://www.internetconnection.net/