[Dovecot] Maildir unreliability

Geo Carncross geocar-dovecot at internetconnection.net
Tue Oct 26 02:35:36 EEST 2004


On Mon, 2004-10-25 at 16:32, Timo Sirainen wrote:
> On 25.10.2004, at 23:11, Geo Carncross wrote:
> 
> > This is nonsense. The problem is that the behavior of readdir() is
> > confusing.
> >
> > Why should unlink() or rename() invalidate data that your C library
> > ALREADY READ from the directory?
> 
> Why do you think it was already read? It wasn't. That's the problem. An 
> existing renamed file may never be returned by one opendir() .. 
> readdir() .. closedir() loop.

Because strace says so. If you use getdents() directly, the problem
magnifies significantly [see below].

> > 		if (stat(d->d_name, &sb) == -1)continue;
> >
> > After your check for the "." in the first character of the d->d_name
> > (about line 41) and all will be good. No amount of twiddling with
> > USE_UNLINK or FILES is going to affect it.
> 
> Right. Because the stat() always fails so the whole thing does nothing. 
> If you actually do the correct check:
> 
>                  sprintf(path, PATH"/%s", d->d_name);
>                  if (stat(path, &sb) < 0)
>                          continue;
> 
> Then it's just as broken as before, but works more slowly.

Bah. I typed it correctly on my end :)

My problem here is I got lucky three times in a row, and misread your
post.

> > Of course. For readdir() to be atomic, it would need to do a system
> > call for each directory entry. This is exactly why readdir() doesn't,
> > so that you do one syscall for every (say) 50 entries, and if you want
> > validity, you'll do a stat() yourself.
> 
> I don't have a problem with readdir() returning a file that doesn't 
> exist anymore. I have a problem of readdir() not returning an existing 
> file. The exact opposite.

But it _doesn't_ exist. The opendir() gets a file descriptor; we haven't
reached the old data yet, but the new data isn't necessarily put ahead of
our current offset (it isn't actually put anyplace reachable...).

I think I understand better what the problem is:

If you don't attempt to detect it internally (and instead just print the
file number, with the :.* stripped off), you'll see that:
	./readdir | sort -bn | uniq -c | sort -nr

produces entries LESS THAN 11, which you're hoping it wouldn't.
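
The ":.* stripped off" step is just cutting the name at the first colon so that the same message is counted under one key no matter what flags it carries at the moment. A small sketch, with a helper name (maildir_base) that I made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: copy the part of a Maildir filename that comes
 * before the first ':' into `out`, truncating if necessary.
 * Returns the number of characters copied (not counting the NUL). */
size_t maildir_base(const char *d_name, char *out, size_t outlen)
{
    size_t n = strcspn(d_name, ":");  /* length up to ':' or end of string */

    if (outlen == 0)
        return 0;
    if (n >= outlen)
        n = outlen - 1;
    memcpy(out, d_name, n);
    out[n] = '\0';
    return n;
}
```

Printing this key once per entry on every pass and feeding the output to the sort/uniq pipeline gives one count per message.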

The only ways the operating system could do this are:
	1. be notified of name changes ([DI]notify - what courier does)
	2. change the semantics of directories in the kernel such that NEW
	   NAMES always appear at the end of the directory.

#2 would be awfully hard, but #1 could be handled right here, although it
wouldn't be portable [then, neither would #2, but read on...]



> > Now: Maildir quite obviously wasn't designed with IMAP in mind. IMAP
> > has some (largely ridiculous) requirements that Maildir simply doesn't
> > make easy.
> 
> UIDs mostly.

Agreed.

{{ although if UIDs were 64-bit, or better still, simply numeric, AND
didn't have that always-incrementing rule, anything from an mbox file
offset to an inode number would be satisfactory. }}

> > The largest problem (with Maildir) is this renaming of file identifiers
> > and moving things in and out of cur/. It's only necessary so programs
> > don't have to open() in order to read flags (after all, they JUST did a
> > readdir())...
> 
> Out of cur/? open() to read flags? I don't understand.

Many flags can be answered with the contents of d_name. This fact may
make programs like mailx and pop3d very simple, but it makes generating
a mapping between Mark Crispin's UID and filenames very complex.
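
To show what "answered with the contents of d_name" means: in Maildir the flags live in the filename after ":2," as single letters (D, F, P, R, S, T), so a client can test one without opening the file. A minimal sketch; the helper name maildir_has_flag() is mine:

```c
#include <string.h>

/* Hypothetical helper: return 1 if `flag` (e.g. 'S' for seen) appears in
 * the ":2,<flags>" suffix of a Maildir filename, 0 otherwise. */
int maildir_has_flag(const char *d_name, char flag)
{
    const char *info = strstr(d_name, ":2,");

    /* no info suffix (e.g. a file still in new/) means no flags */
    if (info == NULL || flag == '\0')
        return 0;

    return strchr(info + 3, flag) != NULL;
}
```

The flip side is exactly the problem in this thread: changing a flag means a rename(), so the name an IMAP server mapped to a UID a moment ago may already be gone.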


> > Since the names aren't going to change in cur/, you can get away with
> > just doing a stat() in there [[ after all, you just rename()'d it into
> > cur/ if you're working on new ]]
> >
> > Unfortunately, cur/ is often bigger than new/.
> 
> Are you trying to say that files wouldn't be allowed to be renamed 
> inside cur/ to change their flags?

No. I'm saying they don't have flags in new/. The real, legitimate
problem can't happen in new/ - only in cur/ because messages aren't
renamed within new/.

Although it wouldn't be compatible with djb-Maildir, it would certainly
avoid this problem if rename() were never called in cur/....

[[ surely there are other places to store flags... ]]


-- 
Geo Carncross <geocar at internetconnection.net>
Internet Connection Reliable Web Hosting
http://www.internetconnection.net/



