On Fri, 2005-09-23 at 21:48 -0400, Tom Metro wrote:
Timo Sirainen wrote:
Index File
[...] The file is modified by first creating an exclusively owned "index.lock" file, updating it and then rename()ing it over the index file. This lock file shouldn't be held for a long time, so if it exists and hasn't been modified at all within 30 seconds, it can be overwritten. Before rename()ing over the index file, you should check with stat() that the index.lock file is still the same as you expect it to be (in case your process was temporarily stopped for over 30 secs).
The locking and handling of the index strikes me as a regression back to the mbox problems that Maildir tried to solve.
Replying a bit late, but anyway..
I don't think the above index.lock is a problem in any way. Actually Dovecot already uses a similar method with Maildir's dovecot-uidlist file.
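To make the quoted index.lock scheme concrete, here is a rough Python sketch of it. This is not Dovecot's code: the function names acquire_lock() and commit_index() are made up for illustration, and treating "overwritten" as unlink-and-retry is just one possible reading of the description. It assumes a POSIX filesystem.

```python
import os
import time

STALE_AFTER = 30  # seconds without modification before a lock may be overridden


def acquire_lock(lock_path, stale_after=STALE_AFTER, poll=0.1):
    """Hypothetical helper: create lock_path exclusively and return its fd."""
    while True:
        try:
            return os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except FileExistsError:
            pass
        try:
            st = os.stat(lock_path)
        except FileNotFoundError:
            continue  # the holder just finished, try again right away
        if time.time() - st.st_mtime > stale_after:
            # Assumption: override a stale lock by unlinking it and retrying.
            # Two overriders can still race here; real code needs more care.
            try:
                os.unlink(lock_path)
            except FileNotFoundError:
                pass
            continue
        time.sleep(poll)


def commit_index(fd, lock_path, index_path, data):
    """Hypothetical helper: write data into the lock file, then rename it."""
    try:
        view = memoryview(data)
        while view:  # loop in case of short writes
            view = view[os.write(fd, view):]
        os.fsync(fd)
        mine = os.fstat(fd)
        try:
            current = os.stat(lock_path)
        except FileNotFoundError:
            current = None
        # The stat() check: is the file at lock_path still the one we created?
        if current is None or (current.st_dev, current.st_ino) != (mine.st_dev, mine.st_ino):
            raise RuntimeError("index.lock was overridden, aborting update")
        os.rename(lock_path, index_path)
    finally:
        os.close(fd)
```

The check in commit_index() is done while the descriptor is still open, so the inode can't be reused by a newer lock file in between.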
Maildir is lockless only in theory. Unless the maildir is globally locked while checking its contents, files may get temporarily lost with all of the filesystems that I know of.
I think the locking is only a problem if you're holding the lock for a long time. For example, if you need to keep the mailbox locked for as long as some IMAP client is reading/writing messages, that's bad. That's a problem with mbox, but not with maildir/dbox.
dbox's index.lock file needs to exist only in two situations:
1. When new message UIDs need to be allocated (as the final step of saving new mail(s) to the mailbox). A global lock is needed for this with any kind of mail storage with IMAP, since UIDs must exist and they must always grow.
2. When writing message flag changes / expunging mails. These changes should go pretty quickly as well.
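As an illustration of the first case, UID allocation under the lock could look roughly like this. The index format here (a single text line holding the next free UID) and the name allocate_uids() are invented for the example, and stale-lock handling is left out to keep it short.

```python
import os
import time


def allocate_uids(index_path, count, poll=0.05):
    """Hypothetical sketch: reserve `count` consecutive, always-growing UIDs."""
    lock_path = index_path + ".lock"
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
            break
        except FileExistsError:
            time.sleep(poll)  # someone else is allocating, wait for them
    try:
        try:
            with open(index_path) as f:
                next_uid = int(f.read().strip())
        except FileNotFoundError:
            next_uid = 1  # empty mailbox: the first UID handed out is 1
        # The new "next UID" goes into the lock file, which then becomes the index.
        os.write(fd, f"{next_uid + count}\n".encode())
        os.fsync(fd)
        os.rename(lock_path, index_path)
    except BaseException:
        os.unlink(lock_path)  # don't leave the lock behind on failure
        raise
    finally:
        os.close(fd)
    return list(range(next_uid, next_uid + count))
```

The lock is held only for one small read, one small write and a rename, which is why it shouldn't become a bottleneck the way a long-held mbox lock does.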
Have you considered other approaches, such as having the index be under control of a daemon, and use IPC to communicate events to that daemon, which could then exclusively handle modifying the file?
One of the biggest reasons for dbox's existence is that it needs to work well with clustered filesystems. And relying on a daemon running on only one computer kind of defeats the cluster's purpose.
Anyway I'm not sure how that would actually benefit anything. A single process could be a bottleneck if it handled all users' indexes, so it should be able to scale to multiple processes. And to avoid locking in those cases, each process should handle only a group of specific users. And if we're going there, it might as well be the imap process itself that does it all.
Redirecting all imap connections for one user to the same imap process wouldn't be too difficult to implement (and it's been on my mind for a while already), but having pop3 and dovecot-lda also in the same process could get trickier.
But does it really matter if the locking is handled by serialization (by making everything go through a single process) or actual locking? If there's only a single writer, it always gets the lock immediately. If there are multiple writers, you'll need to wait in both cases. Although I suppose serialization provides fairer scheduling.
And even if there was only a single process updating the index file, I'd probably still make it update the index using the exact same rename(index.lock, index) to make sure the file doesn't get corrupted in case of crashes :)
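A small Python sketch of why the rename() approach keeps the index consistent: the new contents are written to a separate file, and the index name only ever points at a complete file. The names replace_index() and demo() are made up, and this assumes POSIX rename() semantics (replacing an existing target in one step).

```python
import os


def replace_index(index_path, data):
    """Hypothetical helper: write data to index.lock, then rename it over index."""
    lock_path = index_path + ".lock"
    fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
    try:
        os.write(fd, data)
        os.fsync(fd)  # get the data on disk before it becomes the index
    finally:
        os.close(fd)
    # A crash before this line leaves the old index untouched.
    os.rename(lock_path, index_path)


def demo(directory):
    index = os.path.join(directory, "index")
    replace_index(index, b"old contents")
    reader = open(index, "rb")  # a reader opens the index without any lock
    replace_index(index, b"new contents")
    seen_by_old_reader = reader.read()  # still the complete old file
    reader.close()
    with open(index, "rb") as f:
        seen_by_new_reader = f.read()  # the complete new file
    return seen_by_old_reader, seen_by_new_reader
```

A process that already had the index open keeps reading the old version in full, and anyone opening it afterwards gets the new version in full. Nobody sees a half-written file.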
Any way you slice it, though, these are just approximations of a database server. Maybe embedding SQLite (just for indexes) is the answer?
Doesn't look like SQLite's locking (or writing in general) is in any way cheaper than what I'm currently doing:
http://www.sqlite.org/lockingv3.html
Actually it looks like it may even block on getting a shared lock if there are writes happening at the same time. I think many other databases don't block there but instead use a rollback file to get the old data. Dovecot's index files and dbox's index files don't need to be read-locked at all, so they never block on reading.
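The blocking can be reproduced with Python's sqlite3 module, as a rough sketch. The table name and the function name are made up, and BEGIN EXCLUSIVE is used here to force the exclusive lock for the whole transaction. In a normal write transaction the exclusive lock would be held only while the changes are written to the database file. This assumes a fresh database in SQLite's default rollback-journal mode.

```python
import sqlite3


def reader_blocked_during_exclusive(db_path):
    """Hypothetical demo: does a reader fail while a writer holds EXCLUSIVE?"""
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    writer = sqlite3.connect(db_path, isolation_level=None)
    writer.execute(
        "CREATE TABLE IF NOT EXISTS msgs (uid INTEGER PRIMARY KEY, flags TEXT)"
    )
    writer.execute("BEGIN EXCLUSIVE")
    writer.execute("INSERT INTO msgs (flags) VALUES ('Seen')")

    # timeout=0: fail immediately instead of waiting for the lock.
    reader = sqlite3.connect(db_path, timeout=0)
    try:
        reader.execute("SELECT count(*) FROM msgs").fetchone()
        blocked = False
    except sqlite3.OperationalError:
        blocked = True  # "database is locked"
    finally:
        reader.close()
        writer.execute("COMMIT")
        writer.close()
    return blocked
```

With a non-zero timeout the reader would wait and retry instead of failing, which is the blocking on reads that the index files avoid.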
SQLite could be useful for storing messages' and mailboxes' metadata (ANNOTATE, ANNOTATEMORE extension) since those require almost a full database to work properly, but I don't think it's a good idea for what Dovecot/dbox currently uses index files for. SQL in general isn't very well suited for them.