Today morning I had an idea how to make indexes more scalable with multiple concurrent writers. As a side effect it also made the locking issues much more simpler. So I thought I'd go and rewrite all the indexing code, it had gotten way too ugly and difficult to understand and maintain.
I also thought to separate the index handling completely from mailbox handling. So there would be lib-index and lib-mailbox. lib-storage would then tie them together.
Mailbox API still feels somewhat dirty though.. I didn't want it to depend on lib-index and I didn't want to duplicate any lib-index functionality in it to avoid uselessly allocating memory, so I can't reference to mails directly with their sequence number or UID. So I'm passing (void *id, size_t id_len) around - with maildir that would be the filename and with mbox it would be current offset in file.
Suggestions welcome about both APIs.
Some of the changes in indexes and it's API:
- "modify log" -> "transaction log" which was the big change. All flag updates and expunges are first written to transaction log and only when committing it they will be moved into the main index.
The transaction writes require the file locked for only very short time, so with lots of concurrent writers it means that several sessions could have written their wanted changes to log file, but they're all blocking on getting them moved into main index. When one session finally gets a lock in the main lock file, it writes everyone's changes into it. If lock can't be acquired in a second or so, the main index will be rewritten into temp file and rename()d over.
This means that we can have tons of sessions reading and writing to same shared index file and no session can block others from reading or writing. Currently yoou could easily block changes for a very long time with eg. FETCH 1:* BODY.PEEK[] and reading it 1 bytes/sec.
mail_index will be a class interface, so it's implementation can be changed at runtime. In-memory indexing is going to be a totally different implementation this time with better memory usage. Could be useful for optimizing some special cases as well.
You'll get "views" into mailbox. A view will take care of keeping sequences synchronized according to what each client session thinks they are. Currently lib-storage does that in kind of ugly way. Hopefully this means that we can get rid of sequence parsing in lib-storage completely and move it into imap-only code.
You don't directly manipulate with locks and there's no rules of in which order you have to do something. You get views and transactions and use them.
You can have multiple views and transactions, so one opened index could be used to handle multiple client sessions. I'm not sure if that will ever be useful though :)
Error correction will be done automatically this time. Whenever some error is noticed, it will be fixed immediately and unless it changes syncing state, it doesn't force client to disconnect. Also it hopefully will be possible to change from disk-indexing to memory-indexing on the fly in case you run out of disk space.
I'm trying to get it NFS-safe..
mail-cache that I recently rewrote doesn't change. The bad thing is that it seems to be broken in some way and probably needs rewriting anyway.. And I'm not sure if the problem is just in the code, or if the design itself is somehow broken with it. I'm fearing that memory doesn't somehow keep up with changes by other sessions, since we're not locking cache file for reading..
Timo Sirainen wrote :
with maildir that would be the filename and with mbox it would be current offset in file.
Is it a good idea to use the current offset of a message in a file since there are (several) programs that rewrite entirely mailboxes ? I find it rather dirty to do this.
In UW-IMAP, an UID is added into the message headers.
-- DINH V. Hoa,
"Tu as lu combien de bandes dessinées ce mois-ci ? 13 Go"
On Sun, 2003-11-23 at 01:22, DINH Viet Hoa wrote:
with maildir that would be the filename and with mbox it would be current offset in file.
Is it a good idea to use the current offset of a message in a file since there are (several) programs that rewrite entirely mailboxes ? I find it rather dirty to do this.
In UW-IMAP, an UID is added into the message headers.
But if you give UID, there's no way to know where in the mbox file the message begins unless you store the UID -> offset in memory, which index already does, so why not just give the offset..
The point was that you have to have the mailbox locked and synchronized before trying to open any messages. Synchronizing updates the offsets and while mailbox is locked the offsets can't change.
Timo Sirainen wrote :
But if you give UID, there's no way to know where in the mbox file the message begins unless you store the UID -> offset in memory, which index already does, so why not just give the offset..
The point was that you have to have the mailbox locked and synchronized before trying to open any messages. Synchronizing updates the offsets and while mailbox is locked the offsets can't change.
sorry, I thought it was for external IMAP UID. These are, in fact, internal usage identifiers.
-- DINH V. Hoa,
"monde de merde" -- Erwan David
hi, it's an interesting read, but go back to a higher level for a moment. I like dovecot since it's use simple files as mail storage (not like cyrus). in case of dovecot if something happend I can easily switch to another imap server, which is one of the most important feature in a production enviroment (there should have to be a way of escape). other important feature is the speed (when we have a few hundred of mailboxes). some kind of indexing seems to be a good way of to do this. but I always feel that dovecot reinvent the wheel, since there is a dozen of database system which has nothing else to do just indexing (ok it's not true, but..). they probably do it in the right way (or at least we can find some) and they has a few years of experience. they do right the indexing, the locking, the transactions, etc.. so why we not use one realy fast and good database engine to index our mail storage? the only reason what I can accept in this case, that this is some very special type of database and dovecot can use such algorithm which suited to this problem better then a general indexing algorithms. is this true?
another thing which always come to my mind when think about speed: why we do the indexing when we look into the folders, since IMHO it'd be more efficient if we do it at the mail delivery time. the mail arrival is more balanced during the time, so the system load is more balanced. so there can be the delivery helper apps.
- one optional deivery helper application which can do the indexing during the deivery time,
- indexing during remove, move, copy etc. imap operations,
- and the current (eg, the new indexing engine) if someone do not use
just my 2c.
-- Levente "Si vis pacem para bellum!"
hi, it's an interesting read, but go back to a higher level for a moment. I like dovecot since it's use simple files as mail storage (not like cyrus). in case of dovecot if something happend I can easily switch to another imap server, which is one of the most important feature in a production enviroment (there should have to be a way of escape). other important feature is the speed (when we have a few hundred of mailboxes). some kind of indexing seems to be a good way of to do this. but I always feel that dovecot reinvent the wheel, since there is a dozen of database system which has nothing else to do just indexing (ok it's not true, but..). they probably do it in the right way (or at least we can find some) and they has a few years of experience. they do right the indexing, the locking, the transactions, etc.. so why we not use one realy fast and good database engine to index our mail storage? the only reason what I can accept in this case, that this is some very special type of database and dovecot can use such algorithm which suited to this problem better then a general indexing algorithms. is this true?
another thing which always come to my mind when think about speed: why we do the indexing when we look into the folders, since IMHO it'd be more efficient if we do it at the mail delivery time. the mail arrival is more balanced during the time, so the system load is more balanced. so there can be the delivery helper apps.
- one optional deivery helper application which can do the indexing during the deivery time,
- indexing during remove, move, copy etc. imap operations,
- and the current (eg, the new indexing engine) if someone do not use
just my 2c.
-- Levente "Si vis pacem para bellum!"
Farkas Levente wrote :
so why we not use one realy fast and good database engine to index our mail storage? the only reason what I can accept in this case, that this is some very special type of database and dovecot can use such algorithm which suited to this problem better then a general indexing algorithms. is this true?
What I do in libEtPan is to use a DB (Sleepycat/Berkeley) for that. It is rather efficient to store the index. I just need to add NFS safety by creating a dotlock file when accessing this file, and serialize data before before storing them in the database.
-- DINH V. Hoa,
etPan! - newsreader, mail user agent -- http://libetpan.sf.net/etpan
On Mon, 2003-11-24 at 14:19, Farkas Levente wrote:
but I always feel that dovecot reinvent the wheel, since there is a dozen of database system which has nothing else to do just indexing (ok it's not true, but..). they probably do it in the right way (or at least we can find some) and they has a few years of experience. they do right the indexing, the locking, the transactions, etc.. so why we not use one realy fast and good database engine to index our mail storage? the only reason what I can accept in this case, that this is some very special type of database and dovecot can use such algorithm which suited to this problem better then a general indexing algorithms. is this true?
Pretty much, yes. Databases normally do binary tree (or similiar) indexes. Sometimes hash and bitmap indexes. Dovecot doesn't really use any of these. Or, well, cache file would work pretty well in a database since it's for UID -> some cached message data lookups.
Rest of the indexes wouldn't work all that well in database though. For example modify/transaction log can be used to quickly figure out what another session changed in a mailbox (flags, expunges). How many databases allow you to easily and quickly look at transaction history? Well, it would be possible to create our own transaction log table for each mailbox, but it of course costs more.
Another problem is how to do message sequence -> UID lookups. With old indexes we're doing it in a quite difficult way, but with new indexes it's a simple array lookup in index file. I don't think any database allows this kind of queries, so what we have to constantly keep an array of all message UIDs in memory. Not that bad necessarily, but it's extra memory overhead.
Anyway, some day I will write SQL database support, but even if it was just Berkeley DB I bet there would be rather large memory, disk usage and CPU usage overhead compared to what we have now.
I'm not sure if there are lock contention problems with databases.
another thing which always come to my mind when think about speed: why we do the indexing when we look into the folders, since IMHO it'd be more efficient if we do it at the mail delivery time. the mail arrival is more balanced during the time, so the system load is more balanced. so there can be the delivery helper apps.
- one optional deivery helper application which can do the indexing during the deivery time,
- indexing during remove, move, copy etc. imap operations,
- and the current (eg, the new indexing engine) if someone do not use
Yep, this has been in TODO for a while. Once this is possible it's also simple to add support for UIDPLUS extension. And update mailbox quotas quickly and accurately. It's a bit difficult to implement to 0.99.10 code base, but should be pretty easy to add to current CVS. Once new indexing works, I'll add this immediately.
Timo Sirainen wrote:
On Mon, 2003-11-24 at 14:19, Farkas Levente wrote:
another thing which always come to my mind when think about speed: why we do the indexing when we look into the folders, since IMHO it'd be more efficient if we do it at the mail delivery time. the mail arrival is more balanced during the time, so the system load is more balanced. so there can be the delivery helper apps.
- one optional deivery helper application which can do the indexing during the deivery time,
- indexing during remove, move, copy etc. imap operations,
- and the current (eg, the new indexing engine) if someone do not use
Yep, this has been in TODO for a while. Once this is possible it's also simple to add support for UIDPLUS extension. And update mailbox quotas quickly and accurately. It's a bit difficult to implement to 0.99.10 code base, but should be pretty easy to add to current CVS. Once new indexing works, I'll add this immediately.
And this would be as easy as to write a stdin -> imap-append app to go with procmail etc? And this would not trigger reindexing, but rather just index the appended part, am I wrong?
/Jonas (just done that for java, but can rewrite it into c...)
On Tuesday, Nov 25, 2003, at 10:08 Europe/Helsinki, Jonas Bosson wrote:
Timo Sirainen wrote:
Yep, this has been in TODO for a while. Once this is possible it's also simple to add support for UIDPLUS extension. And update mailbox quotas quickly and accurately. It's a bit difficult to implement to 0.99.10 code base, but should be pretty easy to add to current CVS. Once new indexing works, I'll add this immediately.
And this would be as easy as to write a stdin -> imap-append app to go with procmail etc? And this would not trigger reindexing, but rather just index the appended part, am I wrong?
Yes, it would work that way once APPEND works that way.. Which is not yet.
participants (5)
-
DINH Viet Hoa
-
Farkas Levente
-
Farkas Levente
-
Jonas Bosson
-
Timo Sirainen