[Dovecot] 1.0-test23 and caching decisions
Mostly mbox fixes. I began some other things, but none of them actually work yet:
- Use X-UIDL header in mboxes for POP3 UIDL list
- Cache file compression
- Some cleverness in the cache file for figuring out what to cache
Well, I might as well explain the last one now.
Users can be divided into three groups:
1. Most users use only a single IMAP client which caches everything locally. For these users it's quite pointless to do any kind of caching, as it only wastes disk space and may also cause more disk I/O.
2. Some users use multiple IMAP clients which cache everything locally. These could benefit from caching until all clients have fetched the data. After that it's useless.
3. Some clients don't do permanent local caching at all, for example Pine and webmail clients. These clients would benefit from caching everything. Some locally caching clients might also fetch some data from the server again, such as when searching messages. They could benefit from caching only those fields.
After thinking about these for a while, I figured out that the people who care most about performance will be using the upcoming Dovecot LDA anyway, which updates the indexes/cache immediately. In that case even the first user group would benefit from caching in the same way as the second group. The LDA reads the mail anyway, so it might as well extract some information from it and store it in the cache.
So, groups 1 and 2 could be optimally served by keeping things cached only for a while. I thought a week would be good. When the cache file is compressed, everything older than a week will be dropped.
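A minimal sketch of that compression step, in Python rather than Dovecot's own code. The record layout and the names `CacheRecord` and `compress` are hypothetical, and the sketch assumes that "older than a week" is measured from the time the message arrived, which the text above doesn't spell out:

```python
from dataclasses import dataclass

WEEK_SECONDS = 7 * 24 * 60 * 60  # the one-week retention period suggested above


@dataclass
class CacheRecord:
    uid: int          # message UID (hypothetical record layout)
    received_at: int  # Unix time the message arrived (assumed age reference)
    data: bytes       # cached field data


def compress(records, now, max_age=WEEK_SECONDS):
    """Return a new record list without the records older than max_age seconds."""
    return [r for r in records if now - r.received_at <= max_age]
```

Rewriting the surviving records into a fresh list is what makes this a compression pass: the dropped records simply never get copied.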
But how do we figure out whether a user is in group 3? One quite easy rule would be to see if the client is accessing messages older than a week. But with only that rule we might have already dropped useful cached data. It's not very nice if we have to read and cache it twice.
Most locally caching clients always fetch new messages (all but the body) when they see them, and they fetch them in ascending order. Non-caching clients might fetch messages in pretty much any order, as they usually don't fetch everything they can, only what's visible on screen. Some will use server-side sorting/threading, which also causes messages to be fetched in random order. The second rule would then be that if a session doesn't fetch messages in ascending order, the fetched field type will be permanently cached.
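The second rule could be sketched roughly like this in Python. `SessionTracker` and its method names are made up for illustration, and comparing each fetch only against the previous UID (rather than the highest UID seen so far) is an assumption:

```python
class SessionTracker:
    """Tracks fetch order within one client session (hypothetical helper)."""

    def __init__(self):
        self.previous_uid = 0          # UID of the previous fetch, 0 = none yet
        self.permanent_fields = set()  # field types to cache permanently

    def on_fetch(self, uid, fields):
        # A UID lower than the previous one means the session is not
        # fetching in ascending order, so mark the fetched fields.
        if uid < self.previous_uid:
            self.permanent_fields.update(fields)
        self.previous_uid = uid
```

A client filling its local cache with UIDs 1, 2, 3 leaves `permanent_fields` empty, while a webmail session that fetches UID 10 and then UID 4 marks whatever fields it asked for in the second fetch.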
So, we have three caching decisions:
- Don't cache: Clients have never wanted the field
- Cache temporarily: Clients want this only once
- Cache permanently: Clients want this more than once
Different mailboxes have different decisions. Different fields have different decisions.
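As a rough Python sketch of how these per-mailbox, per-field decisions might be stored and updated by the two rules above. The names `Decision` and `DecisionTable` are hypothetical, and moving a never-wanted field straight to "cache temporarily" on its first access is an assumption:

```python
import enum

WEEK_SECONDS = 7 * 24 * 60 * 60


class Decision(enum.Enum):
    NO = "don't cache"
    TEMP = "cache temporarily"
    YES = "cache permanently"


class DecisionTable:
    """One decision per (mailbox, field) pair; unknown pairs are not cached."""

    def __init__(self):
        self._decisions = {}

    def get(self, mailbox, field):
        return self._decisions.get((mailbox, field), Decision.NO)

    def record_access(self, mailbox, field, message_age, in_order=True):
        current = self.get(mailbox, field)
        if message_age > WEEK_SECONDS or not in_order:
            new = Decision.YES   # rule 1 (old message) or rule 2 (out of order)
        elif current is Decision.NO:
            new = Decision.TEMP  # first time any client wanted this field
        else:
            new = current        # nothing new learned
        self._decisions[(mailbox, field)] = new
        return new
```

Keying on the pair keeps the decision for a field in one mailbox from leaking into another mailbox.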
There are some problems. For example, if a client accesses a message older than a week, we can't know whether the user just started using a new client which is filling its local cache for the first time, or whether it's a client the user just hasn't used for over a week. In these cases we shouldn't have marked the field to be permanently cached. The user might also switch clients from non-caching to caching.
So we should re-evaluate our caching decisions from time to time. This is done by checking the above rules constantly and recording the last time the decision was right. If the decision hasn't matched for two months, it's changed. I picked two months because people go on vacations of at least a month, during which they might still be reading mail, but with different clients.
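The re-evaluation could look something like the following Python sketch. It only shows the case of demoting a permanent decision; `FieldDecision`, `confirm` and `reevaluate` are made-up names, and treating "two months" as 60 days is an assumption:

```python
from dataclasses import dataclass

TWO_MONTHS = 60 * 24 * 60 * 60  # "two months" approximated as 60 days


@dataclass
class FieldDecision:
    permanent: bool      # True = cache permanently, False = cache temporarily
    last_confirmed: int  # Unix time the rules last agreed with the decision


def confirm(decision, now):
    """Call when a session's behaviour matches the current decision."""
    decision.last_confirmed = now


def reevaluate(decision, now):
    """Demote a permanent decision that hasn't been confirmed for two months."""
    if decision.permanent and now - decision.last_confirmed > TWO_MONTHS:
        decision.permanent = False
        decision.last_confirmed = now
    return decision
```

With this, a field marked permanent by a client that was only filling its cache for the first time falls back to temporary caching after two months without any further out-of-order or old-message access.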
Timo Sirainen