[Dovecot] Webmail project : to cache or not to cache emails

Michael M Slusarz slusarz at curecanti.org
Mon Jun 13 19:56:34 EEST 2011


Quoting Timo Sirainen <tss at iki.fi>:

> On Thu, 2011-06-09 at 14:45 +0200, Vincent Richomme wrote:
>> Hi,
>>
>> I plan to develop a new web service where one of its module is a
>> webmail and I am thinking about
>> some implementation details. From a interface point of view I wanted to
>> adopt the same logic a outlook(desktop) or yahoo webmail
>> and not split emails into pages. I mean I just want a single datatable
>> view where user can scroll to
>> see his old messages and data are updated only when user release the
>> vertical scrollbar.

We have implemented this in IMP.  Best of luck - it is not an easy  
thing to implement correctly.

>> When I look at current webmail (roundcube for instance) I can see that
>> a cache is used to stored
>> emails headers and I was wondering if it was really necessary ?
>
> You mean Roundcube's own local cache?

For webmail, our theory is that caching is pretty much only important  
when viewing the mailbox list.  There is an expectation from a user  
that when viewing a message, it may take a bit of time to grab the  
data and render.  Caching body data and full header text for all users
and all messages, by contrast, will most likely cause your cache to fill
too quickly while providing little practical benefit.

So instead focus on caching for the mailbox list.  We cache all  
envelope data, imapdate, size, and flags.  We also cache *specific*  
headers that are unobtainable from envelope data, but are used when  
rendering the message list (e.g. X-Priority, List Headers).  But there  
is no need to cache the entire header text - some of it would be  
duplicative (e.g. envelope headers), and a good chunk of the remaining  
data is worthless for most use cases (e.g. Received).
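
To make the point above concrete, here is a minimal Python sketch of what
"cache only what the list view needs" could look like.  It is not IMP's
actual code.  The names LIST_VIEW_HEADERS, build_list_fetch_items and
CachedMessage are hypothetical, the choice of List-* headers is an
assumption, and "imapdate" is assumed to correspond to the IMAP
INTERNALDATE item.  The FETCH item names themselves are standard IMAP4rev1.

```python
from dataclasses import dataclass, field

# Assumption: the extra headers the message list view needs beyond ENVELOPE.
LIST_VIEW_HEADERS = ("X-PRIORITY", "LIST-ID", "LIST-POST")


def build_list_fetch_items(extra_headers=LIST_VIEW_HEADERS):
    """Build the FETCH data items for the mailbox list view.

    BODY.PEEK is used so that fetching headers never sets \\Seen.
    The full header text and the body are deliberately not requested.
    """
    header_fields = " ".join(extra_headers)
    return (
        "(ENVELOPE INTERNALDATE RFC822.SIZE FLAGS "
        f"BODY.PEEK[HEADER.FIELDS ({header_fields})])"
    )


@dataclass
class CachedMessage:
    """One cache entry per message, keyed by UID (hypothetical layout)."""
    uid: int
    envelope: dict            # parsed ENVELOPE (subject, from, to, ...)
    internal_date: str        # INTERNALDATE as returned by the server
    size: int                 # RFC822.SIZE in bytes
    flags: frozenset = frozenset()
    list_headers: dict = field(default_factory=dict)  # e.g. {"X-PRIORITY": "1"}
```

The resulting string would be sent as the item list of a UID FETCH for
whatever UID range the list view is about to display.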

Finally, we cache flags.  But note: you absolutely MUST implement  
CONDSTORE support on your client or else mailbox caching is  
essentially worthless.  A client that has to grab the flag data on  
every connection pretty much eliminates all benefits gained from  
caching in the first place.  Most other webmail implementations claim  
they cache, but they have no CONDSTORE support, so their caching is  
either broken (flag changes from other clients don't appear) or of  
limited value (since an IMAP server may need to parse through the  
entire list of requested messages to grab this information, which may  
be a slow operation if using something like mboxes on the storage side).
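
A rough Python sketch of the CONDSTORE idea described above: remember the
mailbox's HIGHESTMODSEQ next to the cached flags, and on the next
connection ask only for flags that changed since that value.  FlagCache
and its methods are hypothetical names, and the network side (sending the
command, parsing the responses) is left out.  The CHANGEDSINCE fetch
modifier is the one defined by the CONDSTORE extension.

```python
class FlagCache:
    """Cached flags for one mailbox plus the mod-sequence they are valid for."""

    def __init__(self):
        self.highest_modseq = 0   # 0 means "nothing cached yet"
        self.flags = {}           # uid -> frozenset of flag strings

    def resync_command(self):
        """Return the command needed to bring the cached flags up to date."""
        if self.highest_modseq == 0:
            # Cold cache: there is no baseline, so fetch every flag once.
            return "UID FETCH 1:* (FLAGS)"
        # Warm cache: the server returns only messages whose metadata
        # changed after the mod-sequence value we stored.
        return f"UID FETCH 1:* (FLAGS) (CHANGEDSINCE {self.highest_modseq})"

    def apply_changes(self, changes, new_highest_modseq):
        """Merge parsed FETCH results (uid -> flags) into the cache."""
        for uid, flags in changes.items():
            self.flags[uid] = frozenset(flags)
        self.highest_modseq = max(self.highest_modseq, new_highest_modseq)


cache = FlagCache()
print(cache.resync_command())
cache.apply_changes({1: ["\\Seen"], 2: []}, new_highest_modseq=42)
print(cache.resync_command())
```

Note that this covers flag changes only.  Detecting messages expunged by
other clients needs separate handling (for example QRESYNC, or comparing
UID lists), and the whole cache must be discarded if UIDVALIDITY changes.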

>> Once a user has passed the login process I would like to retrieve only
>> the emails that will be displayed
>> (actually a bit more, I am thinking of 150) but I am wondering if it's
>> a good idea to not use a cache for headers.
>> For instance if I have thousands of users on the same machine, will it
>> support it ? Will it be fast enough ?
>> Of course I will try by myself the different options but would be
>> curious to have some opinions.
>
> Dovecot's cache is also pretty fast. But then again it is easier to
> scale web servers than IMAP servers by just adding more servers.

I'll admit that a webmail cache is not as important when used with a
server that already supports caching natively (Dovecot, Cyrus).
However, you still gain benefits because the local cache is in a  
format that is directly usable by the client program - in other words,  
you save on reparsing the IMAP data into a local data structure.

But a webmail cache is a necessity when using IMAP servers that don't  
natively cache (e.g. Courier).

michael


