On 7.10.2012, at 0.32, Peer Heinlein wrote:
> Several times already we have had the problem that accounts with more than 1.3 or 1.7 million e-mails in one folder run out of memory, even if a vsize_limit of 750 MB is set.
> In these cases, the lmtpd process was unable to allocate more memory to read/write/update the index files and crashed (and the index files ended up corrupted).
I don't think the dovecot.index file is much of a problem. With 1M mails it usually only takes something like 8-32 MB of memory, depending on which mailbox format is used. The dovecot.index.log file doesn't depend on the mailbox size at all. The main problem is the dovecot.index.cache file.
I've thought about the cache file problems earlier too, but it's a bit difficult to figure out the best solution. And since nobody had actually complained about it, I hadn't really done anything about it. I also hadn't previously thought of LMTP/LDA processes crashing because of it; that's a bigger problem than an IMAP process crashing. Although I think you're getting a lot more "mmap(dovecot.index.cache) failed: Out of memory" errors than crashes for large mailboxes?
So, subproblems related to this:
1. Filling up dovecot.index.cache too easily. A rather simple approach that would catch all the possible ways would be to limit the max. size of a single message's cache entry to X kilobytes (64?). If an entry becomes larger, it's simply not written to the cache file.
2. Filling up memory too easily. If a long header is to be cached or used for other purposes (e.g. Message-ID), it's still fully read into memory. Add some reasonable limit to the max. length of a single header. It can't be too small, because some headers are legitimately pretty long (DKIM and such). Maybe something like 10 kB would be safe enough for everyone?
3. If the existing dovecot.index.cache is larger than X MB, shrink it below X first. Shrinking could begin by trying the nice way of removing only unneeded data, but if that fails it could just forcibly remove some old messages. The X would have to be related to the process's VSZ limit.
4. Dovecot currently doesn't close index files immediately when a mailbox is closed, because it assumes that IMAP clients might reopen the index soon anyway. Max 3 indexes can be kept open, so three different very large indexes open at once can already be too much. I'm not sure if this is actually useful at all. Maybe I should disable it for LMTP, or maybe just remove it completely.
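To make the first three subproblems more concrete, here is a rough sketch in Python (not Dovecot code). The constant names, the function names and the in-memory model of the cache (a dict of UID to cached fields) are all assumptions made up for illustration; only the 64 kB and 10 kB figures come from the text above, and even those are just the tentative values mentioned there.

```python
# Hypothetical limits, taken from the numbers floated above; not real Dovecot settings.
MAX_CACHE_ENTRY_BYTES = 64 * 1024   # per-message cache entry cap
MAX_HEADER_BYTES = 10 * 1024        # per-header cap


def should_cache_entry(entry, limit=MAX_CACHE_ENTRY_BYTES):
    """Oversized entries are simply not written to the cache."""
    return len(entry) <= limit


def read_header_capped(stream, limit=MAX_HEADER_BYTES):
    """Read one (possibly folded) header from an io.BufferedReader.

    At most `limit` bytes are kept; the rest of the header is consumed in
    small chunks and discarded. Returns (kept_bytes, was_truncated).
    """
    kept = bytearray()
    truncated = False
    while True:
        chunk = stream.readline(4096)   # bounded read, even for huge lines
        if not chunk:
            break                       # end of stream
        room = limit - len(kept)
        if len(chunk) > room:
            truncated = True
        kept += chunk[:room]
        # A header continues only if the next line starts with space or tab.
        if chunk.endswith(b"\n") and stream.peek(1)[:1] not in (b" ", b"\t"):
            break
    return bytes(kept), truncated


def cache_size(records):
    """Total payload bytes in a toy cache: {uid: {field_name: bytes}}."""
    return sum(len(v) for fields in records.values() for v in fields.values())


def shrink_cache(records, wanted_fields, max_bytes):
    """Shrink a toy cache to at most `max_bytes` of payload."""
    if cache_size(records) <= max_bytes:
        return {uid: dict(fields) for uid, fields in records.items()}
    # Nice way first: keep only the fields that are still wanted.
    slim = {
        uid: {name: v for name, v in fields.items() if name in wanted_fields}
        for uid, fields in records.items()
    }
    # Forced way: drop the oldest messages (lowest UIDs) until it fits.
    for uid in sorted(slim):
        if cache_size(slim) <= max_bytes:
            break
        del slim[uid]
    return slim
```

The header reader expects something like `io.BufferedReader(io.BytesIO(data))`, because it relies on `peek()` to detect continuation lines. Dropping the lowest UIDs first is just one possible reading of "remove some old messages".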
The 3rd part is what I like changing the least. An alternative solution would be to just not map the entire cache file into memory all at once. The code was actually originally designed to do just that, but munmap()ing + mmap()ing again wasn't very efficient. But for LMTP there's really no need to map the whole file. All it really wants is to read a couple of header records and then append to the file. Maybe it could use an alternative code path that would simply do that instead of mmap()ing anything. It wouldn't solve the problem for IMAP though.
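As an illustration of that alternative code path, the following Python sketch reads a small fixed-size header and appends a record using plain file I/O, so memory use stays independent of the file size. The header layout (`HEADER_FMT`), the length-prefixed record format and the function names are invented for this example and do not match the real dovecot.index.cache format.

```python
import os
import struct

# Hypothetical header: (format version, record count). NOT the real layout.
HEADER_FMT = "<II"
HEADER_SIZE = struct.calcsize(HEADER_FMT)


def read_cache_header(path):
    """Read only the fixed-size header; the rest of the file is never touched."""
    with open(path, "rb") as f:
        return struct.unpack(HEADER_FMT, f.read(HEADER_SIZE))


def append_cache_record(path, payload):
    """Append one length-prefixed record and return the offset it was written at."""
    with open(path, "ab") as f:
        offset = f.seek(0, os.SEEK_END)
        f.write(struct.pack("<I", len(payload)) + payload)
    return offset
```

Both calls cost the same whether the file is a few kilobytes or several gigabytes, which is the property the LMTP case needs. A real implementation would also need locking and would have to update any counters kept in the header, which this sketch leaves out.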
> I don't have a clear solution for that; Dovecot needs the subject information in its index files. But it looks like it isn't a good idea to put the whole subject into the index. Maybe it's better/necessary to use just the first 50-70 characters for that and to keep the rest out of the index?
50-70 is way too little. The cached subject gets sent to the IMAP client. I think 200 bytes would be the minimum, and 1000 would be something I could probably even hardcode. But anyway, the subject isn't the only way to trigger this, and 1000 bytes is too low for some headers.