On Tue, 2008-09-09 at 17:31 +0200, Cor Bosman wrote:
60 seconds is also when Dovecot decides the dotlock file is stale. So I guess the cache file compression is taking longer than that. Hmm. I hadn't thought about that before, since it was supposed to happen rarely enough. But I guess during the compression other processes shouldn't be stuck waiting for it. I'll have to think about something - probably make the lock timeout only a couple of seconds and after that just fail to update it.
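To illustrate the idea of a short lock timeout combined with a 60-second stale threshold, here is a minimal Python sketch. It is not Dovecot's implementation. The function names, the `.lock` suffix and the polling interval are assumptions made for the example, and it ignores the extra care NFS needs (attribute caching, and the race where two waiters override a stale lock at the same moment).

```python
import os
import time


def _lock_age(lock_path):
    """Return the lock file's age in seconds, or None if it has vanished."""
    try:
        return time.time() - os.stat(lock_path).st_mtime
    except FileNotFoundError:
        return None


def acquire_dotlock(path, timeout=2.0, stale_after=60.0, poll=0.1):
    """Try to create path + '.lock' (hypothetical naming).

    Returns the lock path on success, or None if the lock could not be
    taken within `timeout` seconds, so the caller can skip the update
    instead of blocking.
    """
    lock_path = path + ".lock"
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            pass
        else:
            os.close(fd)
            return lock_path

        age = _lock_age(lock_path)
        if age is not None and age > stale_after:
            # The holder is presumed dead: override the stale lock and retry.
            try:
                os.unlink(lock_path)
            except FileNotFoundError:
                pass
            continue

        if time.monotonic() >= deadline:
            return None
        time.sleep(poll)


def release_dotlock(lock_path):
    """Remove the lock file created by acquire_dotlock()."""
    os.unlink(lock_path)
```

A caller would treat a `None` result as "skip the cache update this time" rather than as an error, which is the behaviour described above.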
Did a couple of changes:
http://hg.dovecot.org/dovecot-1.1/rev/898e3810c014
http://hg.dovecot.org/dovecot-1.1/rev/e3c5acf92b53
But the cache compression really shouldn't take that long unless you're completely running out of disk I/O. I wonder if there are some problems with the locking / NFS caching. You do have mail_nfs_*=yes, right?
# grep nfs dovecot.conf
mail_nfs_storage = yes
mail_nfs_index = yes
What exactly is being compressed? We have 500MB mailspace, and plenty of users have that kind of space actually in use.
Basically deleted space is removed from dovecot.index.cache by recreating the file and leaving out the deleted parts.
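As a rough illustration of "recreate the file and leave out the deleted parts", here is a Python sketch. The record layout (a 4-byte length plus a 1-byte deleted flag) is invented for the example and is not the real dovecot.index.cache format. The `.tmp` suffix and the function names are likewise assumptions.

```python
import os
import struct

# Hypothetical record header: payload length (uint32) and a deleted flag (uint8).
HEADER = struct.Struct("<IB")


def append_record(f, payload, deleted=False):
    """Append one record to an open binary file."""
    f.write(HEADER.pack(len(payload), 1 if deleted else 0))
    f.write(payload)


def compress_cache(path):
    """Rewrite `path` without its deleted records; return the bytes saved."""
    tmp_path = path + ".tmp"
    with open(path, "rb") as src, open(tmp_path, "wb") as dst:
        while True:
            head = src.read(HEADER.size)
            if len(head) < HEADER.size:
                break  # end of file (or a truncated trailing header)
            length, deleted = HEADER.unpack(head)
            payload = src.read(length)
            if not deleted:
                # Copy live records unchanged; deleted ones are skipped.
                dst.write(head)
                dst.write(payload)
        dst.flush()
        os.fsync(dst.fileno())
    saved = os.path.getsize(path) - os.path.getsize(tmp_path)
    # Atomically swap the new file into place.
    os.replace(tmp_path, path)
    return saved
```

In this sketch the whole file is read once and the live part is written once, so the cost grows with the size of the file rather than with the amount of deleted space.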
I don't think the storage servers are slow in any way. Else we'd be seeing this much more. As I said, so far it's less than 1% of our users that I can find with that cache error, but most of those do say they have a lot of large files. So maybe it's something with lots of emails, and the need to update a large portion of the cache? It probably needs to go over the wire twice, right? Once to read, once to write?
You could check how large the dovecot.index.cache file is for those users. Normally it's something like 10-20% of the mailbox size, but it really depends on the emails.
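One way to run that check is a small Python helper that compares the cache file size with the total size of the mail files. This is only a sketch: the function names are made up, the paths depend on your mail_location, and skipping every file whose name starts with "dovecot" is an assumption about where Dovecot keeps its own files.

```python
import os


def mail_size(mailbox_dir):
    """Sum the sizes of all files under mailbox_dir, skipping Dovecot's own files."""
    total = 0
    for root, _dirs, files in os.walk(mailbox_dir):
        for name in files:
            if name.startswith("dovecot"):
                continue  # assumed to be index/cache/uidlist files, not mail
            try:
                total += os.path.getsize(os.path.join(root, name))
            except FileNotFoundError:
                pass  # the file was moved or expunged while we were walking
    return total


def cache_ratio(mailbox_dir, cache_file):
    """Return cache size divided by mail size (inf for an empty mailbox)."""
    mails = mail_size(mailbox_dir)
    cache = os.path.getsize(cache_file)
    return cache / mails if mails else float("inf")
```

A ratio far above the 10-20% mentioned above would point at an unusually large cache file for that user.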
One thing is, we don't use deliver, so whenever Dovecot hits the user's email it most likely will have to re-index all new email.
There's no "reindexing". It just sees a new mail and adds it to the index. Not a problem in any way.