60 seconds is also when Dovecot decides the dotlock file is stale. So I guess the cache file compression is taking longer than that. Hmm. I hadn't thought about that before, since it was supposed to happen rarely enough. But I guess during the compression other processes shouldn't be stuck waiting for it. I'll have to think about something - probably make the lock timeout only a couple of seconds and after that just fail to update it.
But the cache compression really shouldn't take that long unless you're really really running out of disk I/O. I wonder if there are some problems with the locking / NFS caching. You do have mail_nfs_*=yes, right?
# grep nfs dovecot.conf
mail_nfs_storage = yes
mail_nfs_index = yes
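For anyone following along, here is a rough sketch of the dotlock behaviour being discussed: a lock file older than 60 seconds is treated as stale and overridden, and a waiter gives up after a couple of seconds instead of blocking. This is not Dovecot's code. The function names, the `.lock` suffix, the 2-second default and the polling interval are all made up for illustration, and plain `O_EXCL` creation is used only for simplicity (it is not reliable on older NFS versions, where real implementations tend to use a link()-based scheme).

```python
import os
import time

STALE_SECS = 60   # lock files older than this are considered stale (value from the thread)
WAIT_SECS = 2     # hypothetical "couple of seconds" wait before giving up


def try_dotlock(path, wait_secs=WAIT_SECS, stale_secs=STALE_SECS, poll=0.1):
    """Try to create path + '.lock'. Return True on success, False on timeout."""
    lock_path = path + ".lock"
    deadline = time.monotonic() + wait_secs
    while True:
        try:
            # Atomic create-if-missing; fails if someone else holds the lock.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
            os.close(fd)
            return True
        except FileExistsError:
            pass

        try:
            age = time.time() - os.stat(lock_path).st_mtime
        except FileNotFoundError:
            continue  # holder released the lock in the meantime, retry at once

        if age > stale_secs:
            # Assume the holder died. Note this is racy if the holder is
            # merely slow, which is the situation described above.
            try:
                os.unlink(lock_path)
            except FileNotFoundError:
                pass
            continue

        if time.monotonic() >= deadline:
            return False  # caller skips the cache update instead of waiting
        time.sleep(poll)


def unlock(path):
    """Release the lock taken by try_dotlock()."""
    os.unlink(path + ".lock")
```

With something like this, a process that gets False back would simply skip updating the cache file for now, rather than sitting behind a long-running compression until the 60-second stale limit kicks in.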
What exactly is being compressed? We have 500MB mailspace, and plenty of users have that kind of space actually in use.
I don't think the storage servers are slow in any way. Otherwise we'd be seeing this much more. As I said, so far it's less than 1% of our users that I can find with that cache error, but most of those do say they have a lot of large files. So maybe it's something with lots of emails, and the need to update a large portion of the cache? It probably needs to go over the wire twice, right? Once to read, once to write?
One thing is, we don't use deliver, so whenever Dovecot hits the user's email it most likely will have to re-index all new email.
Cor