On Fri, 2008-04-18 at 14:14 +0200, Claude Frantz wrote:
When a large number of messages for many different users arrive at nearly the same time (e.g. a message to all users), a large number of sleeping processes become active, CPU usage rises above 80 or 90 %, and the load average can stay in the many hundreds for a long time. The machine is then no longer really usable from the users' point of view. NFS is probably a problem here, but the NFS tests we ran have not revealed any problems at that level. I'm now trying to find a "solution" by delaying the delivery of messages to the maildirs. I don't think that is really a nice solution.
I think the large number of processes now running are rebuilding the index. Am I right here, Timo?
Index updates typically don't take long. However, if a client fetches some metadata that isn't in the cache file, Dovecot needs to open the messages and parse them, which takes some I/O. With v1.1 this can be avoided by using deliver, since it updates the cache file immediately.
Although I'm still not sure if that would help, because a lot of clients just download the entire message body automatically, which in any case causes I/O.
At present, the mailfilter language of maildrop (and Courier) is used here in place of procmail, which was used in the past. I don't know whether using Dovecot's LDA could be a solution in this environment, but the mailfilter language could be a problem. A migration to Sieve is not really an option, in my opinion. I have never seen good software for testing Sieve scripts.
Can you tell maildrop to deliver the message to a specific mailbox using an external command (i.e. calling deliver) instead of delivering itself? Probably wouldn't be a difficult change to the code if it doesn't support it.
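As a rough sketch of what that might look like: maildrop's "to" statement pipes the message to a command when its argument begins with "|", so a mailfilter could hand the message to deliver instead of writing to the maildir itself. The path to deliver, the List-Id pattern and the folder name below are assumptions for illustration only; the exact mailbox name passed to -m depends on the namespace configuration, and deliver may also need -d <user> if it cannot work out the user from the environment.

```
# Assumed install path; adjust to where deliver lives on this system.
DELIVER="/usr/lib/dovecot/deliver"

# Hypothetical rule: file one list's mail into a specific mailbox.
if (/^List-Id:.*example-list/)
{
    to "| $DELIVER -m Lists"
}

# Everything else goes to the default mailbox, still via deliver.
to "| $DELIVER"
```

The existing filtering rules would stay in the mailfilter language; only the final delivery step changes.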