[Dovecot] 1.2.9 imap crash with backtrace

David Halik dhalik at jla.rutgers.edu
Fri Jan 15 01:13:33 EET 2010


FYI, we backed out of the "noac" change today. When our 20K accounts 
started coming to work, the NetApp NFS server was pushing 70% CPU usage 
and 25K NFS Ops/s, which resulted in all kinds of other havoc as normal 
services started becoming slow. This server usually runs around 25% CPU 
and 5K Ops/s, so such a large increase in load was too much to handle.

As expected, I didn't see a single uid error during the 12-hour window, 
but the fix was worse than the problem.

On 01/13/2010 07:41 PM, David Halik wrote:
>
> Same here. I laughed because our help desk started sending us the 
> exact same complaints and then today I got a little bit of a red nose 
> when a director's mail "disappeared" in a meeting. ;) Whoops.
>
> It looks like users who end up with the off by 1 uid list rebuild and 
> crash experience an empty inbox until the list is rebuilt and 
> refreshed. I saw one user who experienced the crash and then spewed 
> about 15K lines of "Duplicate message, uid -> uid". Since that takes 
> a while, they probably couldn't see anything in the meantime.
>
> Anyway, since we're hearing the complaints I went and remounted our 
> IMAP servers with the "noac" NFS option today. So far it seems to have 
> swept the problem under the rug, but our NFS server's Ops went from an 
> average of 3-5K to 10K-20K, and the CPU went from 10% to 50% of 
> constant load... so this is *definitely* only a temp fix. Hopefully 
> Timo will have time to analyze the problem once his move is all finished.
>
> -Dave
>
>> I hope your move is going well, and you get settled in and your internet
>> hooked up soon. It's got to be a rough process!
>>
>> Just for the record, we continue to see this crash fairly frequently
>> with a small subset of our users, enough so that they have started to
>> complain to the helpdesk staff about their mail 'disappearing and then
>> reappearing.' One user in particular has a mail client left open from
>> three hosts and has hit it 23 times in the last week, and 10 times
>> today.
>>
>> If there's any more information I can collect or anything I can do to
>> help get this resolved, please let me know!
>>
>> -Brad
>


-- 
================================
David Halik
System Administrator
OIT-CSS Rutgers University
dhalik at jla.rutgers.edu
================================


