[Dovecot] quick question

22 Jan 2010


Timo (and anyone else who feels like chiming in),
I was just wondering if you'd be able to tell me if the amount of
corruption I see on a daily basis is what you consider "average" for our
current setup and traffic. Now that we are no longer experiencing any
core dumps with the latest patches since our migration from Courier two
months ago, I'd like to know what is expected as operational norms.
Prior to this we had never used Dovecot, so I have nothing to go on.
Our physical setup is 10 CentOS 5.4 x86_64 IMAP/POP servers, all with
the same NFS backend where the index, control, and Maildirs for the
users reside. Accessing this are direct connections from clients, plus
multiple SquirrelMail webservers and Pine users, all at the same time,
with layer-4 switch connection load balancing.
Each server has an average of about 400 connections, for a total of
around 4000 concurrent connections during a normal business day. This is
out of a possible user population of about 15,000.
All our dovecot servers syslog to one machine, and on average I see
about 50-75 instances of file corruption per day. I'm not counting each
line, since some instances of corruption generate a log message for each
uid that's wrong. This is just me counting "user A was corrupted once at
10:00, user B was corrupted at 10:25" for example.
Examples of the corruption are as follows:
###########
Corrupted transaction log file ..../dovecot/.INBOX/dovecot.index.log seq 28: Invalid transaction log size (32692 vs 32800): ...../dovecot/.INBOX/dovecot.index.log (sync_offset=32692)
Corrupted index cache file ...../dovecot/.Sent Messages/dovecot.index.cache: Corrupted physical size for uid=624: 0 != 53490263
Corrupted transaction log file ..../dovecot/.INBOX/dovecot.index.log seq 66: Unexpected garbage at EOF (sync_offset=21608)
Corrupted transaction log file ...../dovecot/.Trash.RFA/dovecot.index.log seq 2: indexid changed 1264098644 -> 1264098664 (sync_offset=0)
Corrupted index cache file ...../dovecot/.INBOX/dovecot.index.cache: invalid record size
Corrupted index cache file ...../dovecot/.INBOX/dovecot.index.cache: field index too large (33 >= 19)
Corrupted transaction log file ..../dovecot/.INBOX/dovecot.index.log seq 40: record size too small (type=0x0, offset=5788, size=0) (sync_offset=5812)
##########
These are most of the unique messages I could find, although the
majority are the same as the first two I posted. So my question is: is
this normal for a setup such as ours? I've been arguing with my boss
over this since the switch. My opinion is that with a setup such as ours,
where a user can be logged in using Thunderbird, SquirrelMail, and their
BlackBerry all at the same time, there will always be the occasional
index/log corruption.
Unfortunately, he is of the opinion that there should rarely be any and
there is a design flaw in how Dovecot is designed to work with multiple
services with an NFS backend.
What has been your experience so far?
Thanks,
-Dave
--
David Halik
System Administrator
OIT-CSS Rutgers University
dhalik@jla.rutgers.edu
