On 13.5.2004, at 12:34, Maikel Verheijen wrote:
While (bluntly) testing the prerelease version of dovecot on our
mailfarm (just for the webmail imap) we noticed some small problems:
Not a very good idea, at least until "1.0 alpha". I can still pretty
easily make it crash and complain about corrupt indexes.
- While our OLD dovecot/imap drove the load to a maximum of 1 over a
day after running, the new dovecot/imap pushes it easily to 20. This
is mainly due to a LOT more disk activity. This is probably caused by
the fact that we a) use indexes on disk and b) have no indexes left. I
hope this will smooth out when most of the indexes are made.
I think this is mostly because the 1.0-tests don't cache anything. The
index files contain only message UIDs and flags, everything else
requires opening and parsing the message file. I'll fix the caching
once other things seem to be working.
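
To illustrate what the missing cache costs, here is a minimal sketch in Python. It is not Dovecot code: `HeaderCache`, `open_message` and the `parses` counter are made-up names for illustration, and the real cache will look different. It only shows the idea that anything beyond UIDs and flags needs a message parse unless the parsed value is remembered.

```python
from email.parser import BytesHeaderParser


class HeaderCache:
    """Remember parsed Subject headers per UID so each message is parsed once."""

    def __init__(self, open_message):
        # open_message is a hypothetical callable: uid -> raw message bytes.
        self._open_message = open_message
        self._cache = {}
        self.parses = 0  # counts how often a message had to be opened and parsed

    def subject(self, uid):
        if uid not in self._cache:
            # Cache miss: this is the expensive path (disk read + header parse).
            raw = self._open_message(uid)
            self.parses += 1
            headers = BytesHeaderParser().parsebytes(raw)
            self._cache[uid] = headers.get("Subject", "")
        return self._cache[uid]
```

Without the `_cache` dictionary every `subject()` call would take the expensive path, which is roughly the situation in the current test releases.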
- We notice some very strange "delete" flag things in our IMP webmail
system (working on a test environment to be more specific). The
problem is that if you mark some messages "deleted", a lot of messages
that do NOT have "deleted" flag are also in the deleted list. Since
nothing changed on the webmail, it must be that the new version of dovecot
responds differently to certain requests.
Hm.. I'm not sure about this. Maybe concurrent access breaks it
sometimes..
In the dotlocking code, it seems that dovecot uses time() for its
internal locktime, and uses stat() to see if the file-time of the lock
is different. Since we use nfs there might be a time-difference on the
file creation and time() when our nfs gets "busy". This happens a lot
when the indexes are created for multiple users at the same time
resulting in a lot of:

May 12 08:57:44 mf1 dovecot[3763]: imap(user@domain.tld): Our dotlock
file
/var/mail/mounted/d/do/domain.tld/user/Maildir/.INBOX/dovecot.index.log.lock
was modified (1084345063 vs 1084345062), assuming it wasn't overridden

I am not completely sure if this interrupts imap traffic. I think this
MIGHT be solved by storing a stat() time as the locktime instead of a
time(), but this might be intentional.
It does store and compare stat() times for that check.. There was a bug
where Dovecot itself overwrote the lock file and caused that error, but
it was fixed in test8 already. Maybe there are other such problems.
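
As a rough illustration of comparing stat() times rather than time(), here is a minimal Python sketch. It is not the actual Dovecot code; `create_dotlock` and `dotlock_still_ours` are hypothetical names, and comparing inode plus mtime is an assumption made for the example. The point is that both values come from the file server, so a clock difference between the client and the NFS server does not matter.

```python
import os
import tempfile


def create_dotlock(path):
    """Create the lock file exclusively and remember what stat() reports for it."""
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    try:
        st = os.fstat(fd)
    finally:
        os.close(fd)
    # Remember the server-side view of the file, not the local time().
    return (st.st_ino, st.st_mtime_ns)


def dotlock_still_ours(path, remembered):
    """Return True if the lock file looks unchanged since we created it."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return False  # someone removed the lock
    return (st.st_ino, st.st_mtime_ns) == remembered
```

If something rewrites the lock file, its mtime (or inode) changes and the second function returns False, which corresponds to the "was modified (... vs ...)" case in the log above.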
- I patched dovecot to report its pid in syslog messages, in order to
find the processes "killed by signal X", where X is mostly 11 or 6.
The pids reported do not show up in the rest of the log file, so these
may be "disconnected" sessions? It seems that the "old" version has
the same problem, and nobody really complained it was broken, so this
isn't a big issue.
What else should there be in the log file about them? Killed by signal 11
(SIGSEGV) means just that it crashed, without any specific reason being
logged. Signal 6 is abort(), which should write the error into the log file.
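
For reference, this is roughly how a parent process can turn a child's exit status into such a "killed by signal X" message. It is a minimal Python sketch with made-up function names (`describe_exit`, `run_and_describe`), assuming a POSIX system, and is not how Dovecot's master process is written.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a log-style description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as return code -N.
        sig = signal.Signals(-returncode)
        return f"killed by signal {sig.value} ({sig.name})"
    return f"exited normally with status {returncode}"


def run_and_describe(argv):
    """Run a command and describe how it ended."""
    return describe_exit(subprocess.run(argv).returncode)
```

For example, `run_and_describe([sys.executable, "-c", "import os; os.abort()"])` should report the abort() signal on a POSIX system, while a segfaulting child would be reported with signal 11.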