[Dovecot] CRASH: mail-cache-fields.c crash - new info, hacked 'solution'

JB Zimmerman jbz at ximian.com
Wed Nov 1 21:28:25 UTC 2006


I'm baaaaack. :-)  

I've managed to implement a suggestion from Hans Morten Kind on this
list, and it seems to have stopped the crashing.  However, my hack
(commenting out a call to i_unreached()) makes me queasy because I don't
know what its ramifications are (I don't habitually code, myself).
So I wanted to lay it out for y'all in case this is a problem that you
feel should be looked at.

So, here's my situation.  I'm using dovecot-1.0.rc10 downloaded from
dovecot.org.  The machine runs RHEL 4AS with all updates, and I built
the RPM locally on it (as opposed to prior attempts, which used the AT
RPMs).  I did *not* configure in postgres, mysql, sqlite or ldap-auth.
Other than that, the build is stock (openssl included, e.g.), with some
file locations taken from the AT RPMs' spec (Red Hat-specific file
locations).  No patches applied.  The SPEC is available if y'all think
it'd help; the RPM built with no complaints and installed the same way.
We use Maildir format, upgraded from a Courier install, so the folders
follow the .folder.subfolder structure.

Error behavior:  When a user attempts to open a folder containing a
large number of messages (roughly 100k or more, as far as we can tell),
they immediately get an error saying the server has disconnected.  On
the server side, I get this in the log (the hostname is 'magneto'):

---cut---
Nov  1 15:18:16 magneto dovecot: IMAP(joeuser): file
mail-cache-fields.c: line 26: unreached
Nov  1 15:18:16 magneto dovecot: child 17599 (imap) killed with signal 6
---cut---

Now, the folder in question is a folder of CVS commit messages (hence
the size).  If I go into the folder ("/home/joeuser/Maildir/.GNOME CVS
commits/") and do 'rm -f dovecot-*' and then have the user try again,
then they can open the folder and get a message list.  dovecot will
rebuild the various index files.  However, as soon as they click on an
individual message, bam, the same error behavior - and from then on,
they can't get into the folder again unless we remove their dovecot
files again.

We tried this using Evolution, mutt and pine as the clients.  All
exhibited identical behavior.  This is coming over TLS.

NOW THE FIX:

I made a change to the source (gasp!) whose ramifications I honestly
don't understand, but it has...well, not *fixed* it, but sorta fixed it.
As per Hans Morten Kind, I commented out the i_unreached() call in
field_has_fixed_size().  After this, both the mail and the folder list
are readable, but there is now an error message in the log.  First
things first, here's the change I made to
dovecot-1.0.rc10/src/lib-index/mail-cache-fields.c:

---cut---

@@ -23,7 +23,7 @@
                return FALSE;
        }
 
-       i_unreached();
+/*     i_unreached(); */
        return FALSE;
 }

---cut---
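
For anyone wondering why that one line matters: the log above shows the
imap child dying with signal 6, which on Linux is SIGABRT, the signal
abort() raises.  Below is a minimal stand-alone sketch of the general
pattern, NOT the actual Dovecot source.  All names in it (the enum,
field_type_is_valid(), checked_has_fixed_size()) are hypothetical and
only loosely modelled on what the diff context suggests.  It assumes
the guard is reached when a type value read from a damaged cache file
matches no case in a switch, and it shows one possible alternative to
removing the guard: validate the untrusted value first and report
corruption to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical field types; illustrative only, not Dovecot's enum. */
enum field_type {
	FIELD_FIXED_SIZE,
	FIELD_VARIABLE_SIZE,
	FIELD_STRING,
	FIELD_BITMASK,
	FIELD_HEADER,
	FIELD_COUNT /* number of valid types, not a type itself */
};

/* Returns true if a raw value read from disk names a known type. */
static bool field_type_is_valid(unsigned int raw)
{
	return raw < FIELD_COUNT;
}

/* Guarded switch: a value that matches no real type is treated as a
   programming error, so the guard aborts the process (SIGABRT). */
static bool field_has_fixed_size(enum field_type type)
{
	switch (type) {
	case FIELD_FIXED_SIZE:
	case FIELD_BITMASK:
		return true;
	case FIELD_VARIABLE_SIZE:
	case FIELD_STRING:
	case FIELD_HEADER:
		return false;
	case FIELD_COUNT:
		break;
	}
	fprintf(stderr, "unreached\n");
	abort();
}

/* Validate untrusted input before the guarded switch.
   Returns 0 and sets *fixed_r on success, -1 if the value is unknown,
   so the caller can mark the file corrupted instead of crashing. */
static int checked_has_fixed_size(unsigned int raw, bool *fixed_r)
{
	assert(fixed_r != NULL);
	if (!field_type_is_valid(raw))
		return -1;
	*fixed_r = field_has_fixed_size((enum field_type)raw);
	return 0;
}
```

In this sketch a caller would treat a -1 from checked_has_fixed_size()
as "cache file is corrupted, rebuild it", which is roughly the behavior
the hack ends up producing.  Whether the real code can be arranged this
way is for someone who knows the source to judge.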


...and here's what now happens in the log, to the same mail folder as
above:

---cut---
Nov  1 15:59:42 magneto dovecot: IMAP(joeuser): Corrupted index cache
file /home/joeuser/Maildir/.GNOME CVS commits/dovecot.index.cache: field
header names corrupted
---cut---

At that point, I deleted the cache files again, and the error went away.
I also noticed that the dovecot.index.cache file in that folder is now
much, much larger than it was before, from which I posit that the above
error appeared because the crashing imap process had left an incomplete
index cache file behind.  Removing it forced a rebuild with the new
code, which seems to have fixed the problem.

Thank you all for your patience. I hand this willingly over to the list.

jb

-- 
------------------------------------
J.B. Zimmerman        jbz at ximian.com
       Network Administrator
Ximian    -    http://www.ximian.com
...a tiny little division of Novell.