mdbox: Broken virtual size for mail UID ...

Dennis Schridde devurandom at gmx.net
Thu Jan 29 19:39:08 UTC 2015


Hello everyone!

I originally asked Timo for help in this case, but I understand he is very 
busy these days, so I am now posting my questions to this mailing list.

On Sunday 28 Dec 2014 10:02:08 Timo Sirainen wrote:
> On 28 Dec 2014, at 07:06, Dennis Schridde <devurandom at gmx.net> wrote:
> > My harddrives crashed, and now I get "Broken virtual size for mail UID …"
> > messages in the dovecot logs from some mdbox folders. I assume the logged
> > emails are destroyed forever, but I would like to restore as much as
> > possible of them and keep them in their corrupted / incomplete state. How
> > do I do that?
> > 
> > I guess what I need is some kind of "doveadm fsck" command, that tries its
> > best to limit the damage to a minimum: Inform me about the broken emails
> > and then ask me whether they should be stored in truncated form, be
> > deleted, etc. Is there such a thing?
>
> http://dovecot.org/tools/mdbox-recover.pl

Thank you!

From what I read in the Wiki and how I understand the script, I should proceed
like this:

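#!/bin/bash
# Remember the starting directory so the final moves use absolute paths.
base=$PWD
broken_storage=${base}/storage
recovered_storage=${base}/storage-recovered

mkdir -p "${recovered_storage}"
cd "${recovered_storage}" || exit 1

n=1
for f in "${broken_storage}"/m.* ; do
  # I assume mdbox-recover.pl writes its msg.* files to the current directory.
  mdbox-recover.pl "${f}"
  for msg in msg.* ; do
    [ -e "${msg}" ] || continue   # nothing was recovered from this file
    mv "${msg}" "m.${n}"
    n=$((n + 1))
  done
done

# Leave the recovered directory before renaming it.
cd "${base}" || exit 1
mv "${broken_storage}" "${base}/storage-backup"
mv "${recovered_storage}" "${broken_storage}"

exit 0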

Afterwards, dovecot will automatically regenerate the dovecot.map.index
files from the now bare m.* files. Since the dovecot.index files in the
actual mailboxes reference the GUID, and the messages in the m.* files have
the GUID embedded, nothing will be lost. Is that correct?

To assess the damage properly, I need to know a bit more:

1. In the script on line 43, the comment says:
# end of metadata block missing, finish the previous mail
That seems to be wrong, since idx2!=-1 and state==STATE_BODY on that line, 
while I understand it should be -1 and STATE_META if we were still reading the 
metadata block. I guess it should be something like:
# still in data block, finish the previous mail

2. Line 47 reads:
# truncated / broken data? just keep writing to previous file
That is the only error the script can detect, right?

3. A message could also be corrupted by losing its entire data block from
\001\003 to \001\002, which cannot be detected in the script, right?

4. The only way to detect this and find other unexpected oddities would be to
compare each msg.* in size and offset with the messages referenced in the 
dovecot.map.index file, correct? How do I do that, if the dovecot.map.index 
does not contain the GUID of the message? Do I need to fuzzy match the 
file_id, offset and size with my current values, or is there a more reliable 
way to do this?

5. Does "metadata" refer to the RFC822 mail headers? Or are those headers part
of the body, and the metadata in question is dovecot-only metadata? If the
latter, does it also contain the size of the body and similar information?
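
Regarding questions 3 and 4, a rough first pass I could run myself would be to
count the two markers in every m.* file and note each file's size. The
following is only a sketch under my own assumptions: that every message
contributes exactly one \001\002 and one \001\003 sequence, and that neither
byte pair occurs inside a mail body. The helper names count_marker and
report_file are made up for this example, and the -a and -o grep options are
GNU/BSD extensions. It does not read dovecot.map.index at all.

```shell
#!/bin/sh
# Sketch only: assumes one \001\002 and one \001\003 marker per message and
# that neither byte pair ever occurs inside a mail body.

PRE_MARKER=$(printf '\001\002')
POST_MARKER=$(printf '\001\003')

# count_marker MARKER FILE: print the number of occurrences of MARKER in FILE.
count_marker() {
  # -a: treat binary data as text, -o: one output line per match,
  # -F: fixed string instead of a regular expression
  LC_ALL=C grep -a -o -F -e "$1" "$2" | wc -l | tr -d ' '
}

# report_file FILE: print the size, both marker counts and a verdict.
report_file() {
  rf_size=$(wc -c < "$1" | tr -d ' ')
  rf_pre=$(count_marker "$PRE_MARKER" "$1")
  rf_post=$(count_marker "$POST_MARKER" "$1")
  if [ "$rf_pre" -eq "$rf_post" ]; then
    rf_verdict=balanced
  else
    rf_verdict=MISMATCH
  fi
  printf '%s size=%s pre=%s post=%s %s\n' \
    "$1" "$rf_size" "$rf_pre" "$rf_post" "$rf_verdict"
}

# Report on every file named on the command line.
for rf_file in "$@"; do
  report_file "$rf_file"
done
```

Saved under a hypothetical name such as check-markers.sh, it would be run as
"sh check-markers.sh storage/m.*". A MISMATCH line only tells me that a file
deserves a closer look, not which message in it is damaged.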

Best regards,
Dennis

