[Dovecot] dsync deleting too many emails (sdbox)
I'm using dsync to synchronize emails on a laptop where wifi connectivity sometimes fails in the middle of a sync. I have a shell script that runs dsync, and here is one line of it including the output of dsync:
- dsync -f -m realmail mirror /home/paulproteus/projects/ssh-attach/run ssh rose.makesad.us dsync
dsync-local(paulproteus): Error: dbox /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails: Unexpectedly lost uid=337460
dsync-local(paulproteus): Error: msg guid lookup failed: Internal error occurred. Refer to server log for more information. [2012-02-02 11:02:12]
dsync-local(paulproteus): Warning: sdbox /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails: Rebuilding index
dsync-local(paulproteus): Warning: sdbox /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails: Ignoring invalid filename 130608.broken
dsync-local(paulproteus): Warning: sdbox /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails: Ignoring invalid filename 66159.broken
dsync-local(paulproteus): Error: Corrupted dbox file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.66159 (around offset=16): EOF reading msg header (got 0/30 bytes)
dsync-local(paulproteus): Error: link(/home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.66159, /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.66159.broken) failed: File exists
dsync-local(paulproteus): Error: Corrupted dbox file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.66159 (around offset=16): EOF reading msg header (got 0/30 bytes)
dsync-local(paulproteus): Warning: sdbox: Skipping unfixable file: /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.66159
dsync-local(paulproteus): Warning: sdbox /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails: Ignoring invalid filename 336269.broken
dsync-local(paulproteus): Error: Corrupted dbox file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.130608 (around offset=16): EOF reading msg header (got 0/30 bytes)
dsync-local(paulproteus): Error: link(/home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.130608, /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.130608.broken) failed: File exists
dsync-local(paulproteus): Error: Corrupted dbox file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.130608 (around offset=16): EOF reading msg header (got 0/30 bytes)
dsync-local(paulproteus): Warning: sdbox: Skipping unfixable file: /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.130608
dsync-remote(paulproteus): Error: proxy server timed out
dsync-local(paulproteus): Error: Corrupted dbox file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.336269 (around offset=16): EOF reading msg header (got 0/30 bytes)
dsync-local(paulproteus): Error: link(/home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.336269, /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.336269.broken) failed: File exists
dsync-local(paulproteus): Error: Corrupted dbox file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.336269 (around offset=16): EOF reading msg header (got 0/30 bytes)
dsync-local(paulproteus): Warning: sdbox: Skipping unfixable file: /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/u.336269
dsync-local(paulproteus): Warning: Transaction log file /home/paulproteus/Maildir/dbox/mailboxes/realmail/dbox-Mails/dovecot.index.log was locked for 1528 seconds
That seemed problematic, but not dangerous.
Then I ran a fresh sync, and found that 3,000 (of 60,000) messages had been deleted and expunged from the "realmail" box.
I'm guessing this is some bad interaction with sdbox and partial file downloads?
I haven't read the code for this, but I would guess the dsync process isn't being atomic about file transfers, so it is leaving half-completed transfers in place, which results in corrupt files when they're next examined.
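For readers unfamiliar with the term, an "atomic" transfer in the sense guessed at above usually means writing to a temporary file and renaming it into place only once it is complete. The following is a minimal Python sketch of that general pattern, not Dovecot's or dsync's actual code; the function name `write_atomically` is hypothetical.

```python
import os
import tempfile


def write_atomically(dest_path, data):
    """Write data to dest_path so a reader never sees a half-written file.

    Hypothetical helper for illustration only; it does not reflect how
    dsync or sdbox actually store messages.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    # Create the temporary file next to the destination so the final
    # rename stays on one filesystem, where it is atomic on POSIX.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        # Only a fully written file ever appears under the final name.
        os.replace(tmp_path, dest_path)
    except BaseException:
        # An interrupted transfer leaves at most a temp file, removed here.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this pattern, a connection that drops mid-transfer would leave the destination either absent or holding its previous complete contents, rather than a truncated file.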
dovecot -n output:
# 2.0.15: /etc/dovecot/dovecot.conf
# OS: Linux 3.0.0-2-amd64 x86_64 Debian wheezy/sid
mail_location = sdbox:~/Maildir/dbox
passdb {
  driver = pam
}
protocols = " imap"
ssl_cert = </etc/ssl/certs/dovecot.pem
ssl_key = </etc/ssl/private/dovecot.pem
userdb {
  driver = passwd
}
-- Asheesh.
On Thu, 2012-02-02 at 14:59 -0500, Asheesh Laroia wrote:
> I'm guessing this is some bad interaction with sdbox and partial file downloads?
> I haven't read the code for this, but I would guess the dsync process isn't being atomic about file transfers, so it is leaving half-completed transfers in place, which results in corrupt files when they're next examined.
There were some problems related to this in dbox, although in your case it seems to be worse than it should be.
Anyway, I've made several fixes in v2.1. Can you check whether these problems happen with it too? In any case, clean the *.broken files out of the dbox so that "doveadm force-resync" won't give any errors.
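Before cleaning up, it may help to see which *.broken files exist. A minimal Python sketch that only lists them follows; the function name `find_broken_files` is hypothetical, and whether to delete or move the files afterwards is left to the reader, since the thread does not say which is preferred.

```python
from pathlib import Path
from typing import List


def find_broken_files(mailbox_root: str) -> List[Path]:
    """Return every regular file under mailbox_root whose name ends in '.broken'.

    Hypothetical helper: it only reports files and never modifies the dbox.
    """
    root = Path(mailbox_root)
    return sorted(p for p in root.rglob("*.broken") if p.is_file())


# Example call, using the mail location from the configuration in this thread:
#   for path in find_broken_files("/home/paulproteus/Maildir/dbox"):
#       print(path)
```

Listing first, rather than deleting in one step, keeps the cleanup reviewable on a mailbox that is already damaged.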
On 9.2.2012, at 21.47, Timo Sirainen wrote:
> Anyway, I've made several fixes in v2.1. Can you check whether these problems happen with it too? In any case, clean the *.broken files out of the dbox so that "doveadm force-resync" won't give any errors.
A bit more specifically: the last such dbox bug was fixed only today, so you'd need the v2.1 hg version, or wait for v2.1.rc6, which should be released this week.
And in general: it would be helpful to start from a clean, fully working dbox, and then know the *first* error(s) that get printed when dsync corrupts it. Otherwise it's difficult to tell which problems are old, which are new, and which happen only because of another problem.
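One way to capture that first error is to scan the saved dsync output for the earliest Error or Warning line. The sketch below assumes the "dsync-local(user): Error: ..." prefix format seen in the output quoted earlier in this thread; the function name `first_problem` is hypothetical.

```python
import re
from typing import Iterable, Optional

# Matches prefixes such as "dsync-local(paulproteus): Error: " or
# "dsync-remote(paulproteus): Warning: ", as they appear in this thread.
_PROBLEM = re.compile(r"dsync-(?:local|remote)\([^)]*\): (?:Error|Warning): ")


def first_problem(lines: Iterable[str]) -> Optional[str]:
    """Return the first line reporting an Error or Warning, or None if there is none."""
    for line in lines:
        if _PROBLEM.search(line):
            return line.rstrip("\n")
    return None


# Example call, assuming the dsync output was saved to a file named dsync.log:
#   with open("dsync.log") as log:
#       print(first_problem(log))
```

This only helps if the output of every run is kept, starting from a mailbox known to be clean.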
On Thu, 9 Feb 2012, Timo Sirainen wrote:
> On 9.2.2012, at 21.47, Timo Sirainen wrote:
>> Anyway, I've made several fixes in v2.1. Can you check whether these problems happen with it too? In any case, clean the *.broken files out of the dbox so that "doveadm force-resync" won't give any errors.
> A bit more specifically: the last such dbox bug was fixed only today, so you'd need the v2.1 hg version, or wait for v2.1.rc6, which should be released this week.
> And in general: it would be helpful to start from a clean, fully working dbox, and then know the *first* error(s) that get printed when dsync corrupts it. Otherwise it's difficult to tell which problems are old, which are new, and which happen only because of another problem.
Good to know. This weekend I can try to set up something of a 'lab' for testing dsync + (s)dbox, both to see if I can reproduce the errors with the old versions, and to see if the new versions fix them. I'll keep in mind the consideration of knowing the first error that gets printed!
-- Asheesh.