On Tue, 2012-08-14 at 23:23 +0300, Timo Sirainen wrote:
On 11.8.2012, at 0.54, Jeff Gustafson wrote:
More dsync issues. We were running 2.1.7 and we updated to 2.1.9. Same problem with both versions. I'm getting an error 75 on about 40 boxes out of 1800. It is the same list of boxes every time we use 'dsync backup' to back up the server. dsync seems to stop communicating with the backup box (over ssh). strace just shows it sitting at an epoll_wait.
So you can easily reproduce this by running dsync for a specific user?
Yes. There is a subset of mailboxes that always time out.
Once the program quits (times out?), a 'du' shows the destination is smaller (200kbyte in one case).
As in, some of the mails didn't get synced? (doveadm fetch could be used to do a better comparison, file sizes don't necessarily mean anything.)
True, I will dump out the mailboxes and see if it truly was incomplete.
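A minimal sketch of the comparison suggested above. It assumes the message list from each side has already been dumped (for example with doveadm fetch) and massaged into text with one `mailbox<TAB>guid` pair per line, and that message GUIDs are the same on both sides after a sync. That dump format and the function names are assumptions for illustration, not doveadm's native output.

```python
from collections import defaultdict


def parse_dump(text):
    """Parse 'mailbox<TAB>guid' lines into {mailbox: set of guids}."""
    boxes = defaultdict(set)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        mailbox, guid = line.split("\t", 1)
        boxes[mailbox].add(guid)
    return dict(boxes)


def missing_on_backup(source_text, backup_text):
    """Return {mailbox: sorted guids present on the source but not the backup}."""
    src = parse_dump(source_text)
    dst = parse_dump(backup_text)
    missing = {}
    for mailbox, guids in src.items():
        diff = guids - dst.get(mailbox, set())
        if diff:
            missing[mailbox] = sorted(diff)
    return missing
```

An empty result means every message on the source is also on the backup, which says more than comparing 'du' output.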
Those hangs are a little bit annoying to debug, and the whole code has been rewritten for v2.2 already in a way that should make the hangs pretty much impossible. Annoyingly, v2.2 isn't ready yet.
I have found a manual workaround. I use rsync to get the files over to the backup machines, then I let the backup script keep things up to date. It is not the best way to go, but at least I have backups. I suppose I can check the log and continue to rsync things over until 2.2 comes out.
...Jeff
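
A rough sketch of how the "check which boxes failed, then rsync those" step could be scripted instead of read from the log. The function name, the `build_cmd` callback, and the timeout value are hypothetical; the actual per-user backup command is supplied by the caller and is not spelled out here. Any non-zero exit (such as the 75 reported above) or a hang past the timeout marks the user for the fallback copy.

```python
import subprocess


def users_needing_fallback(users, build_cmd, timeout=600):
    """Run the per-user backup command; return users whose run failed or hung.

    build_cmd(user) must return the argument list for that user's backup
    command (hypothetical hook; the real command line is up to the caller).
    """
    failed = []
    for user in users:
        try:
            result = subprocess.run(build_cmd(user), timeout=timeout)
            ok = result.returncode == 0
        except subprocess.TimeoutExpired:
            # The child is killed when the timeout expires; treat it as failed.
            ok = False
        if not ok:
            failed.append(user)
    return failed
```

The returned list is the set of mailboxes to copy over by other means until the next run.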