Re: [Dovecot] "Time just moved backwards" in Dovecot in a Xen DomU

6 Oct 2009


On 6/10/2009 12:54 PM, PGNet Dev wrote:
...
<snip - from dom0>
looking at my ntp logs around the same time(s).
...
5 Oct 16:41:17 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 16:51:38 ntpd[5696]: time reset -2.140133 s
5 Oct 16:56:40 ntpd[5696]: synchronized to 66.220.9.122, stratum 1
5 Oct 17:01:28 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 17:07:20 ntpd[5696]: time reset -2.137760 s
5 Oct 17:11:49 ntpd[5696]: synchronized to 204.152.184.72, stratum 1

This indicates that ntpd is actually stepping the clock about 2 seconds
into the past roughly every 900 seconds. So Dovecot is correct that time
has moved backwards. You need to stop time moving backwards :-).
[So not Dovecot's fault, and likely not Xen's fault either.]
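
To check the size and spacing of the steps without reading the log by
eye, something like the following could be used. It is only a sketch:
the log lines are copied from the excerpt above, while the regular
expression, the find_resets name and the assumed year are mine and would
need adjusting for a different syslog format.

```python
import re
from datetime import datetime

# Log lines copied from the dom0 excerpt quoted above.
LOG = """\
5 Oct 16:41:17 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 16:51:38 ntpd[5696]: time reset -2.140133 s
5 Oct 16:56:40 ntpd[5696]: synchronized to 66.220.9.122, stratum 1
5 Oct 17:01:28 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 17:07:20 ntpd[5696]: time reset -2.137760 s
5 Oct 17:11:49 ntpd[5696]: synchronized to 204.152.184.72, stratum 1
"""

# Hypothetical pattern matching only the "time reset" lines shown above.
RESET = re.compile(
    r"^(\d+ \w+ \d\d:\d\d:\d\d) ntpd\[\d+\]: time reset ([-+]?\d+\.\d+) s$"
)


def find_resets(text, year=2009):
    """Return a list of (timestamp, step_in_seconds) for each time reset."""
    resets = []
    for line in text.splitlines():
        m = RESET.match(line.strip())
        if m:
            # The log has no year, so one is assumed in order to build a datetime.
            when = datetime.strptime(f"{year} {m.group(1)}", "%Y %d %b %H:%M:%S")
            resets.append((when, float(m.group(2))))
    return resets


resets = find_resets(LOG)
for (t0, _), (t1, step) in zip(resets, resets[1:]):
    gap = (t1 - t0).total_seconds()
    print(f"{t1:%H:%M:%S}: step {step:+.6f} s, {gap:.0f} s after previous reset")
```

On the two resets quoted above this reports a gap of 942 seconds, which
is close to the 15 minute pattern.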

I'm no NTP expert, but searching the ntpd man page for 900 s might help.
The step every 15 minutes caught my eye, and network congestion or
excessive jitter could be what is causing the stepping. Otherwise,
perhaps a bad hardware driver is occasionally stalling in the middle of
an interrupt. Sorry, I can't provide any further pointers: it is highly
dependent on your hardware, kernel and drivers. If you have any other
physical servers and they are also logging 'time reset' messages, then
the problem is probably some odd network configuration (partial
drop-outs and/or high jitter).

Unfortunately -x will not be a solution here, as slewing cannot possibly
correct for a drift as big as 2 seconds in every 900 seconds.
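
As a rough check of that arithmetic, here is a small sketch. It assumes
the usual ntpd slew limit of 500 ppm (0.5 ms per second), which is my
understanding of stock ntpd rather than something taken from your logs.
The function names are just for illustration.

```python
# Assumption: stock ntpd slews the clock at no more than 500 ppm.
MAX_SLEW_PPM = 500.0


def required_ppm(offset_s, interval_s):
    """Rate, in ppm, needed to absorb offset_s within interval_s by slewing."""
    return abs(offset_s) / interval_s * 1e6


def max_slew_correction(interval_s, max_ppm=MAX_SLEW_PPM):
    """Largest offset, in seconds, that slewing can remove in interval_s."""
    return interval_s * max_ppm / 1e6


need = required_ppm(-2.14, 900)   # offset of about 2.14 s every 900 s
can = max_slew_correction(900)    # what slewing can fix in the same 900 s

print(f"needed: {need:.0f} ppm, limit: {MAX_SLEW_PPM:.0f} ppm")
print(f"slew can correct at most {can:.2f} s per 900 s")
```

Under that assumption the clock would need to be slewed at well over
2000 ppm, while slewing can only remove about 0.45 s in each 900 second
window, so the offset would keep growing.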

You may want to try just a single upstream NTP server as a debugging
step (identify it by IP, not by a pool DNS record) and/or use the prefer
keyword on your favourite server.

Cheers,
Rob Middleton.