Re: [Dovecot] "Time just moved backwards" in Dovecot in a Xen DomU

6 Oct 2009

This reminds me of an odd issue I had too, where my clock also stepped
by a fixed amount at regular intervals. In the datacenter one server
was limited to 10mbit half duplex, and I had endless ntp issues. I
could only replicate this offsite with the same server by using 10mbit
and fully saturating the network. Switching to full duplex almost
solved the issue.

But the real issue was that the timecounter chosen by the FreeBSD
kernel, ACPI in this case, was unreliable on that motherboard.
Switching it to a different timing method (TSC in this case) fixed the
issue.

In FreeBSD (default):
kern.timecounter.choice: TSC(-100) ACPI-safe(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-safe
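For reading the sysctl output above: each entry in kern.timecounter.choice is a counter name followed by a quality value in parentheses, and as far as I understand the kernel defaults to the highest-quality one (negative values are not picked automatically). A minimal Python sketch of that reading follows; the function names are made up for this example and are not part of any FreeBSD tooling.

```python
import re

def parse_timecounter_choice(line):
    """Return {name: quality} parsed from a kern.timecounter.choice line."""
    # Drop the "kern.timecounter.choice:" prefix if it is present.
    value = line.partition(":")[2] or line
    pairs = re.findall(r"([\w-]+)\((-?\d+)\)", value)
    return {name: int(quality) for name, quality in pairs}

def default_timecounter(choices):
    """Pick the entry with the highest quality value."""
    return max(choices, key=choices.get)

line = "kern.timecounter.choice: TSC(-100) ACPI-safe(1000) i8254(0) dummy(-1000000)"
choices = parse_timecounter_choice(line)
print(choices)
print(default_timecounter(choices))
```

On this input the highest-quality entry is ACPI-safe, which matches the kern.timecounter.hardware line. The active counter can be changed at runtime with `sysctl kern.timecounter.hardware=TSC`; whether TSC is trustworthy depends on the hardware.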
I am not sure what the commands are in Linux. I haven't had ntp go
nuts on a Linux system so far.
Quoting Rob Middleton <robm-dovecot@centenary.org.au>:
...
On 6/10/2009 12:54 PM, PGNet Dev wrote:
...
<snip - from dom0>
looking at my ntp logs around the same time(s).
...
5 Oct 16:41:17 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 16:51:38 ntpd[5696]: time reset -2.140133 s
5 Oct 16:56:40 ntpd[5696]: synchronized to 66.220.9.122, stratum 1
5 Oct 17:01:28 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 17:07:20 ntpd[5696]: time reset -2.137760 s
5 Oct 17:11:49 ntpd[5696]: synchronized to 204.152.184.72, stratum 1
This indicates that ntpd is actually stepping the time 2 seconds into
the past approx every 900 seconds. So dovecot is correct that time has
moved backwards. You need to stop time moving backwards :-).
[so not dovecot's fault, and likely not xen's fault either]
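To make the pattern in the log excerpt easier to see, here is a small Python sketch that pulls out the "time reset" lines and prints the size of each step and the gap between steps. The year is not in the syslog-style timestamps, so 2009 is assumed from the date of the thread, and the helper name time_resets is invented for this example.

```python
from datetime import datetime

LOG = """\
5 Oct 16:41:17 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 16:51:38 ntpd[5696]: time reset -2.140133 s
5 Oct 16:56:40 ntpd[5696]: synchronized to 66.220.9.122, stratum 1
5 Oct 17:01:28 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
5 Oct 17:07:20 ntpd[5696]: time reset -2.137760 s
5 Oct 17:11:49 ntpd[5696]: synchronized to 204.152.184.72, stratum 1
"""

def time_resets(log_text, year=2009):
    """Return [(timestamp, offset_seconds)] for every 'time reset' line."""
    resets = []
    for line in log_text.splitlines():
        if "time reset" not in line:
            continue
        parts = line.split()
        # parts[0:3] is e.g. ['5', 'Oct', '16:51:38']; the year is assumed.
        stamp = datetime.strptime(
            f"{parts[0]} {parts[1]} {parts[2]} {year}", "%d %b %H:%M:%S %Y"
        )
        # The offset is the second-to-last field, just before the unit 's'.
        resets.append((stamp, float(parts[-2])))
    return resets

resets = time_resets(LOG)
offsets = [offset for _, offset in resets]
intervals = [
    (later[0] - earlier[0]).total_seconds()
    for earlier, later in zip(resets, resets[1:])
]
print("offsets (s):", offsets)
print("seconds between resets:", intervals)
```

With only two resets in the excerpt there is a single interval, so this only confirms the rough 15-minute spacing; a longer log would show whether the period is really constant.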
I'm no ntp expert, but I wonder if searching for 900s in the ntpd man
page might help (it caught my eye because of the step every 15
minutes). Could network congestion and excessive jitter be causing the
stepping?

Otherwise perhaps it is a problem with a bad hardware driver
occasionally stalling in the middle of an interrupt. Sorry - can't
provide any further pointers. It is highly dependent on your hardware,
kernel & drivers. If you have any other physical servers and they are
also showing 'time reset' messages, then the problem is some odd
network configuration - partial drop-outs and/or high jitter.

Unfortunately -x will not be a solution here, as slewing cannot
possibly correct for a drift as big as 2 seconds in every 900 seconds.
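A quick back-of-the-envelope check of why slewing cannot keep up, written as a Python sketch. It assumes ntpd's usual maximum slew rate of 500 ppm (0.5 ms per second) and rounds the observed step to 2 s per 900 s; the function names are just for illustration.

```python
MAX_SLEW_PPM = 500  # assumed ntpd maximum slew rate, parts per million

def max_slew_correction(interval_s, slew_ppm=MAX_SLEW_PPM):
    """Largest offset (seconds) that slewing can absorb in interval_s."""
    return interval_s * slew_ppm / 1_000_000

def required_rate_ppm(offset_s, interval_s):
    """Slew rate (ppm) needed to absorb offset_s within interval_s."""
    return abs(offset_s) / interval_s * 1_000_000

interval = 900   # seconds between steps, from the log
offset = 2.0     # approximate size of each step in seconds

print("slew can absorb (s):", max_slew_correction(interval))
print("rate needed (ppm):", round(required_rate_ppm(offset, interval)))
print("slew is enough:", required_rate_ppm(offset, interval) <= MAX_SLEW_PPM)
```

Under these assumptions slewing can absorb well under half a second per 900 s, while the clock is off by about 2 s in the same period, so the needed rate is several times the limit.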
You may want to try just a single upstream ntp server as a debugging
step (identify it by IP, not by a pool DNS record) and/or use the
prefer keyword against your favourite.
Cheers,
Rob Middleton.

Patrick Domack