[Dovecot] "Time just moved backwards" in Dovecot in a Xen DomU

Patrick Domack patrickdk at patrickdk.com
Tue Oct 6 16:39:47 EEST 2009


This reminds me of an odd issue I had as well, where my clock was also
stepped by a fixed amount at regular intervals. In the datacenter, one
server was limited to 10mbit half duplex, and I had endless ntp issues.
I could only replicate this offsite by running the same server at 10mbit
and fully saturating the network. Switching to full duplex almost solved
the issue.

But the real issue was that the timecounter chosen by the FreeBSD kernel,
ACPI in this case, was unreliable on that motherboard. Switching to a
different timecounter (TSC in this case) fixed the issue.

On FreeBSD (default):
kern.timecounter.choice: TSC(-100) ACPI-safe(1000) i8254(0) dummy(-1000000)
kern.timecounter.hardware: ACPI-safe
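
To make the sysctl output above easier to read: as far as I understand
it, the number in parentheses is the quality the kernel assigns to each
timecounter, the highest non-negative one is picked by default, and a
negative quality means "only use if explicitly requested". The switch
itself is done with "sysctl kern.timecounter.hardware=TSC" (put the same
setting in /etc/sysctl.conf to keep it across reboots). Below is a rough
Python sketch of that selection rule; parse_choice and auto_pick are
names made up for illustration, not anything FreeBSD provides.

```python
import re

# The kern.timecounter.choice value shown above.
CHOICE = "TSC(-100) ACPI-safe(1000) i8254(0) dummy(-1000000)"


def parse_choice(value):
    """Turn 'NAME(quality) NAME(quality) ...' into {name: quality}."""
    return {name: int(q) for name, q in re.findall(r"(\S+?)\((-?\d+)\)", value)}


def auto_pick(counters):
    """Return the highest-quality counter, ignoring negative qualities.

    Assumption: negative quality means the kernel never selects that
    counter on its own, so it is excluded here.
    """
    candidates = {n: q for n, q in counters.items() if q >= 0}
    return max(candidates, key=candidates.get) if candidates else None


if __name__ == "__main__":
    counters = parse_choice(CHOICE)
    print(counters)
    print("automatic choice:", auto_pick(counters))
```

With the values above this picks ACPI-safe, which matches what
kern.timecounter.hardware reported, and it shows why TSC (quality -100)
has to be selected by hand.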

I am not sure what the equivalent commands are on Linux. I haven't had
ntp go nuts on a Linux system so far.
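
For what it is worth, Linux exposes the same kind of information through
sysfs, in the files available_clocksource and current_clocksource under
/sys/devices/system/clocksource/clocksource0. Here is a small Python
sketch that reads them; read_clocksources is a hypothetical helper name,
and I have not checked how these files look inside a Xen DomU.

```python
from pathlib import Path

# Standard sysfs location of the kernel clocksource files on Linux.
SYSFS_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksources(base=SYSFS_DIR):
    """Return (available, current), or None if the files cannot be read."""
    base = Path(base)
    try:
        available = (base / "available_clocksource").read_text().split()
        current = (base / "current_clocksource").read_text().strip()
    except OSError:
        # Not Linux, or sysfs is not mounted or not readable.
        return None
    return available, current


if __name__ == "__main__":
    result = read_clocksources()
    if result is None:
        print("no clocksource information found")
    else:
        available, current = result
        print("available:", " ".join(available))
        print("current:  ", current)
```

Changing the clocksource at runtime is done by writing one of the
available names into current_clocksource as root; the clocksource=
kernel boot parameter makes the choice permanent.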


Quoting Rob Middleton <robm-dovecot at centenary.org.au>:

> On 6/10/2009 12:54 PM, PGNet Dev wrote:
>> <snip - from dom0>
>> looking at my ntp logs around the same time(s).
>>
>>  ...
>>  5 Oct 16:41:17 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
>>  5 Oct 16:51:38 ntpd[5696]: time reset -2.140133 s
>>  5 Oct 16:56:40 ntpd[5696]: synchronized to 66.220.9.122, stratum 1
>>  5 Oct 17:01:28 ntpd[5696]: synchronized to 64.125.78.85, stratum 1
>>  5 Oct 17:07:20 ntpd[5696]: time reset -2.137760 s
>>  5 Oct 17:11:49 ntpd[5696]: synchronized to 204.152.184.72, stratum 1
>>
> This indicates that ntpd is actually stepping the time 2 seconds  
> into the past approx every 900 seconds. So dovecot is correct that  
> time has moved backwards. You need to stop time moving backwards :-).
> [so not dovecot's fault, and likely not xen's fault either]
>
> I'm no ntp expert, but I wonder if searching for 900s in the ntpd  
> man page might help (caught my eye due to the step every 15 minutes  
> - network congestion and excessive jitter causing stepping)?  
> Otherwise perhaps a problem with a bad hardware driver stalling in  
> the middle of an interrupt occasionally. Sorry - can't provide any  
> further pointers. It is highly dependent on your hardware, kernel &  
> drivers. If you have any other physical servers and they are also  
> having 'time reset' error messages, then the problem is some odd  
> network configuration - partial drop-outs and/or high jitter.
>
> Unfortunately -x will not be a solution here as slew cannot possibly
> correct for a drift as big as 2 seconds in every 900 seconds.
>
> You may want to try just a single upstream ntp server as a debugging  
> step (identify it by IP, not by a pool DNS record) and/or use the  
> prefer keyword against your favourite.
>
> Cheers,
> Rob Middleton.
>
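
To put a number on Rob's point about -x: ntpd slews the clock at no more
than 500 ppm, while the two "time reset" lines quoted above work out to
a drift several times larger than that. The Python sketch below does the
arithmetic. It assumes the whole offset built up during the interval
between the two resets, and the year is filled in only so the timestamps
parse; parse_resets and drift_ppm are illustrative names.

```python
from datetime import datetime

# The two "time reset" lines from the quoted dom0 ntp log.
LOG = """\
5 Oct 16:51:38 ntpd[5696]: time reset -2.140133 s
5 Oct 17:07:20 ntpd[5696]: time reset -2.137760 s
"""

MAX_SLEW_PPM = 500.0  # maximum rate at which ntpd slews the clock


def parse_resets(log, year=2009):
    """Return [(timestamp, offset_seconds)] for every 'time reset' line."""
    resets = []
    for line in log.splitlines():
        if "time reset" not in line:
            continue
        fields = line.split()
        stamp = f"{fields[0]} {fields[1]} {year} {fields[2]}"
        when = datetime.strptime(stamp, "%d %b %Y %H:%M:%S")
        resets.append((when, float(fields[-2])))
    return resets


def drift_ppm(resets):
    """Drift implied by the last reset, in parts per million.

    Assumption: the offset corrected by the last reset accumulated
    entirely since the reset before it.
    """
    (t0, _), (t1, offset) = resets[-2], resets[-1]
    interval = (t1 - t0).total_seconds()
    return abs(offset) / interval * 1e6


if __name__ == "__main__":
    ppm = drift_ppm(parse_resets(LOG))
    print(f"implied drift: {ppm:.0f} ppm, slew limit: {MAX_SLEW_PPM:.0f} ppm")
    print("slew can keep up" if ppm <= MAX_SLEW_PPM else "slew cannot keep up")
```

The two resets are 942 seconds apart, so the implied drift comes out at
well over 2000 ppm, which is why slewing alone cannot keep up here.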