Re: maintainer-feedback requested: [Bug 280929] mail/dovecot move bogus warning "Time moved forwards" to debug

24 Aug 2024


I'm speaking as stenn@ntp.org here, but I'm not subscribed to this list
via that email address.
On 8/23/2024 7:06 PM, Jochen Bern via dovecot wrote:
...
On 21.08.24 11:35, Timo Sirainen wrote:
...
...
[Lots and lots of "but my NTP sync is much more precise than that" in
the FreeBSD thread]
The way Dovecot works is:
- It finds the next timeout, sees that it happens in e.g. 5
milliseconds.
- Then it calls kqueue() to wait for I/O for max 5 milliseconds
- Then it notices that it actually returned more than 105 milliseconds
later, and then logs a warning about it.
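As a rough illustration of the loop described above, here is a minimal Python sketch, not Dovecot's actual code. It waits for a requested timeout, measures the elapsed time on both the wall clock and a monotonic clock, and reports an overshoot above a threshold. The 100 ms threshold is an assumption inferred from the 5 ms / 105 ms figures quoted above, and the names `wait_and_measure` and `overshoot` are hypothetical. A plain select() stands in for kqueue().

```python
import select
import time

# Assumed threshold, inferred from the 5 ms / 105 ms example in the thread.
WARN_THRESHOLD_S = 0.100


def wait_and_measure(timeout_s, wait=None):
    """Wait up to timeout_s; return (wall_elapsed, monotonic_elapsed) in seconds."""
    if wait is None:
        # select() with no descriptors is a plain timed wait on Unix-like systems.
        wait = lambda t: select.select([], [], [], t)
    wall_start, mono_start = time.time(), time.monotonic()
    wait(timeout_s)
    return time.time() - wall_start, time.monotonic() - mono_start


def overshoot(requested_s, elapsed_s, threshold_s=WARN_THRESHOLD_S):
    """Return the overshoot in seconds if it exceeds the threshold, else None."""
    extra = elapsed_s - requested_s
    return extra if extra > threshold_s else None


if __name__ == "__main__":
    wall, mono = wait_and_measure(0.005)
    for name, elapsed in (("wall", wall), ("monotonic", mono)):
        late = overshoot(0.005, elapsed)
        if late is not None:
            print(f"{name} clock: wait returned {late * 1000:.0f} ms late")
```

If only the wall-clock figure overshoots, the system time was probably adjusted during the wait. If the monotonic figure overshoots as well, the process really was dormant for that long.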
I think that more information is needed to pinpoint possible causes, and
one of the open questions is: What clock does dovecot look at to
determine how long it *actually* stayed dormant? On Linux, software that
needs a monotonically increasing "time" to derive guaranteed unique
IDs from often looks at the kernel uptime - which is essentially a count
of ticks since bootup, and is *not* corrected by NTP.
Similarly, it should be determined whether the timeouts of the I/O
function called (i.e., kqueue()) are or aren't influenced by NTP's
corrections to system time.
The third piece of information I'd like to have is what client software
provides that NTP sync to the machine: ntpd, chronyd, something else?
(As an example of why this is relevant: several hundred deviations of
100 ms or more per day add up to tens of seconds per day, provided they
are all in the same direction, i.e. several multiples of 115 ppm.
Forward step or slew adjustments should be no problem.
Backward adjustments must be slewed, to keep time monotonic.
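The arithmetic behind the quoted ppm figure can be checked with a short Python sketch. The helper name `drift_ppm` is hypothetical. It uses 100 deviations of 100 ms each as the smallest case and assumes they all point in the same direction.

```python
SECONDS_PER_DAY = 86_400


def drift_ppm(accumulated_offset_s, period_s=SECONDS_PER_DAY):
    """Express an offset accumulated over period_s as a frequency error in ppm."""
    return accumulated_offset_s / period_s * 1_000_000


# 100 deviations of 100 ms, all in the same direction, is 10 s per day.
total_offset_s = 100 * 0.100
print(f"{total_offset_s:.0f} s/day is about {drift_ppm(total_offset_s):.1f} ppm")
```

Ten seconds per day works out to roughly 115.7 ppm, so several hundred such deviations per day correspond to several multiples of that figure.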
...
ntpd refuses to
do *slews* correcting by more than 500 ppm;
This is news to me.
See
https://www.ntp.org/documentation/4.2.8-series/ntpd/#command-line-options
for more information.
See the docs for -g and -x, for example.
Also see https://www.ntp.org/documentation/4.2.8-series/ntp.conf/ and
the 'panic', 'step', and 'stepback' options.
If what we offer does not satisfy your requirements, please let me know
and we'll find a way to improve things.
...
if the OS clock's frequency
error exceeds that, ntpd would need to do *steps* every now and then,
and in a default configuration, an ntpd will refuse to do a *second*
step and *die* instead.
That is not ntpd's default behavior, but it does happen if the -g option
is present.  I have ideas on how to address this, probably in the
upcoming ntp-4.4 release.
Again, forward steps should not be a problem for dovecot, and backward
adjustments can be forced to be slewed.
...
Or, if the reference clock sways *back and
forth*, ntpd should very likely complain about its sources' jitter in
the logs. chronyd, however, is more ruthless in whacking the local clock
into "sync" with the external sources, and much more inclined to define
"sync" as "low difference", rather than also taking frequency stability
into account like ntpd.)
My understanding of what Miroslav told me is that chronyd picks a source
of time and tracks it as well and as quickly as it can, and at some point
may pick a new source.
Ntpd identifies "correct time" as best it can, from a useful number of
qualified sources.  It does this *well*, and ntpd will take its time to
make sure this happens in a stable and predictable way.  Ntpd drives to
"correct time", which may be in the "middle" of the set of qualified
targets.
...
...
Also, this is kind of a problem when it does happen. Since Dovecot
thinks the time moved e.g. 100ms forward, it adjusts all timeouts to
happen 100ms backwards. If this wasn't a true time jump, then these
timeouts now happen 100ms earlier.
That is, of course, a dangerous approach if you do *not* have a
guarantee that the timeouts of the I/O function called are *otherwise*
true to the requested duration. But shouldn't those other concurrently-
running timeouts notice an actual discontinuity of the timescale just
the same as the first one did? Maybe some sort of "N 'nay's needed for a
vote of nonconfidence" mechanism would be safer ...
Important stuff, and difficult to do with current APIs.
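One possible reading of the "N nays" idea quoted above, sketched in Python purely for illustration: only treat the clock as having jumped once several consecutive waits have overshot, rather than reacting to a single late wake-up. The class name `JumpVote`, the default of 3 votes, and the 100 ms threshold are all assumptions. Nothing here reflects Dovecot's implementation.

```python
class JumpVote:
    """Report a time jump only after `needed` consecutive late wake-ups."""

    def __init__(self, needed=3, threshold_s=0.100):
        self.needed = needed            # assumed number of "nay" votes required
        self.threshold_s = threshold_s  # assumed overshoot that counts as a vote
        self.streak = 0

    def observe(self, requested_s, elapsed_s):
        """Record one wait; return True once enough consecutive waits overshot."""
        if elapsed_s - requested_s > self.threshold_s:
            self.streak += 1
        else:
            # An on-time wake-up resets the count of consecutive late ones.
            self.streak = 0
        return self.streak >= self.needed


if __name__ == "__main__":
    vote = JumpVote(needed=2)
    for elapsed in (0.200, 0.200, 0.006):
        print(vote.observe(0.005, elapsed))
```

The trade-off is latency: a genuine discontinuity is acted on only after the Nth observation, in exchange for ignoring isolated scheduling delays.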
...
Kind regards,
H

Harlan Stenn