At 11:10 AM +0200 6/20/08, Anders wrote:
Johannes Berg <johannes@sipsolutions.net> writes:
On Fri, 2008-06-20 at 10:53 +0200, Anders wrote:
I was puzzled that it was always 4398 seconds, in particular because this server runs an NTP daemon. A little searching for this problem shows that it is an issue with the Linux kernel gettimeofday(), see e.g. http://lkml.org/lkml/2007/8/23/96
The thread puts it down to buggy hardware and puts a workaround into the kernel where it belongs, not in dovecot.
I think it is more accurate to say "hardware being used for a purpose its designers did not intend." Using the TSC as a clock has been iffy for quite some time, and defaulting to it in the kernel is a risky design choice that must be implemented with extreme caution. It's not that the hardware is buggy, but rather that it does things by design that are not obvious from a high-level description.
That's not helpful.
By that line, the entire "time moved backwards" thing does not belong in Dovecot.
I suspect that you don't understand why that is in Dovecot. Timo has explained it in detail a few times, but the bottom line is simple: running through the same system-clock time more than once induces a very real risk of destroying mail.
Anyway, I was not proposing the patch to be included, just asking for advice as to whether it would be safe. I even noted that it was ugly.
"Safe" is subjective. I think it would be safer (at the cost of a bounded amount of time) to nanosleep or maybe usleep once and retry the call rather than to go into the loop.
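To make the sleep-once-and-retry idea concrete, here is a rough sketch. It is not Dovecot code and not the patch under discussion; the function name gettimeofday_retry_once and the sleep_usec parameter are made up for illustration. The caller passes the last timestamp it saw, and the function makes at most two clock reads with one bounded sleep between them, so it cannot spin.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Returns nonzero if a is strictly earlier than b. */
static int tv_before(const struct timeval *a, const struct timeval *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec < b->tv_sec;
    return a->tv_usec < b->tv_usec;
}

/*
 * Hypothetical helper: read the clock into *now. If the reading is earlier
 * than *last, sleep once for sleep_usec microseconds and read again.
 * Returns 0 if the final reading is not earlier than *last,
 * -1 if the clock call failed or time still appears to have moved backwards.
 */
static int gettimeofday_retry_once(const struct timeval *last,
                                   struct timeval *now, long sleep_usec)
{
    struct timespec delay;
    int attempt;

    assert(last != NULL && now != NULL && sleep_usec >= 0);

    delay.tv_sec = sleep_usec / 1000000L;
    delay.tv_nsec = (sleep_usec % 1000000L) * 1000L;

    for (attempt = 0; attempt < 2; attempt++) {
        if (gettimeofday(now, NULL) != 0)
            return -1;
        if (!tv_before(now, last))
            return 0;
        if (attempt == 0) {
            /* One bounded pause; an early wakeup from a signal is ignored. */
            nanosleep(&delay, NULL);
        }
    }
    return -1; /* still backwards after one retry: let the caller decide */
}
```

On a -1 return the caller still has to pick a policy (log and bail out, for instance); the point of the sketch is only that the wait is bounded by sleep_usec instead of being an open-ended loop.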
As I am already compiling Dovecot myself, I prefer a patch there, rather than diverting from the distribution kernel.
You might even be better off configuring your system to not use the TSC as a clock source. There's a strong chance that you won't really be sacrificing anything that you actually use.
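If you want to check what the kernel is currently using, the active clock source is exposed through sysfs on kernels that have the clocksource framework. The sketch below just reads that file; the helper name clocksource_is_tsc is made up, and the path is taken as a parameter so it can be pointed elsewhere. Changing the source is normally done by writing a name from available_clocksource (in the same directory) to that file as root, or with the clocksource= boot parameter; which alternatives are offered depends on your hardware and kernel.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Standard sysfs location of the active clock source on Linux. */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Hypothetical helper: returns 1 if the file at path names "tsc",
 * 0 if it names something else, -1 if it cannot be read.
 */
static int clocksource_is_tsc(const char *path)
{
    char name[64];
    FILE *f;

    assert(path != NULL);

    f = fopen(path, "r");
    if (f == NULL)
        return -1;
    if (fgets(name, sizeof(name), f) == NULL) {
        fclose(f);
        return -1;
    }
    fclose(f);

    name[strcspn(name, "\n")] = '\0'; /* strip the trailing newline */
    return strcmp(name, "tsc") == 0;
}
```

Calling clocksource_is_tsc(CLOCKSOURCE_PATH) on the mail server would tell you whether the TSC is the source in use.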
-- Bill Cole bill@scconsult.com