[Dovecot] Return error instead of dying on time back skip?
Hello everybody!
Currently, dovecot just kills itself if it detects that time has moved backwards more than a hardcoded number of seconds. I accept the reasons, but I do not like to restart dovecot manually after waiting for time to move forward again. A cron job would not help, because time might still be wrong when it restarts dovecot.
All our systems run ntpd, but they might be offline for a while before they get contact to a time server, e.g. because of DSL problems. When they do get contact and time is too far off, ntpd sets the new time directly (yes, it could gradually do that, but it might take ages).
Now I wonder if Dovecot could return errors to the users instead of dying until time is fine again, e.g. "System time has moved backwards, please come back in n seconds". If the time skip is just a few seconds, it can of course delay and then go on as it does now.
With this change, no admin would be needed to carefully restart Dovecot at the right time. I have not looked into Dovecot code myself yet, but could try a patch if necessary.
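A minimal sketch of the behaviour proposed above, for illustration only: tolerate small backward skips by delaying, and answer clients with an error for large ones. Every name here (`classify_time_skip`, `format_time_skip_error`, the `CLOCK_*` values, `max_delay_secs`) is hypothetical and not taken from the Dovecot sources.

```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical names throughout; none of this is taken from Dovecot. */
enum clock_action {
    CLOCK_OK,       /* time did not move backwards */
    CLOCK_DELAY,    /* small skip: wait it out, then carry on */
    CLOCK_REJECT    /* large skip: return an error until time catches up */
};

/* Decide what to do when the wall clock reads `now` although the latest
   time seen so far was `last_seen`. `max_delay_secs` is an assumed,
   configurable threshold. */
static enum clock_action
classify_time_skip(time_t last_seen, time_t now, int max_delay_secs)
{
    assert(max_delay_secs >= 0);
    if (now >= last_seen)
        return CLOCK_OK;
    if (last_seen - now <= max_delay_secs)
        return CLOCK_DELAY;
    return CLOCK_REJECT;
}

/* Write the client-visible error into `buf` and return the number of
   seconds left until the clock reaches `last_seen` again. */
static long
format_time_skip_error(time_t last_seen, time_t now, char *buf, size_t size)
{
    long remaining = now < last_seen ? (long)(last_seen - now) : 0;

    snprintf(buf, size,
             "System time has moved backwards, please come back in %ld seconds",
             remaining);
    return remaining;
}
```

With a threshold of 5 seconds, a skip of 3 seconds would be classified as `CLOCK_DELAY` and a skip of 100 seconds as `CLOCK_REJECT`, in which case the message tells the client to come back in 100 seconds.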
Amon Ott
Amon Ott - m-privacy GmbH Am Köllnischen Park 1, 10179 Berlin Tel: +49 30 24342334 Fax: +49 30 24342336 Web: http://www.m-privacy.de Handelsregister: Amtsgericht Charlottenburg HRB 84946 Geschäftsführer: Dipl.-Kfm. Holger Maczkowsky, Roman Maczkowsky GnuPG-Key-ID: EA898571
Amon Ott wrote:
All our systems run ntpd, but they might be offline for a while before they get contact to a time server, e.g. because of DSL problems. When they do get contact and time is too far off, ntpd sets the new time directly (yes, it could gradually do that, but it might take ages).
You might want to consider using clockspeed:
http://cr.yp.to/clockspeed.html
instead of ntpd, since clockspeed handles the underlying clock skew in a more robust fashion. This software needs only a couple of connections to figure out how badly the underlying hardware clock drifts, and then automatically adjusts for it in a smooth fashion.
If you haven't used Dan Bernstein's software before, you will probably want to use clockspeed-conf:
http://foo42.de/devel/sysutils/clockspeed-conf/
to manage the daemons.
John
-- John Peacock Director of Information Research and Technology Rowman & Littlefield Publishing Group 4501 Forbes Boulevard Suite H Lanham, MD 20706 301-459-3366 x.5010 fax 301-429-5748
Hello Amon,
Amon Ott, 02.05.2007 (d.m.y):
All our systems run ntpd, but they might be offline for a while before they get contact to a time server, e.g. because of DSL problems.
Define one of your "internal" systems as "master time server" that connects to other NTP servers outside your networks and make your other systems synchronize their system time with this machine.
Gruss/Regards, Christian Schmidt
-- Writing is easy; all you do is sit staring at the blank sheet of paper until drops of blood form on your forehead. -- Gene Fowler
On Friday 04 May 2007 10:07, Christian Schmidt wrote:
Hello Amon,
Amon Ott, 02.05.2007 (d.m.y):
All our systems run ntpd, but they might be offline for a while before they get contact to a time server, e.g. because of DSL problems.
Define one of your "internal" systems as "master time server" that connects to other NTP servers outside your networks and make your other systems synchronize their system time with this machine.
Thanks for the tip, but these are customer systems. Most of them only buy one server system from us and have Windows clients behind it, so there is no other system available that could be the master. Instead, our server is supposed to be the time master. :)
From the customer's point of view, the mail server stops working for no apparent reason. An error message would help them understand the problem, and the proposed behaviour would keep dovecot running. Right now some of them just reboot the system (Windows users...), which is not the desired solution.
Amon Ott
On Wed, 2007-05-02 at 10:52 +0200, Amon Ott wrote:
Now I wonder if Dovecot could return errors to the users instead of dying until time is fine again, e.g. "System time has moved backwards, please come back in n seconds". If the time skip is just a few seconds, it can of course delay and then go on as it does now.
With this change, no admin would be needed to carefully restart Dovecot at the right time. I have not looked into Dovecot code myself yet, but could try a patch if necessary.
I think this is just way too much trouble for handling a situation that really shouldn't be happening in the first place. The code already allows the clock to move backwards by 5 seconds without dying, so how horrible are the clocks in those computers? :)
Anyway, it's easy to increase that limit by modifying the sources: see IOLOOP_MAX_TIME_BACKWARDS_SLEEP in src/lib/ioloop.c.
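As a rough illustration of the limit described above, a sketch could look like the following. The macro name is the one mentioned in the message and the value 5 comes from the message as well; the helper `backwards_sleep_secs` and its return convention are hypothetical, and the real code in src/lib/ioloop.c may be structured quite differently.

```c
#include <assert.h>
#include <time.h>

/* Named after the macro mentioned above. The value 5 is taken from the
   message, not checked against src/lib/ioloop.c; raising it and
   rebuilding would tolerate larger backward steps. */
#define IOLOOP_MAX_TIME_BACKWARDS_SLEEP 5

/* Hypothetical helper: returns how many seconds to sleep so that the
   clock catches up with `last_seen`, 0 if time did not move backwards,
   or -1 if the step is larger than the tolerated limit (the case in
   which the server currently gives up). */
static int
backwards_sleep_secs(time_t last_seen, time_t now)
{
    time_t diff;

    if (now >= last_seen)
        return 0;
    diff = last_seen - now;
    assert(diff > 0);
    if (diff > IOLOOP_MAX_TIME_BACKWARDS_SLEEP)
        return -1;
    return (int)diff;
}
```

Under these assumptions a 4-second backward step yields a 4-second sleep, while a 6-second step returns -1.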
On Fri, 11 May 2007 14:50:54 +0300 Timo Sirainen <tss@iki.fi> wrote:
The code already allows the clock to move backwards by 5 seconds without dying, so how horrible are the clocks in those computers? :)
Clock drift of about 13 seconds/day (150 PPM) is (unfortunately) not uncommon, and 4-6 seconds/day (50-75 PPM) is about the norm for PC hardware in my experience.
Of course, this is exactly the reason why you should run ntpd instead of ntpdate on a cron job (especially a once-per-day cron job...)
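The drift figures quoted above follow directly from the definition of PPM (parts per million) and the 86400 seconds in a day. The small sketch below does the arithmetic; the function names are made up for illustration.

```c
#include <assert.h>

#define SECONDS_PER_DAY 86400.0

/* Seconds of drift accumulated per day by a clock that is off by `ppm`
   parts per million. */
static double
drift_secs_per_day(double ppm)
{
    assert(ppm >= 0.0);
    return ppm * 1e-6 * SECONDS_PER_DAY;
}

/* Hours a free-running clock needs to drift by `secs` seconds at a
   constant error of `ppm` parts per million. */
static double
hours_to_drift(double secs, double ppm)
{
    assert(ppm > 0.0);
    return secs / (ppm * 1e-6) / 3600.0;
}
```

150 PPM works out to 12.96 seconds/day and 50-75 PPM to 4.32-6.48 seconds/day, matching the numbers above. At 150 PPM an uncorrected clock needs a little over 9 hours to drift by 5 seconds.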
-- Ben Winslow <rain@bluecherry.net>
Ben Winslow wrote:
Clock drift of about 13 seconds/day (150 PPM) is (unfortunately) not uncommon, and 4-6 seconds/day (50-75 PPM) is about the norm for PC hardware in my experience.
Of course, this is exactly the reason why you should run ntpd instead of ntpdate on a cron job (especially a once-per-day cron job...)
I would again recommend clockspeed:
http://cr.yp.to/clockspeed.html
http://foo42.de/devel/sysutils/clockspeed-conf/
for machines which don't have a continuous connection to the Internet (where [x]ntpd won't do you any good). It handily reins in bad clock crystals with only a couple of external connections per month.
John
On 11/05/07, Timo Sirainen <tss@iki.fi> wrote:
On Wed, 2007-05-02 at 10:52 +0200, Amon Ott wrote:
Now I wonder if Dovecot could return errors to the users instead of dying until time is fine again, e.g. "System time has moved backwards, please come back in n seconds". If the time skip is just a few seconds, it can of course delay and then go on as it does now.
With this change, no admin would be needed to carefully restart Dovecot at the right time. I have not looked into Dovecot code myself yet, but could try a patch if necessary.
I think this is just way too much trouble for handling a situation that really shouldn't be happening in the first place.. The code already allows the clock to move backwards by 5 seconds without dying, so how horrible are the clocks in those computers? :)
Anyway, it's easy to increase that limit by modifying the sources: see IOLOOP_MAX_TIME_BACKWARDS_SLEEP in src/lib/ioloop.c.
Just to add that I've also been bitten by this after updating from rc18 to 1.0.0. My current kludge is to restart Dovecot from cron until I get around to recompiling/fixing it.
I'm running under a Virtuozzo VPS and there is no way to run ntp. The underlying server is ntp sync'd anyway and that gets passed up to the VPSs. So this shouldn't really be happening. It usually kills itself after about 40-50 minutes of running (although the time varies) so Dovecot is 'seeing' over 5 secs of slippage in an hour, which is unlikely. FWIW, I run a couple of servers with Virtuozzo and no other software has registered a problem.
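One way to check whether the clock inside such a VPS really steps backwards would be to compare the wall clock against the monotonic clock over time. The sketch below is only an assumption about how that could be done with POSIX `clock_gettime()`; `struct clock_sample`, `take_clock_sample` and `wall_clock_step` are hypothetical names, and whether `CLOCK_MONOTONIC` behaves sensibly inside a given container is not something this thread confirms.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* One reading of both clocks, taken back to back. */
struct clock_sample {
    struct timespec wall;   /* CLOCK_REALTIME: can be stepped */
    struct timespec mono;   /* CLOCK_MONOTONIC: never goes backwards */
};

/* Fill `s` with the current clock readings; returns 0 on success. */
static int
take_clock_sample(struct clock_sample *s)
{
    assert(s != NULL);
    if (clock_gettime(CLOCK_REALTIME, &s->wall) != 0)
        return -1;
    return clock_gettime(CLOCK_MONOTONIC, &s->mono);
}

/* Elapsed seconds from `a` to `b`. */
static double
timespec_diff(const struct timespec *a, const struct timespec *b)
{
    return (double)(b->tv_sec - a->tv_sec) +
           (double)(b->tv_nsec - a->tv_nsec) / 1e9;
}

/* How far the wall clock was adjusted between two samples, relative to
   the monotonic clock. A negative result means the wall clock fell
   behind or was stepped backwards by that many seconds. */
static double
wall_clock_step(const struct clock_sample *earlier,
                const struct clock_sample *later)
{
    return timespec_diff(&earlier->wall, &later->wall) -
           timespec_diff(&earlier->mono, &later->mono);
}
```

Taking a sample every minute and logging whenever `wall_clock_step()` drops below -5 would show whether the 5-second limit is really being crossed, and when.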
cheers jalal
Participants (6): Amon Ott, Ben Winslow, Christian Schmidt, jalal, John Peacock, Timo Sirainen