[Dovecot] Time moved backwards

Bill Cole dovecot-20061108 at billmail.scconsult.com
Thu May 15 02:02:56 EEST 2008


At 10:20 PM +0400 5/14/08, Eugene wrote:
>Hi people,
>
>>From: Adam McDougall <mcdouga9 at egr.msu.edu>
>>I would just like to mention a circumstance that happened to me this
>>Sunday.  We had a total power outage in our building, longer than our
>>UPS's could last and we don't have a generator for servers (nor is it
>>economical or needed).  When the power came back on, my local NTP server
>>came on at the same time as my mail servers, as well as a majority of my
>>other servers.  My servers tried to step their time to be in sync with
>>my local NTP server, which was still busy trying to sync itself with
>>outside sources, which takes a while, so my mail servers did not get an
>>answer.  Later, dovecot died because the time finally synced, and I
>>found out why pretty quick (have seen this before) but this was an
>>unusual situation.
>>
>>My point is, we had an unusual circumstance, and even though I've taken
>>steps to have my mail servers sync their time at boot and run ntpd
>>afterwards, there are some circumstances in which this is not enough,
>>and dovecot still died.  It's not always because someone was lazy about
>>their time setup.
>
>My point exactly. It's amazing how some people are quick to ramble 
>about someone else's administrative incompetence without taking time 
>to read the situation.

I most certainly did read your description of the situation, and my 
use of the phrase "administrative incompetence" should not be taken 
personally. I did not say (or mean) "administrator incompetence" and 
would not try to make that sort of judgment at a distance.

>  (One person even suggested hacking the dovecot startup script to 
>run ntpdate -- useless as ntpd already occupies the ports).

That's one of the things that "ntpdate -u" is good for: with -u it 
sends from an unprivileged port, so it works while ntpd holds port 123.
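
As an illustration of the "ntpdate -u" suggestion, here is a minimal 
Python sketch of a one-shot clock step that a startup script could 
attempt before launching mail services. The function name step_clock, 
the 30-second timeout, and the server name in the usage comment are 
assumptions made for the example; only the "ntpdate -u <server>" 
invocation comes from the discussion.

```python
import subprocess


def step_clock(server, timeout=30.0, run=subprocess.run):
    """Try a one-shot clock step with `ntpdate -u`; return True on success.

    -u makes ntpdate use an unprivileged source port, so the call does
    not conflict with an ntpd that is already bound to port 123.
    The `run` parameter exists only so the call can be stubbed in tests.
    """
    cmd = ["ntpdate", "-u", server]
    try:
        result = run(cmd, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        # Timed out waiting for an answer, or ntpdate is not installed.
        return False
    return result.returncode == 0


# Hypothetical usage (the server name is a placeholder):
#   if not step_clock("ntp.example.org"):
#       print("clock not stepped; starting services anyway")
```

A bounded timeout matters here: if the local NTP server is itself 
still syncing, the step simply fails and the caller decides whether 
to start services with an unsynced clock.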


>Fact is, ntpd can take an unpredictable delay before the initial 
>time-step. That delay can't be controlled, and it would be 
>unreasonable to delay starting mail services until it is guaranteed 
>to complete. Then dovecot dies, and the admin (who is not always 
>immediately available) has to start it manually anyway (especially 
>as it is not clear what to do with possibly unsynced timestamps) -- 
>only after the unnecessary downtime.

Or you can have an external watchdog that re-launches Dovecot if it 
dies. This approach handles a broader set of failure modes, and on 
some OSes it is a built-in feature of the startup subsystem.

Dovecot may be running in an environment with an external watchdog, 
perhaps one like launchd or classical SysV/Solaris init, that can 
catch the exit of the process it spawned and use it to trigger an 
immediate respawn. This means that adding an internal respawn inside 
Dovecot that will not cause breakage on any system is not as simple 
as it may seem.
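
The external-watchdog idea can be sketched in a few lines of Python. 
This is a rough illustration, not how launchd or init actually work: 
the function name supervise, the restart cap, and the pause between 
restarts are all assumptions, and the dovecot path in the usage 
comment is a placeholder for whatever the local install uses.

```python
import subprocess
import time


def supervise(cmd, max_restarts=3, delay=1.0, sleep=time.sleep):
    """Run cmd in the foreground and relaunch it each time it exits.

    Returns the list of exit codes observed. The loop stops after
    max_restarts relaunches so that a process which dies immediately
    cannot respawn forever.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        proc = subprocess.Popen(cmd)
        exit_codes.append(proc.wait())  # blocks until the child exits
        if attempt < max_restarts:
            sleep(delay)  # short pause to avoid a tight respawn loop
    return exit_codes


# Hypothetical usage; the process must stay in the foreground so the
# watchdog can wait on it (dovecot's -F option does this):
#   supervise(["/usr/local/sbin/dovecot", "-F"], max_restarts=5, delay=5.0)
```

The sketch also shows why a respawn built into the daemon is delicate: 
if the supervised program relaunched itself as well, two mechanisms 
would be competing to restart the same service.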

>So, the question is: why on earth can't we add a single line of code 
>to dovecot to restart itself after terminating?

You can do just that yourself if you believe that it is the best 
option for your circumstances and adequate to handle the problem you 
are having. One line of code might well do the trick you want on your 
system. If Timo puts the functionality in the code he distributes, it 
will need to be a great deal more than one line of code.

>Kind of reminds me of the "fsck_y_enable=YES" option in rc.conf. 
>Without it, if fsck does not like something during reboot, the server 
>would just sit there in single-user prompt, waiting for (expensive) 
>console operations.

Which is actually the right choice in some circumstances.


-- 
Bill Cole                                  
bill at scconsult.com


