On Jul 1, 2008, at 12:51 AM, Timo Sirainen wrote:
>> Is this already a known problem? Should the start-up logic be made
>> more robust (e.g. check whether a process corresponding to the PID
>> actually exists)?
>
> It already checks if the PID exists, but it doesn't check what that
> process is (and I don't think there is a portable way to do it
> anyway). I don't think it's too much to ask to delete the master.pid
> if in rare situations it fails to start due to a PID conflict.
This is a pet peeve of mine for many services started at boot time.
Since the ordering of service startup is usually fairly static, a
*LOT* of times process IDs are nearly identical from one boot to the
next. Depending on which way they drift, the PID recorded in the old
pid file may already be in use by some other process when the service
starts. This drove me NUTS with Sun's LDAP server.
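To make the collision concrete, here is a minimal sketch of an existence-only check of the kind discussed above. The function name pid_in_use is made up for illustration and is not taken from any real init script or from Dovecot itself. It assumes the pid file holds a single numeric PID and that the caller is root (as init scripts usually are), since kill -0 also fails on a live process that the caller is not allowed to signal.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the PID recorded in a pid file
# belongs to *some* live process. It cannot tell whose process it is,
# which is exactly how a stale file can block a start after a reboot.
pid_in_use() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1      # no readable pid file: nothing to collide with
    pid=$(cat "$pidfile") || return 1
    case $pid in
        ''|*[!0-9]*) return 1 ;;       # empty or non-numeric: treat as stale
    esac
    # Signal 0 delivers nothing; it only probes whether the PID exists
    # and may be signalled by the caller.
    kill -0 "$pid" 2>/dev/null
}
```

After a reboot, a check like this reports "in use" whenever any unrelated service happened to be assigned the old PID.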
Many recent OSes are now using memory-based filesystems for /var/run,
or otherwise clear out /var/run at boot time. But if a process stores
its PID somewhere else (as Sun One Directory Server does), you're
SOL.
The problem with having to remove a master.pid file on boot is that
you might have a BUNCH of clients or customers that are using your
system, and you're either asleep at 3am when the server kicks over,
or in another state. It's not a problem if you have staff watching
machines reboot. ;-)
Sorry, had to kibitz.
Sean
PS I often add an 'rm $PID' line to the init.d script, and let the
server die if it can't bind to the port. That doesn't work with
everything, though.
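A rough sketch of that workaround, for illustration only. The function name clear_pid_and_start and the paths in the example call are placeholders, not taken from any real init script.

```shell
#!/bin/sh
# Hypothetical init.d fragment: drop any leftover pid file, then start.
# If an old instance is in fact still running, the new one is expected
# to exit on its own because it cannot bind its port.
clear_pid_and_start() {
    pidfile=$1
    shift
    rm -f "$pidfile"    # the 'rm $PID' step; -f ignores a missing file
    "$@"                # start command, supplied by the caller
}

# Example call with placeholder paths (both are assumptions):
# clear_pid_and_start /var/run/example/master.pid /usr/sbin/exampled
```

This relies on the port bind as the real lock, so it offers no protection for a daemon that does not listen on a fixed port.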