On Sun, 2006-05-07 at 16:56, Udo Rader wrote:
On Sun, 2006-05-07 at 15:44 -0400, Charles Marcus wrote:
Yes, you are right, the cause of this incident was faulty memory and I don't blame reiserfs for failing due to this. But the effect was an unrepairable filesystem, and that again was a problem with the repair tools available then.
Not necessarily... faulty memory could cause corruption that NO file system repair tools could repair.
Hmm, what kind of corruption should that be? I was not talking about individual files being lost but an entire partition being inaccessible.
Everything that's on the disk was written there from a memory buffer.
At least in my naive world this is something that should never happen at all (unless the storage media breaks). AFAIK, any modern filesystem keeps backups of mission-critical data such as superblocks, but please feel free to correct me if I am missing something here.
If the memory buffer does not retain what the OS attempted to store there, the on-disk copy isn't going to be correct either - including as many copies as you might try to make.
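To make that concrete, here is a minimal sketch in Python. It is only a simulation under stated assumptions: the names flip_bit and write_copies and the "superblock" contents are hypothetical, and nothing here reflects how reiserfs or any real filesystem lays out its metadata. It flips one bit in a buffer before that buffer is written out three times, then checks what the redundant copies can tell you.

```python
from typing import List


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted (simulates a bad memory cell)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def write_copies(buffer: bytes, n_copies: int) -> List[bytes]:
    """Simulate writing the same in-memory buffer to several on-disk locations."""
    return [bytes(buffer) for _ in range(n_copies)]


# What the OS intended to store (hypothetical contents, for illustration only).
intended = b"superblock: root=42 blocks=1000"

# Faulty RAM alters the buffer before any copy reaches the disk.
in_memory = flip_bit(intended, 100)

# Every "backup" is written from the same damaged buffer.
copies = write_copies(in_memory, 3)

all_copies_agree = len(set(copies)) == 1
any_copy_correct = any(c == intended for c in copies)

print("copies agree with each other:", all_copies_agree)
print("any copy matches what was intended:", any_copy_correct)
```

The copies agree with each other, so comparing them finds nothing wrong, yet none of them holds what the OS meant to write. A repair tool that trusts the backups has nothing good to restore from.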
But to get back on topic, what I was trying to say was that the quality and availability of disaster recovery tools/procedures is of major importance when choosing a FS for any server, and that is where reiserfs failed, at least for my part.
There's a reason that servers usually have ECC memory. You are better off having uncorrectable errors stop the machine than continuing with corruption.
-- Les Mikesell lesmikesell@gmail.com