[Dovecot] Better to use a single large storage server or multiple smaller for mdbox?

Ed W lists at wildgooses.com
Thu Apr 12 13:58:52 EEST 2012


On 12/04/2012 11:20, Stan Hoeppner wrote:
> On 4/11/2012 9:23 PM, Emmanuel Noobadmin wrote:
>> On 4/12/12, Stan Hoeppner <stan at hardwarefreak.com> wrote:
>>> On 4/11/2012 11:50 AM, Ed W wrote:
>>>> One of the snags of md RAID1 vs RAID6 is the lack of checksumming in the
>>>> event of bad blocks.  (I'm not sure what actually happens when md
>>>> scrubbing finds a bad sector with raid1..?).  For low performance
>>>> requirements I have become paranoid and been using RAID6 vs RAID10,
>>>> filesystems with sector checksums seem attractive...
>>> Except we're using hardware RAID1 here and mdraid linear.  Thus the
>>> controller takes care of sector integrity.  RAID6 yields nothing over
>>> RAID10, except lower performance, and more usable space if more than 4
>>> drives are used.
>> How would the controller ensure sector integrity unless it is writing
>> additional checksum information to disk? I thought only a few
>> filesystems like ZFS do sector checksums to detect whether any data
>> corruption has occurred. I suppose the controller could throw an error
>> if the two drives returned data that didn't agree with each other, but
>> it wouldn't know which is the accurate copy, so that wouldn't protect
>> the integrity of the data - at least not directly, without additional
>> human intervention, I would think.
> When a drive starts throwing uncorrectable read errors, the controller
> faults the drive and tells you to replace it.  Good hardware RAID
> controllers are notorious for their penchant for kicking drives that
> would continue to work just fine in mdraid or as a single drive for
> many more years.  The mindset here is that anyone would rather spend
> $150-$2500 on a replacement drive than take a chance with his/her
> valuable data.
>

I'm asking a subtly different question.

The claim by the ZFS/BTRFS authors and others is that data silently "bit 
rots" on its own. The claim is therefore that you can have a RAID1 pair 
where neither drive reports a hardware failure, yet each gives you 
different data?  I can't personally claim to have observed this, so it 
remains someone else's theory...  (For background, my experience is 
simply: RAID10 for high-performance arrays and RAID6 for all my personal 
data - I intend to investigate your linear raid idea in the future though.)
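
For anyone wanting to test the bit-rot theory on their own arrays, md's 
"check" scrub is the closest thing I know of to an answer: it reads every 
member and bumps a mismatch counter when the two halves of a mirror 
disagree even though neither disk returned an error (my understanding is 
that "repair" then simply picks one copy, since md has no checksum to say 
which is right).  A minimal sketch, assuming a Linux box with an array at 
/dev/md0 (the array name is just an example) and root access:

    # Sketch only: trigger md's own scrub and read back its mismatch counter.
    # The paths are the standard md sysfs interface; "md0" is an assumed name.
    import pathlib, time

    md = pathlib.Path("/sys/block/md0/md")

    # Start a "check" pass: md reads all members and counts blocks where the
    # mirror copies differ without either disk reporting an I/O error.
    (md / "sync_action").write_text("check\n")

    # Wait for the pass to finish, then report the counter.
    while (md / "sync_action").read_text().strip() != "idle":
        time.sleep(60)
    print("mismatched sectors:", (md / "mismatch_cnt").read_text().strip())

Note that a non-zero count still doesn't tell you which copy is the good 
one - which is really the point the ZFS people are making.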

I do agree that if one drive reports a read error, then it's quite easy 
to guess which half of the pair is wrong...

Just as an aside, I don't have a lot of failure experience.  However, in 
the few failures I have seen (perhaps 6-8 events now) there has been a 
massive correlation in failure time with RAID1, e.g. one pair I had 
lasted perhaps 2 years and then both drives failed within 6 hours of 
each other. I also had a bad experience with a RAID5 array that wasn't 
being scrubbed regularly: when one drive started reporting errors (lack 
of monitoring meant it had been bad for a while), the rest of the array 
turned out to be a patchwork of read errors.  Linux md RAID turns out to 
be quite fragile in the presence of a small number of read failures, and 
it's extremely difficult to salvage the 99% of the array which is fine 
because the disks keep getting kicked out... (Of course regular scrubs 
would have prevented getting so deep into that situation - it was a 
small cheap NAS box without such features.)
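
Since the fix in that case would simply have been to scrub periodically, 
here is the sort of cron-driven pass the NAS box was missing - a sketch 
only, assuming the stock md sysfs layout; most distros ship their own 
equivalent (e.g. Debian's checkarray script), so treat this as an 
illustration rather than something to drop in as-is:

    # Sketch only: start a "check" pass on every md array; meant to be run
    # from cron (say, monthly) as root.  Array discovery via /sys/block/md*
    # follows the standard md sysfs layout.
    import glob, pathlib

    for sysdir in glob.glob("/sys/block/md*/md"):
        action = pathlib.Path(sysdir) / "sync_action"
        if action.read_text().strip() == "idle":   # don't interrupt a resync
            action.write_text("check\n")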

Ed W



