[Dovecot] Better to use a single large storage server or multiple smaller for mdbox?

Stan Hoeppner stan at hardwarefreak.com
Sun Apr 15 03:05:19 EEST 2012


On 4/14/2012 5:00 AM, Ed W wrote:
> On 14/04/2012 04:48, Stan Hoeppner wrote:
>> On 4/13/2012 10:31 AM, Ed W wrote:
>>
>>> You mean those "answers" like:
>>>      "you need to read 'those' articles again"
>>>
>>> Referring to some unknown and hard to find previous emails is not the
>>> same as answering?
>> No, referring to this:
>>
>> On 4/12/2012 5:58 AM, Ed W wrote:
>>
>>> The claim by ZFS/BTRFS authors and others is that data silently "bit
>>> rots" on it's own.
>> Is it not a correct assumption that you read this in articles?  If you
>> read this in books, scrolls, or chiseled tablets, my apologies for
>> assuming it was articles.
>>
> 
> WHAT?!!  The original context was that you wanted me to learn some very
> specific thing that you accused me of misunderstanding, and then it
> turns out that the thing I'm supposed to learn comes from re-reading
> every email, every blog post, every video, every slashdot post, every
> wiki, every ... that mentions ZFS's reason for including end to end
> checksumming?!!

No, the original context was your town crier statement that the sky is
falling due to silent data corruption.  I pointed out that this is not
the case, currently, that most wouldn't see this until quite a few years
down the road.  I provided facts to back my statement, which you didn't
seem to grasp or comprehend.  I pointed this out and your top popped
with a cloud of steam.

> Please stop wasting our time and get specific

Whose time am I wasting, Ed?  You're the primary person on this list
who wastes everyone's time with these drawn out threads, usually
unrelated to Dovecot.  I have been plenty specific.  The problem is you
lack the knowledge and understanding of hardware communication.  You're
upset because I'm not pointing out the knowledge you seem to lack?  Is
that not a waste of everyone's time?  Is that not even "more
insulting"?  Causing even more excited/heated emails from you?

> You have taken my email which contained a specific question, been asked
> of you multiple times now and yet you insist on only answering
> irrelevant details with a pointed and personal dig on each answer.  The
> rudeness is unnecessary, and your evasiveness of answers does not fill
> me with confidence that you actually know the answer...

Ed, I have not been rude.  I've been attempting to prevent you dragging
us into the mud, which you've done, as you often do.  How specific would
you like me to get?  This is what you seem to be missing:

Drives perform per-sector CRC before transmitting data to the HBA.  ATA,
SATA, SCSI, SAS, and Fibre Channel devices and HBAs all perform CRC on
wire data.  PCIe links perform CRC on wire data (the older parallel PCI
and PCI-X buses use parity), as does the Southbridge.  HyperTransport and
Intel's proprietary links also perform CRC on wire transmissions.  Server
memory is protected by ECC, some by ChipKill, which can tolerate multi-bit
errors.
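
As a rough illustration of the per-hop CRC checking described above, here
is a minimal Python sketch.  It is only a model: real links compute CRCs in
hardware with their own polynomials and framing, and zlib.crc32 is used
here purely as a stand-in.  The function names are hypothetical.

```python
import zlib
from typing import Tuple


def send_with_crc(payload: bytes) -> Tuple[bytes, int]:
    """Sender side of one hop: attach a CRC-32 computed over the payload."""
    return payload, zlib.crc32(payload)


def receive_with_crc(payload: bytes, crc: int) -> bytes:
    """Receiver side of one hop: recompute the CRC and reject a mismatch.

    A real link would request a retransmit; this sketch just raises.
    """
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC mismatch on link, retransmit needed")
    return payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit error on the wire (illustration only)."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


if __name__ == "__main__":
    sector = bytes(range(256)) * 2          # a made-up 512-byte "sector"
    frame, crc = send_with_crc(sector)

    # Clean transfer: the receiver accepts the data.
    assert receive_with_crc(frame, crc) == sector

    # Corrupted transfer: the receiver detects the flipped bit.
    try:
        receive_with_crc(flip_bit(frame, 1234), crc)
    except ValueError as err:
        print("detected:", err)
```

The point of the sketch is that an error introduced on the wire is noticed
at the next hop rather than passed along silently.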

With today's systems and storage densities, with error correcting code
on all data paths within the system, and on the drives themselves,
"silent data corruption" is not an issue--in absence of defective
hardware or a bug, which are not relevant to the discussion.

> For the benefit of anyone reading this via email archives or whatever, I
> think the conclusion we have reached is that: modern systems are now a)
> a complex sum of pieces, any of which can cause an error to be injected,

Errors occur all the time.  And they're corrected nearly all of the
time, on modern complex systems.  Silent errors do not occur frequently,
usually not at all, on most modern systems.

> b) the level of error correction which was originally specified as being
> sufficient is now starting to be reached in real systems, 

FSVO 'real systems'.  The few occurrences of "silent data corruption"
I'm aware of have been documented in academic papers published by
researchers working at taxpayer-funded institutions.  In the case of
CERN, the problem was a firmware bug in the Western Digital drives that
caused an issue with the 3Ware controllers.  This kind of thing happens
when using COTS DIY hardware in the absence of proper load validation
testing.  So this case doesn't really fit the Henny-penny silent data
corruption scenario as a firmware bug caused it.  One that should have
been caught and corrected during testing.

In the other cases I'm aware of, all were HPC systems which generated
SDC under extended high loads, and these SDCs nearly all occurred
somewhere other than the storage systems--CPUs, RAM, interconnect, etc.
HPC apps tend to run the CPUs, interconnects, storage, etc, at full
bandwidth for hours at a time, across tens of thousands of nodes, so the
probability of SDC is much higher simply due to scale.
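
The scale argument can be put in numbers with a short sketch.  The
per-node probability below is a made-up placeholder, not a measured rate,
and the model assumes nodes fail independently; it only shows how the
chance of seeing at least one event grows with node count.

```python
def prob_at_least_one(p_per_node: float, nodes: int) -> float:
    """Chance that at least one of `nodes` independent nodes sees an event,

    given each node has probability `p_per_node` over the same period.
    """
    return 1.0 - (1.0 - p_per_node) ** nodes


if __name__ == "__main__":
    p = 1e-6  # hypothetical per-node probability, for illustration only
    for n in (1, 100, 50_000):
        print(f"{n:>6} nodes: {prob_at_least_one(p, n):.6f}")
```

With the same per-node figure, a single server and a cluster of tens of
thousands of nodes end up orders of magnitude apart.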

> possibly even
> consumer systems.  

Possibly?  If you're going to post pure conjecture why not say "possibly
even iPhones or Androids"?  There's no data to back either claim.  Stick
to the facts.

> There is no "solution", however, the first step is to
> enhance "detection".  Various solutions have been proposed, all increase
> cost, computation or have some disadvantage - however, one of the more
> promising detection mechanisms is an end to end checksum, which will
> then have the effect of augmenting ALL the steps in the chain, not just
> one specific step.  As of today, only a few filesystems offer this, roll
> on more adopting it

So after all the steam blowing, we're back to where we started.  I
disagree with your assertion that this is an issue that we--meaning
"average" users not possessing 1PB storage systems or massive
clusters--need to be worried about TODAY.  I gave sound reasons as to
why this is the case.  You've given us 'a couple of academic papers say
the sky is falling so I'm repeating the sky is falling'.  Without
apparently truly understanding the issue.

The data available and the experience of the vast majority of IT folks
backs my position--which is why that's my position.  There is little to
no data supporting your position.

I say this isn't going to be an issue for average users, if at all, for
a few years to come.  You say it's here now.  That's a fairly minor
point of disagreement to cause such a heated (on your part) lengthy
exchange.

BTW, if you see anything I've stated as rude you've apparently not been
on the Interwebs long. ;)

-- 
Stan

