This went to me only, so I'm bringing it back on list.
On 1/24/2014 11:09 AM, Tom Johnson wrote:
Is anybody using the Object Storage plugin for large-scale installations?
I've not used it.
We're considering it, but are thinking of an in-house S3 storage system (Riak, or Ceph, or ?). We're looking to support perhaps 300k users. I was thinking that if we use a bank of Dovecot servers (with director) with SSDs as cache, we might be able to consolidate all the storage on something like a Riak cluster, which would make scaling simple and inexpensive, certainly much less expensive than a NetApp solution.
Everything costs less than a NetApp...except an EMC.
If anyone has any first-hand experience (or even off-the-top-of-their-head thoughts), I'd love to hear them.
Distributed filesystems give you the advantage of a single filesystem namespace with massive amounts of storage, fairly easy addition of storage space, and distributed replication to allow failure of a storage node without service interruption.
Replication mitigates node failure but not disk failure, so you still need RAID in each node. So you have RAID6 within a node and filesystem block mirroring among nodes. Storage utilization is therefore -worse- than RAID10 on direct attach, CFS on SAN, or an NFS head, and far worse than RAID6 in those three setups. And if you're using a large SSD cache you'd surely use RAID6 with DAS, CFS, or NFS, in which case you'd need half as many disk drives as with a DFS.
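To put rough numbers on the "half as many drives" point, here is a back-of-the-envelope sketch. The 12-disk arrays, 4 TB drives, 100 TB target, and 2-way mirroring between nodes are assumptions picked purely for illustration, not figures from any real deployment, and the function names are made up for this example.

```python
from fractions import Fraction
import math


def raid6_usable(disks: int) -> Fraction:
    """Usable fraction of a RAID6 array: two disks' worth of parity."""
    if disks < 4:
        raise ValueError("RAID6 needs at least 4 disks")
    return Fraction(disks - 2, disks)


def raid10_usable() -> Fraction:
    """Usable fraction of RAID10: every block is stored twice."""
    return Fraction(1, 2)


def dfs_usable(disks_per_node: int, replicas: int = 2) -> Fraction:
    """RAID6 inside each node, then each block mirrored across `replicas` nodes."""
    return raid6_usable(disks_per_node) / replicas


def drives_needed(usable_tb: int, drive_tb: int, usable_fraction: Fraction) -> int:
    """Raw drive count for a usable-capacity target (ignores whole-array rounding)."""
    return math.ceil(Fraction(usable_tb) / (drive_tb * usable_fraction))


if __name__ == "__main__":
    # Assumed example values: 12 disks per array, 4 TB drives, 100 TB usable.
    layouts = {
        "RAID6 on DAS/CFS/NFS": raid6_usable(12),
        "RAID10 on DAS/CFS/NFS": raid10_usable(),
        "DFS (RAID6 + 2-way mirror)": dfs_usable(12),
    }
    for name, fraction in layouts.items():
        print(name, float(fraction), drives_needed(100, 4, fraction))
```

Whatever array size you plug in, the DFS layout comes out at exactly half the usable fraction of plain RAID6, and always below the 50% you get from RAID10.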
Each DFS expansion, assuming the typical model, entails the cost of a server, a RAID HBA (unless using md), and disks, rather than just the disks as with DAS, CFS/SAN, or an NFS filer. You also need more switch ports, more power connections, and greater UPS capacity to cover all the CPUs, RAM, etc. in the nodes. And you'll have a higher electric bill.
So while a distributed filesystem storage architecture may seem less expensive, it may not be. And just as one can build a DIY DFS cluster, one can also build a DIY NFS cluster instead of buying a NetApp. That saves significant cash on the front-end box and on disks, since you'd need half as many as with a distributed filesystem architecture, though failure of one node may not be quite as graceful as a NetApp losing a controller board.
-- Stan