On 4/7/12, Stan Hoeppner stan@hardwarefreak.com wrote:
Firstly, thanks for the comprehensive reply. :)
I'll assume "networked storage nodes" means NFS, not FC/iSCSI SAN, in which case you'd have said "SAN".
I haven't decided on that but it would either be NFS or iSCSI over Gigabit. I don't exactly get a big budget for this. iSCSI because I planned to do md/mpath over two separate switches so that if one switch explodes, the email service would still work.
Less complexity and cost is always better. CPU throughput isn't a factor in mail workloads--it's all about IO latency. A 1U NFS server with 12 drive JBOD is faster, cheaper, easier to setup and manage, sucks less juice and dissipates less heat than 4 1U servers each w/ 4 drives.
My worry is that if that one server dies, everything is dead. With at least a pair of servers, I could keep it running, or if necessary, restore the accounts on the dead servers from backup, make some config changes and have everything back running while waiting for replacement hardware.
I don't recall seeing your user load or IOPS requirements so I'm making some educated guesses WRT your required performance and total storage.
I'm embarrassed to admit I don't have hard numbers on the user load except the rapidly dwindling disk space count and the fact when the web-based mail application try to list and check disk quota, it can bring the servers to a crawl. My lame excuse is that I'm just the web dev who got caught holding the server admin potato.
is nearly irrelevant for a mail workload, you can see it's much cheaper to scale capacity and IOPS with a single node w/fat storage than with skinny nodes w/thin storage. Ok, so here's the baseline config I threw together:
One of my concern is that heavy IO on the same server slow the overall performance even though the theoretical IOPS of the total drives are the same on 1 and on X servers. Right now, the servers are usually screeching to a halt, to the point of even locking out SSH access due to IOWait sending the load in top to triple digits.
Some host failure redundancy is about all you'd gain from the farm setup. Dovecot shouldn't barf due to one NFS node being down, only hiccup. I.e. only imap process accessing files on the downed node would have trouble.
But if I only have one big storage node and that went down, Dovecot would barf wouldn't it? Or would the mdbox format mean Dovecot would still use the local storage, just that users can't access the offloaded messages?
Also, I could possibly arrange them in a sort of network raid 1 to gain redundancy over single machine failure.
Now you're sounding like Charles Marcus, but worse. ;) Stay where you are, and brush your hair away from your forehead. I'm coming over with my branding iron that says "K.I.S.S"
Lol, I have no idea who Charles is, but I always feel safer if there was some kind of backup. Especially since I don't have the time to dedicate myself to server administration, by the time I notice something is bad, it might be too late for anything but the backup.
Of course management and clients don't agree with me since backup/redundancy costs money. :)