On 3/15/2012 5:51 AM, Charles Marcus wrote:
> On 2012-03-01 8:38 PM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>> Get yourself a qualified network architect. Pay for a full network traffic analysis. He'll attach sniffers at multiple points in your network to gather traffic/error/etc data. Then you'll discuss the new office and which employees/types will move there, and you'll know almost precisely the average and peak bandwidth needs over the MAN link. He'll very likely tell you the same thing I have: that a single gigabit MAN link is plenty. If you hire him to do the work, he'll program the proper QoS setup to match the traffic patterns gleaned from the sniffers.
> Finally had time to properly review your answers here Stan.
> The time you took for the in-depth reply is very much appreciated.
Multi-site setups can be tricky: they often tempt folks into doing unnecessary things they otherwise would not. Just trying to help keep your sails pointed in the right direction. :) The #1 rule when building a multi-site network: only duplicate hardware and services at the remote site(s) when absolutely necessary.
> I'm sure you got a kick out of the level of my ignorance... ;)
Not at all. I'm sure there is some subject or another where you would demonstrate my ignorance. From another perspective, if there was no ignorance left on the planet then there would be nothing left for anyone to learn. That would make for a boring world.
> As for hiring a network architect, I will absolutely be doing as you recommend (was already planning on it), but with the information I'm now armed with, at least I'll have a better chance of knowing if they know what they are doing/talking about...
Now that you are aware of network analysis using sniffers, allow me to throw you a curve ball. For a network of your size, less than 70 users IIRC, with a typical application mix but with SMB/NFS traffic/file sizes a little above 'average', a qualified engineer probably won't need to plug sniffers into your network to determine the size of the MAN pipe and the traffic shaping you'll need. He'll have already done a near identical setup dozens of times. The good news is this saves you a few grand. Analysis with sniffers ain't cheap, even for small networks. And sniffers are normally only deployed to identify the cause of network problems, not very often for architectural or capacity planning. But asking him about doing a full analysis using sniffers, and hearing his response, may lead to a valuable discussion nonetheless.
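To illustrate the kind of arithmetic that engineer will do in his head, here's a hedged back-of-envelope sketch. The user count matches the ~70 figure above, but the concurrency, per-user demand, and headroom factor are assumptions for illustration, not measurements from any real network:

```python
# Back-of-envelope MAN link sizing. All inputs are illustrative
# assumptions; a real engineer substitutes measured figures.

def peak_mbps(users, concurrency, per_user_mbps, headroom=1.5):
    """Estimate peak link demand in Mb/s.

    users          -- total users at the remote site
    concurrency    -- fraction active on the link at the busiest moment
    per_user_mbps  -- average demand per active user (SMB/NFS heavy)
    headroom       -- safety multiplier for bursts
    """
    return users * concurrency * per_user_mbps * headroom

# ~70 users, 40% concurrently active, 5 Mb/s each for file-heavy work:
demand = peak_mbps(70, 0.4, 5.0)
print(f"Estimated peak demand: {demand:.0f} Mb/s")
```

Under these assumed numbers the peak lands around 210 Mb/s: comfortable on a single GbE MAN link, but well beyond a 100 Mb/s fast ethernet link, which is the shape of the conclusion above.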
Have your MAN and internet providers' (if not the same company) pricing sheet(s) in hand when you meet with the engineer. Depending on fast ethernet MAN, GbE MAN, and internet pipe pricing, he may have some compelling options/recommendations for you, possibly quite different, less costly, and more redundant than what you have been considering up to this point.
> I'm still planning for the two physical servers (one at each location),
Again, if you don't _need_ hardware and services at the 2nd site to achieve the current service level at the primary site, do not add these things to the 2nd site. I really want to put a bunch of exclamation points here but I hate exclamation points in technical emails--actually I just hate them, period. ;)
> but you have convinced me that trying to run two live mail systems is an unnecessary and even unwanted level of complexity.
Running an active/active Dovecot cluster isn't necessarily an unnecessary or unwanted level of complexity. The need for clustering should go through a justification process just like anything else: what's the benefit, what's the total 'cost', what's the ROI, etc. Lots of people here do active/active clustering every day with great success. Connecting the cluster nodes over a MAN link, however, does introduce unnecessary complexity; locating one node in another building many blocks away is unnecessary. Putting the nodes in the same rack/room is smart and easily accomplished in your environment: it gives you the redundancy above, but without the potentially problematic MAN link as the cluster interconnect. Granted, you'll need to build two new (preferably identical) systems from scratch and set up shared storage (DRBD or a SAN array) with GFS2 or OCFS2, etc. Given your environment, there are only two valid reasons for locating equipment and duplicating data and services at a remote site:
1. Unrecoverable network failure (due to a single MAN link)
2. Unrecoverable primary site failure (natural or man-made disaster)

#1 is taken care of by redundant MAN links. #2 you've never planned for to date (the probability is *low*), and it would require duplicating _everything_ at the remote site.
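The effect of redundant links in #1 can be shown with simple probability arithmetic. The per-link availability figure below is an assumption for illustration, and so is the independence assumption (links sharing a conduit or provider can fail together):

```python
# Probability that *both* MAN links are down at once, assuming
# independent failures (an assumption -- links that share a conduit
# or an upstream provider can fail together).

link_availability = 0.999          # assumed: ~8.8 hours down/year per link
p_down = 1 - link_availability     # single-link outage probability

both_down = p_down ** 2            # simultaneous failure, if independent
combined_availability = 1 - both_down

hours_per_year = 365 * 24
print(f"Single link downtime: {p_down * hours_per_year:.1f} h/yr")
print(f"Both links down:      {both_down * hours_per_year:.4f} h/yr")
```

Under that assumption, dual links turn roughly 8.8 hours of expected outage per year into well under a minute, which is the whole argument for duplicating links rather than servers.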
Duplicating servers for higher user throughput/lower latency to/from servers isn't a valid reason for remote-site duplication in your case, because you can afford plenty of bandwidth and link redundancy between the sites. The relatively low cost and high bandwidth of the MAN link outweigh any benefit of service replication, given the latter's complexity.
Here are some other 'rules':
- Don't duplicate servers at remote sites to mitigate network link failure when sites are close and redundant bandwidth is affordable
- Do duplicate network links to mitigate link failure when sites are close and bandwidth is affordable
- Implement and test a true disaster avoidance and recovery plan
> The DC VM will still be hot (it is always best to have two DCs in a Windows domain environment anyway), so I'll get automatic real-time off-site backup of all of the users' data (since it will all be on DFS), but for the mail services, I'll just designate one as live and one as the hot standby that is kept in sync using dsync. This way I'll automatically get off-site backup for each site for the users' data stored in the DFS, and have a second mail system ready to go if something happens to the primary.
Again, you're not looking at this network design from the proper perspective. See rules 1-3 above.
Off-site backups/replication are used exclusively to mitigate data loss due to catastrophic facility failure, not server failure, enabling rapid system recovery once new equipment has arrived. Many business insurers have catastrophic IT equipment replacement plans and relationships with the big 5 hardware vendors, enabling you to get new equipment racked and begin your restore from offsite tape within as little as 24 hours of notification.
Think of how FEMA stages emergency supplies all around the country. Now think 10 times better, faster. Such services increase your premiums, but if you're serious about disaster avoidance and recovery, this is the only way to go. IBM, HP, maybe Dell, Sun (used to anyway), have dedicated account reps for disaster recovery. They work with you to keep an inventory of all of your systems and storage. Your records are constantly updated when your products are EOL'd or superseded or you replace or add hardware, and a list is maintained of current hardware best matched to replace all of your now burned, flooded, tornado shredded, hurricane blasted equipment, right down to bare metal restore capability, if possible/applicable.
You plan to replicate filesystem user data and mailbox data to a 2nd site to mitigate single server failures. Why does that need to be done to an offsite location/system? It doesn't. There is no benefit whatsoever. You can accomplish this in the same rack/room and get by with a smaller MAN pipe saving time, money, and administrative burden. The restore procedure will be faster if all machines are in the same rack/room and you're using tape, and you won't slow users down with restore traffic going over the MAN link.
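For the dsync-fed standby raised in the quoted plan, the sync can be driven by a small wrapper whether the standby sits in the same rack (as argued above) or elsewhere. This is a hedged sketch: the exact dsync invocation has changed between Dovecot releases, so check your version's man page, and the user list and standby hostname here are placeholders, not real values:

```python
import subprocess

STANDBY = "standby.example.com"   # placeholder hostname
USERS = ["alice", "bob"]          # placeholder; enumerate your real users

def build_dsync_cmd(user, standby):
    """One-way sync of a user's mailbox to the standby node.

    'dsync backup' over ssh is the Dovecot 2.x idiom; verify the
    flags against your installed version before relying on this.
    """
    return ["dsync", "-u", user, "backup",
            "ssh", standby, "dsync", "-u", user]

def sync_all(users, standby):
    """Run the backup sync for each user, stopping on the first failure."""
    for user in users:
        subprocess.run(build_dsync_cmd(user, standby), check=True)
```

Scheduling `sync_all(USERS, STANDBY)` from cron (or, in later Dovecot versions, using the built-in replicator) keeps the standby's mail store trailing the live node.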
If you really want off-site backup, for what it's meant to accomplish, get a network attached tape library/silo, or a speedy high-cap LTO-4/5 tape drive in each server, put a real backup rotation and restore plan in place, and store the backup tapes in a secure facility. A remote "hot site" is great when it's in a different city, better yet a different region, or in a hardened facility in any locale. Your hot site is only a few blocks away. If your primary site is taken out by anything other than fire, such as a tornado, earthquake, or hurricane (the latter being more likely in your case), chances are your hot site will go down soon after the primary. If you want/need a real off-site backup solution, rotate tapes to an everything-proof facility. Here are 3 companies in the Atlanta area that offer media rotation storage services. Watch the Offsite Tape Vaulting video at IronMountain:
http://www.ironmountain.com/Knowledge-Center/Reference-Library/View-by-Docum...
http://www.askads.net/media-rotation/
http://www.adamsdatamanagement.com/tape-rotation-atlanta-ga.htm
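A "real backup rotation plan" usually means something like grandfather-father-son. Here's a minimal sketch of classifying each calendar day into that scheme; the tier boundaries (monthlies off site, Friday fulls, weekday incrementals) are one common illustrative layout, not a prescription:

```python
import datetime

def gfs_tier(day: datetime.date) -> str:
    """Classify a backup date into a grandfather-father-son tier.

    - 'grandfather': first day of the month (monthly full, vaulted off site)
    - 'father':      Fridays (weekly full, shorter retention)
    - 'son':         other weekdays (daily incremental, reused soonest)
    - 'skip':        weekends, in this illustrative scheme
    """
    if day.day == 1:
        return "grandfather"
    if day.weekday() == 4:          # Friday
        return "father"
    if day.weekday() < 5:           # Monday..Thursday
        return "son"
    return "skip"

# Example: classify one week starting Monday 2012-03-12
start = datetime.date(2012, 3, 12)
for i in range(7):
    d = start + datetime.timedelta(days=i)
    print(d, gfs_tier(d))
```

The tapes the media-rotation services above shuttle to the vault would be the 'grandfather' (and possibly 'father') tier; the 'son' tapes stay on site for quick restores.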
> Again, thanks Stan... I am constantly amazed at the level of expertise and quality of advice available *for free* in the open source world, as is available on these lists.
Always glad to assist my brethren in this digital kingdom. Whichever architecture/topology you choose, remote replicated systems or not, I hope my input has given you some good information on which to base your decisions.
-- Stan