Thursday, September 27, 2007

5-minute recovery time objective for servers and data centers

Recovering a failed server is hard. It's so difficult that the industry practice is to over-provision spare capacity, so that when one or many servers fail, backup servers are available to pick up the slack while you spend the time required to re-rack, re-cable, restore, rebuild and ultimately recover the failed devices before the customer notices. In the event of a server failure, complex layers of clustering, load-balancing, fail-over, virtualization, and data replication are at the ready to provide the perception that all is well, when at best you're running in a degraded mode, and at worst you're one server away from going off-line.

Recovering a failed data center is hard. It's so difficult that the industry practice is to pre-provision a replica of your "production" data center at a secondary site, in active or stand-by mode, available to pick up the slack while you spend the time required to recover or replace your primary data center. In the event of a site failure, complex layers of clustering, load-balancing, fail-over, virtualization, and data replication are at the ready to provide the perception that all is well, when at best you're running in a degraded mode, and at worst you're one data center away from going off-line.

Servers do fail. Events do occur which require secondary site fail-over. The IT costs of keeping a business up and running are staggering both in terms of spend and in terms of efficiency of the deployed resources. A broad industry of availability solutions has evolved to compensate for the time and complexity associated with replacing a server - whether in the same location or from one site to another.

What would you change in the way you run your business if you could reliably and predictably replace a failed server along with its connection to network and storage in under 5 minutes (boot time)? What if you could replace an entire data center full of servers?

At a recent Disaster Recovery Journal trade show, we had the opportunity to speak with a wide array of industry experts who are tasked with keeping their business up and running when disaster strikes. Some leverage highly sophisticated IT strategies for always-on infrastructure, while others rebuild from backups when needed. Regardless of approach, they are all looking closely at managing the cost of delivering on their RTO (Recovery Time Objective).

When you can replace a server along with its connection to network and storage in 5 minutes, you find that the requirement for high availability overlay solutions disappears for broad categories of applications. For the remaining applications, the exposure while replacing a failed server in a high availability service is reduced to 5 minutes. Across the board, over-allocation of fail-over resources to service-specific silos is eliminated, and replacement devices are pooled and available at a moment's notice.

Extending 5-minute server replacement to data center fail-over creates some very interesting capabilities. For instance, servers added at one location can be automatically added in real time at a secondary location - keeping the two locations better in sync. Servers at the "DR" site may be turned off until needed, or re-deployed for more productive uses. One of our customers runs Dev/Test in their "fail-over" location with the ability to restore to a replica of their primary data center at the push of a button. Environments which previously included production, fail-over, development, test, and staging can condense down to two deployments - production and non-production (everything else).

What would you change if you could reliably and predictably replace a failed server along with its connection to network and storage in under 5 minutes? What if you could replace an entire data center full of servers? How would this impact your business continuity planning?

If you're wondering how we do it, please request our rapid failover and data center efficiency case studies from www.scalent.com/failover. Or simply give us a call; it's a conversation we love to have.

Brian Korn, Director, Marketing, September 2007

Copyright © 2007 Scalent Systems, Inc. All Rights Reserved.