DBA’s tend to get pulled in lots of different directions in a company…performance tuning, database design, systems administration, networking, report writing; the list goes on and on. With all of the diverse tasks that a DBA must tend to every day, sometimes we forget the very core responsibility of a database administrator: protecting data. One of the fundamental aspects of protecting data is planning for disasters.
When most people think of disasters, they think large scale:
- Fire in the Data Center
- Widespread Power Outage
When it comes to disasters, there are many smaller scale problems that can cause an outage that are just as critical to the data that we’re responsible for:
- Server Blue Screen of Death
- Failed Disk Drive
- Corrupt Database or Corrupt Database Page
- SQL Injection Attacks
- The DELETE Clause When the User Forgot to Add a WHERE Clause (whoops, I just deleted all sales data!)
Microsoft SQL Server comes with a variety of high availability and disaster recovery technologies (database backups and restores, Clustering, Mirroring, Log Shipping, Replication, SQL 2012 AlwaysOn Availability Groups) to help us prepare for the worst. Each of these technologies helps us recover from specific disaster scenarios, and many of them can be combined to prepare for multiple scenarios.
Much thought and planning needs to be taken to develop a disaster recovery plan that meets the recoverability needs and budget of the organization. Here are some key questions to ask yourself and business users when planning a DR strategy:
- How critical is each database to the business?
- How long can the business survive without each database? 1 minute? 8 hours? 2 days?
- Where are database backups stored? What happens if this location is unavailable?
- When is the last time I tested my database backups to make sure they are valid?
- Do I know how to restore the system databases if the server needs rebuilt? When is the last time I practiced this?
- If I get hit by a bus, are there simple, easy to follow instructions that someone else can follow to get back online?
- Does every user really need SysAdmin or Database Owner privileges?!?
Having an outline of the recoverability requirements and their associated costs will drive the DR plan that you design. For example, failover clustering is more complex and costly than log shipping. But if your primary server has a blue screen of death, you will be back online much more quickly with less data loss with a failover cluster setup. Can your business afford a couple hours’ worth of downtime or lose 15-60 minutes of data?
The summary chart below can be used to help compare the available options along with relative costs:
Chart Adapted from Database Mirroring in SQL Server 2005, Table 15: Comparing SQL Server 2005, http://technet.microsoft.com/library/Cc917680?
A good DR plan will target each type of disaster and what steps need to be taken to best recover with the least impact to users and the least risk for data loss. Don’t wait until a disaster strikes to start building your DR plan. A real disaster can be a pretty stressful time. As the protectors of the data, we need to be cool, calm, and collected when facing disasters. Have a pre-established plan will help alleviate a lot of pressure so that you make the best decisions to recover data efficiently.
Thank you for reading my first RDX blog post! Stay tuned for more posts soon.