Creating a SQL Server Disaster Recovery Plan – Part 2

Before I move on, there is one last issue with backups that I would like to cover: choosing someone to swap tapes. Once you have your plan in place, you should make a single person responsible for checking that backups took place and for swapping out tapes as needed. It is important to have one person do this because when multiple people share the responsibility you end up with: “I thought you swapped the tapes last night?!?” When a single person is responsible for backups, it becomes a routine for them. Now, if that one person is unable to swap tapes (e.g., they get sick), it is their responsibility to find another person to swap tapes for them. Although someone else may swap tapes now and then, accountability still rests with the one person who normally does it, not with a group of people. If the job is too big for one person, consider giving the responsibility for half the servers to one person and half to another (or however you would like to split them up). Additionally, if you can’t have the same person swapping tapes every day (e.g., you take backups on the weekend), make it clear who is responsible for which days and keep the days the same from week to week.
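
If it helps to make the "same days every week" rule concrete, here is a minimal Python sketch of a fixed weekly roster. The names Alice and Bob, the `ROSTER` table, and both helper functions are made up for illustration; the point is simply that every day of the week has exactly one owner and that the mapping does not change from week to week.

```python
from datetime import date

# Hypothetical roster: weekday number (Monday=0 ... Sunday=6) -> owner.
# Alice owns weekdays, Bob owns the weekend; the names are placeholders.
ROSTER = {
    0: "Alice", 1: "Alice", 2: "Alice", 3: "Alice", 4: "Alice",
    5: "Bob", 6: "Bob",
}


def responsible_for(day: date) -> str:
    """Return the one person accountable for swapping tapes on this day."""
    return ROSTER[day.weekday()]


def uncovered_days(roster: dict) -> list:
    """Return the weekday numbers that nobody is assigned to."""
    return [d for d in range(7) if d not in roster]


if __name__ == "__main__":
    print(responsible_for(date.today()))
    print("Days with no owner:", uncovered_days(ROSTER))
```

A stand-in who covers a sick day does not change the table: the person listed is still the one accountable for arranging the cover.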

So, having decided how often to back up, where to back up, and where to store our backups…what’s left in a disaster recovery plan? Well, quite a lot, but most of it is highly dependent on your environment. The first step is to document, document, and document. Get a folder and dedicate it to your disaster recovery plan. Here are some things that you should include in your plan:

  • Server Hardware Specifications
  • Network Layout
  • Server Software Configurations
  • Database File Layout (i.e. log files and data files)
  • Tape Labels, with a Description of the Backup and Rotation Schedule

The next step is to think about, and write down, what should happen if a failure occurs. When you write out the plan, assume that you are not on-site and are unable to come to the rescue. You should also assume that the person restoring the server has technical knowledge of SQL Server but knows nothing about your particular setup. Think about things like:

  • Who should be contacted if something goes wrong?
  • Where are the backups stored?
  • Where are the software and driver disks stored?
  • Are there any tech support numbers available?
  • If new hardware is required, what should be done?
  • Is there any other information that may be useful?
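
One way to keep yourself honest about these questions is to treat them as a checklist and flag whatever the plan leaves blank. The Python sketch below is only an illustration: the key names, the `unanswered` helper, and the sample answers in `draft_plan` are all invented, and the open-ended last question is left out because it has no fixed answer.

```python
# Questions from the list above, keyed by hypothetical short names.
REQUIRED_ITEMS = {
    "contacts": "Who should be contacted if something goes wrong?",
    "backup_location": "Where are the backups stored?",
    "software_and_drivers": "Where are the software and driver disks stored?",
    "support_numbers": "Are there any tech support numbers available?",
    "new_hardware_procedure": "If new hardware is required, what should be done?",
}


def unanswered(plan: dict) -> list:
    """Return the questions that the plan omits or leaves blank."""
    return [
        question
        for key, question in REQUIRED_ITEMS.items()
        if not str(plan.get(key, "")).strip()
    ]


# Made-up draft: two answers filled in, one left blank, two missing entirely.
draft_plan = {
    "contacts": "On-call DBA first, then the IT manager",
    "backup_location": "Fire safe in the server room",
    "software_and_drivers": "",
}

for question in unanswered(draft_plan):
    print("Missing:", question)
```

Someone who knows SQL Server but not your setup can only follow the answers that are actually written down, so an empty entry here is a gap in the plan itself.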

Once you have completed documenting what should happen if a disaster occurs, there is one final step that you must complete…testing your plan. Having a plan is not enough; you have to test whether your plan has all the necessary information, whether your backups work correctly, and whether everyone knows what to do. To do this, you should set up a fake disaster. Now, don’t go lighting your servers on fire (we all know how tempting that can be sometimes!), but use some extra hardware to test your plan. Don’t worry about getting exactly the same setup hardware-wise; you need just enough to run the services and any client applications. When testing, follow your disaster recovery plan and see whether all the information you need is in it. If you left anything out, or something was wrong, now is the time to make corrections and additions. By using the test hardware you not only get a feel for what information needs to be in your plan, but you are also able to test your backups by restoring the server from tape.