Friday 14 August 2015

DBA Training Week 2 - Backups

The Only Reason We Back UpThe only reason we do backups is so we can do restores.
Brent here. When I first started out as a SQL Server DBA, I thought things were going well as long as the backup jobs ran successfully. I'd go into SQL Server Agent every now and then, make sure they were still running, and ... that was the end of it. I figured if disaster ever struck, I'd just do a restore. How hard could it be?

In theory, we test our backup strategy ahead of time with Kendra's 5 Simple Questions About Your Backups, and we've memorized the 9 Letters that Get DBAs Fired.
In practice, small disasters strike all the time. The most common reasons to do restores aren't to revive an entire server - it's just to get back a few small tables or an individual database. Somebody ran the wrong DELETE statement or dropped a database in production instead of development, and next thing you know, we're all scrambling. Let's think through a few things ahead of time to make the crisis easier.
Where to Do Restores
When you're restoring code (stored procedures, views, triggers, etc) or individual tables, don't restore onto the production server. I don't like touching production servers more than I have to, and let's face it - you're already having a bad enough day as it is. That's why you're doing a restore, remember? So let's do our work on a different server (like dev or QA) and leave production as it is. I've also written about restores in my ideal dev, test, and production environments.
After we've safely restored the right data onto another server, it's easy to copy that data across to other servers. For simplicity and security, you can set up a linked server on the production box with read-only access over to the restore server. Then, from production, you can run INSERT statements using a SELECT sourced from the linked server tables.
However, if you're restoring tables over 10GB, you'll probably want to do the restores directly on the production server to make the data copies faster. Just make sure you're extremely careful with the scripting and the database names - we don't want to restore over the top of your working production database.
This may require adding extra space to the production server. In one emergency, I freed up the necessary space by shrinking all of TempDB's data and log files down to just 1MB. TempDB was on fast drives, perfect for a one-time emergency restore, and that particular server didn't have any other activity happening due to the outage. We're not always so lucky, but it helps to think out of the box like that.
A word of warning: if referential integrity is involved, like if you're trying to restore tables that have relationships to other tables that you're NOT restoring, then you can be in for a world of hurt here. We're not going to cover that scenario - it really is different in each case.
Doing the Restore
Restore the most recent full backup using the WITH NORECOVERY option - this is really important. This leaves the database in a restoring state so that you can continue to apply additional backups to it. If you forget those two key words, your restore has to start over again from scratch, so please, for the love of all that's holy, double-check that option before you start the restore.
When I'm restoring code or config tables that haven't changed since the last full backups, I don't bother restoring any subsequent differential backups or transaction log backups. The goal is to finish the restore quickly.
Next, if differential backups are involved, restore the most recent differential WITH NORECOVERY. Differential backups are cumulative - you only have to restore the most recent one.
Next, restore all of the transaction log backups after the differential (or, if you don't have diffs, all of them after the full backup) - again, using WITH NORECOVERY.
Doing all of this with the GUI sucks. The more backups you have, the longer this takes, and the more likely you are to run into errors. Instead, what you need is a script that looks at all of the backups in a folder, plucks out the most recent relevant files, and restores them for you automatically, in order. We'll talk about automating restores in the next training module.

To learn more about backups and restores, our favorite getting-started articles are:
And if you want even more detail, check out our video classes:
This week, use coupon code BackItUp for $10 off either course.

Enjoy, and see you next week!

Brent says: I did my first emergency restore when I was working for a photo studio. I'd dropped out of college, and I took a job running their databases. Every morning, I got in bright and early to print out the list of labels for the high school graduates, weddings, babies, and so on that were going to be photographed that day. The photographers would pick up their stacks of labels for their film (FILM! remember that?) and head out into the field.

One morning, as part of my data cleanup process, I ran the DELETE statement. One minor problem - I'd forgotten to put in the WHERE clause, so I deleted all of the photo shoots, ever.

Thankfully, I was also the guy in charge of backups, so after I freaked out, I put last night's tape backups in and started the restores. When the boss came in to pick up his stack of labels, I was able to calmly explain what had happened and what I was doing to fix it.

Orville's temper was legendary, but I dodged a bullet and kept my job. Since then, whenever I handle a database, the very first question I ask is, "Where are the backups, and when was the last time they were tested?" Either the backups are tested, or you're about to be.

Missed a week? Here's what we covered so far:

1: Where My Servers At?


No comments:

Post a Comment