Mistakes II: Backup Hell

Crossed fingers image - not a valid backup strategy
Not a good backup plan

Everyone pays lip service to backups, but I find that not too many people are actually doing them, and most of the ones that are do it poorly. Sounds like the high school locker room, actually, but I digress. The issue is simple, too:
We don’t care about backups; we care about restores.

The problem

Many backup systems are designed to make copies of data, without a real understanding of what it takes to use that data when the time comes.

Let me show you the common example: Tape drive backups. People love tapes because they can “Take a copy home with them”, “Just in case”. Great. Now flash forward to a meteor hitting your office.

Meteor
“It’s comin’ right for us!”

Nobody was injured, thankfully, but on top of dealing with the smoking hole where your office was, you’re scrounging for a tape drive to restore from. Let’s hope that this drive a) is available, b) works, c) can be shipped in a day, and d) can read your tapes (You did try restoring from them and swap them out after a hundred or so uses, right?).

Now, even if you had a slightly better setup, like using an automated off-site backup solution (Vault Logix, Mozy, Carbonite, etc.) or just had the foresight to buy two drives originally, you also need to ensure that you have ready-to-install copies of all the software required to read your data. This is easy enough for QuickBooks, but can get hairy for something that uses SQL Server to store the data.

Also, think about the 90th percentile situation: Someone deleted or changed something and it was a mistake. How long does it take for you to get that data back?

Tape drive
This has “bad day” written all over it

Yes, it really does have to rewind the whole tape and go through it from beginning to end to find the file you’re looking for. Yes, it’s not fast. Yes, I’ve seen a 5-hour restore for a 2KB file. Most of us have better things to do.

Beer mug
Hey, look. A better thing to do!

Also, I’m finding that a lot of small companies only have the one backup system. Now, if this was some type of crazy, super reliable robot capable of Watson-like deductive skills, or said “EMC” on the front, then sure. However, at most places, what’s in charge of the backup system is made of meat and tends to forget things on occasion.

The solution

1: Figure out what you need to run your business

Take inventory of what software and data you need to run your business. Make sure you include the major server software like Exchange, and the little programs everyone uses to do their jobs. Keep this to a minimum, but be thorough.

2: Figure out what hardware you need to run all that stuff

You don’t need to match server-for-server or desktop-for-desktop, but if you need a label printer to meet your shipping demands, you should note that. Again, minimums here.

3: Back it up.

You’re going to tier your backups so that you make sure you never lose anything.

First off, let’s deal with the critical data that you need to run your business. This is the data you are going to back up most often. Here’s how to do it:

  • Tier 0: Not really a backup tier, but critical. Can you install software on your computer? Can everyone? Can you write to all the file locations on the file server? Can everyone? Ya, you should probably stop that. Permissions aren’t there just to thwart unwarranted access; they also contain damage and remind you that no, you probably shouldn’t be doing that.
  • Tier 1: Using Windows or Linux? Use Shadow Copy or Snapshots to have an easy-to-get backup of everything on a disk. With minimal configuration, Windows will do this twice a day and give you 30 days back. More importantly, your users can do this without any wait! They can just get their own backups the next time they overwrite the QuickBooks file with pictures of kittens.
  • Tier 2: Backup Software. We like App Assure over here for backups mostly because it’s designed to work the way it should. You tell it what to back up and it does it every 15 minutes. Regardless of which software you use, you just pick the places that need backing up, give it a place to back up to, tell it when to back up, and tell it who to mail/call/text/IM/smoke signal when things break. Then you test the hell out of it. First test: unplug the drive you’re backing up to or otherwise make it unavailable… did you get the notification? About 2 days in, you really want to do a full restore someplace (use an external drive if you can) and see if it came back correctly. Random sampling is your friend here.
  • Tier 3: Off-site. Don’t take a disk home every night. Get an off-site backup system. Some are better than others, though, so make sure it whines loud enough for you to hear (metaphorically speaking) when it breaks. How do you do that? Break it! Have it back up an external drive and then unplug the drive… see whether your software alerts you and how it alerts you.
  • Tier 4: Disaster Recovery Off-site: This one is optional, but if you can’t go a couple of days without being online, or don’t have time to rebuild machines, it’s recommended. What you’re doing is making duplicates of your actual machines in the cloud that are available if the originals go down. This makes it trivial for you to come back from a remote location after your office “goes away”, with only a few changes to the way the network is set up.
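
The "random sampling" test from Tier 2 is easy to script. What follows is only a rough sketch, not part of any backup product: it assumes you have already restored to a scratch location, and it compares SHA-256 hashes for a random handful of files against the live copies. The function names and the default sample size are made up for illustration. Files that legitimately changed after the backup ran will also show up as mismatches, so treat the output as a list of things to go look at.

```python
import hashlib
import random
from pathlib import Path
from typing import List, Optional


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def spot_check_restore(
    source: Path,
    restored: Path,
    sample_size: int = 20,  # illustrative default, not a recommendation
    seed: Optional[int] = None,
) -> List[str]:
    """Compare a random sample of files under `source` with their restored copies.

    Returns the relative paths (sorted) that are missing from `restored`
    or whose contents differ.
    """
    files = sorted(p.relative_to(source) for p in source.rglob("*") if p.is_file())
    rng = random.Random(seed)
    sample = rng.sample(files, min(sample_size, len(files)))

    problems = []
    for rel in sample:
        copy = restored / rel
        if not copy.is_file() or sha256_of(copy) != sha256_of(source / rel):
            problems.append(rel.as_posix())
    return sorted(problems)
```

An empty list means every sampled file came back intact. Anything else is worth finding out about now rather than after the meteor.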

So where do tapes fit into this? Do you have more than 100TB of data to back up? No? Then they don’t. Use disk.

OK, so now what do we do with the rest of your data?

  • Data that seldom changes: This covers things like install disks for software, archived e-mails, reference material, and old company pictures. Back this up any way you like. I recommend an external disk of about 1-4TB in size. Copy all that stuff onto the disk and possibly make a second copy. If you have sensitive data on there, encrypt it and try not to forget the password. Take this off-site to safe locations. Refresh this copy every 6 months.
  • Machine Images: You will eventually need to recover some machines to get things working. This is a lot easier if you have backups of whole machines in a format that doesn’t need a lot of hoopla to restore. Acronis, Windows 2008-2016 backup, Windows 7 backups (still available in 10) and disk dumps are pretty common formats. With these we don’t need to spend 2 hours installing an OS before recovering the machine. These need to be updated fairly often, so it’s best to put them in with your regular backups or to use some software to move them off-site regularly. If you can actually remember to, combine these with the previous data and grab a monthly copy to go off-site.
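
Since the thing in charge of remembering these refreshes is usually made of meat, a small staleness check can help. This is a minimal sketch under some assumptions: the copies live in a folder you can reach from a script, and file modification times are a fair stand-in for "when was this last refreshed". The function names are hypothetical.

```python
import time
from pathlib import Path
from typing import Optional


def newest_mtime(folder: Path) -> Optional[float]:
    """Return the latest modification time of any file under `folder`, or None if it holds no files."""
    times = [p.stat().st_mtime for p in folder.rglob("*") if p.is_file()]
    return max(times) if times else None


def is_stale(folder: Path, max_age_days: float, now: Optional[float] = None) -> bool:
    """True if `folder` is missing, holds no files, or its newest file is older than `max_age_days`."""
    if not folder.is_dir():
        return True
    newest = newest_mtime(folder)
    if newest is None:
        return True
    current = time.time() if now is None else now
    return (current - newest) > max_age_days * 86400  # 86400 seconds per day
```

For the six-month archive you might call something like `is_stale(Path("/mnt/archive"), max_age_days=183)`, where the path is a placeholder. For machine images you would pick a much shorter window. Have whatever scheduler you already use run it and nag you when it returns `True`.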

4: Restore it

No rocket science here, but it is the step that’s rarely done. For each scenario we mentioned here, restore and make sure it works. You should write down what you needed to do and any gotchas in the process. Yes, this does mean everything from using “Previous Versions” on Windows, to rebuilding your entire system off-site and trying to place and deliver an order (probably fake ones for testing), but that’s how you know you won’t be doomed when the time comes.

Conclusions

  • Concentrate on restoring data, not backing it up.
  • Figure out what you need to back up to keep things running.
  • Have multiple backup tiers that overlap to prevent a single failure from causing you huge problems.
  • Make sure you have all the software and hardware available off-site so that if you need to restore, you don’t have to wait for parts or software to come in.
  • Use automated solutions to back things up whenever possible.
  • Watch those backups to make sure they work.
