Thinking “Out Cloud”, World Backup Day & Declaring 4/1 to be World Recovery Day

I went searching through the backups I have (yes, I have backups) of my old presentations from 1997-2002, when I was running a consulting practice, and found a couple of slides I will use as the basis of this blog post.

Both of these presentations were from 1998. I changed my approach from talking about Backup and Recovery to talking about Recovery and Backup. Too many of my clients were focused on how fast the backup could be completed, rather than on the quality of the restores. So I made it my personal mission to force people to think differently.

Chapa Slide1.2

Nuggets of Wisdom From the Chapa Vault

My first job as a backup administrator was in 1986. I was hired as the nighttime backup operator for an advertising agency in Venice Beach, California. My shift started at 4pm, backup started at 6pm, and my shift ended when the last backup was successfully completed. As a young backup admin, I had no idea of the importance of the job I was performing. What I did know was that I had to sling some 9-track tapes, some 288 disk packs, and fill out the paperwork for data to be shipped offsite the next morning. It wasn’t until there was a system crash, around 4:30pm one day, that I really understood how important this job was to my company.

That day, I spent nearly 12 hours (until 3am) recovering data, not just from the last backup, but from several of the most recent backups. While I was performing backups religiously for the company, we didn’t have a plan or strategy to test our restores. Despite that, on this particular evening and into the next morning, we tested a lot of the backups. Fortunately, we only lost a couple of days of data, but the experience got me thinking that backup is blind if you don’t test the restores. Ultimately, this realization led me to create this slide.

Chapa Slide 3

A Prudent & Profound Proclamation

 

I applaud the idea of World Backup Day, which was originally formed as an independent initiative to inspire the public to make regular backups of their data. But let’s not be fooled into thinking that, just because we take the pledge, we will be able to restore the data we pledged to back up.

So if March 31st is World Backup Day, then I declare April 1st to be World Restore Day! What better day than April Fools’ Day to find out if your data is really and truly recoverable?

Too often we place way too much trust in the technology around us to work flawlessly. I found this to be true with one of my clients – a very large, multi-national telecommunications company – from my consulting days. I had just completed re-architecting its backup infrastructure, which included the delivery of an Operations Guide that firmly recommended regular recovery tests from the tape backup system.

After the client had run this system for nearly a year, with no problems whatsoever, I received a frantic phone call from the Director of IT. An Exchange server had crashed hard, and they could not recover the data from the backup tapes. When I arrived on-site, I asked all the usual questions about regular testing and found out that the practice had continued for only three weeks after I turned the environment over to the team. Unfortunately, besides calling me to help, the client had also called a handful of other vendors, who were all instructed to “go find the needle in the haystack” that caused these system issues.

I began with the backup system. After reviewing some logs, I was able to identify which tape drive the tape had been written on and was ready to manually load it into that drive to recover the data. The problem was a hardware issue: the tape drive heads were just enough out of calibration to make it virtually impossible for any other tape drive to read the data. Frustratingly, the tape library vendor had already replaced all 12 drives with brand new drives, and as I expected, the data still couldn’t be recovered. There was not much more left for me to do, other than write up my report and submit it to client management for review. Finally, after 12 days, a third-party company that specializes in recovering data from failed media restored the data.

Fast forward to today, and we still have issues with successful recovery – even when cloud backup is being used. Some solutions only provide the backup portion, but do nothing to ensure the reliability and recoverability of the data that is under management in the cloud. Some providers leave that up to the customer to test. This is something to consider on “World Backup Day”, when we raise awareness around protecting critical information. But what about getting it back again?

So, make the pledge on World Backup Day to protect your data, but don’t be a fool tomorrow by ignoring the pledge to test your recovery. Better yet, why not quiz your cloud backup provider about how they intend to make your data available when you need it?
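If you want a starting point for that recovery test, here is a minimal sketch in Python. It assumes you have already restored a backup into a scratch directory and still have the original data to compare against; the function names `sha256_of` and `verify_restore` are my own illustration, not part of any backup product, and a real test would also need to cover databases, permissions and application-level checks.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Compare every file under source_dir with its restored copy.

    Returns a sorted list of problems; an empty list means every
    source file was restored with identical contents.
    """
    problems = []
    for source_file in source_dir.rglob("*"):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            # The file exists in the source but never came back.
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            # The file came back, but its contents differ.
            problems.append(f"corrupt: {relative.as_posix()}")
    return sorted(problems)
```

Run something like this on a schedule against a sample of your restores, and treat any non-empty result as a failed backup, no matter what the backup job itself reported.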

Oh, by the way… remember that big telecommunications customer? Well, after finally recovering its data following 12 days of downtime, they dumped all of it on a “questionable” storage array. The admin didn’t think it was important to add it to that evening’s backup policy before he went home for the night. I’ll give you one guess what happened the next morning.

Back up your data, and make sure your cloud provider has a process for ensuring your data is available when you need it most.

-Chapa, signing off

2015-03-31T17:50:36+00:00

About the Author: