IT Brief New Zealand - Technology news for CIOs & IT decision-makers
Cloud, everything is perfect, until it goes wrong…
Wed, 14th May 2014

Clouds are everywhere: public, private, hybrid… and with prices dropping almost daily, they are becoming more affordable to all sizes of businesses.

When everything is working fine, life is good – who cares where their data is stored, assuming it is encrypted, secure, accessible, backed up and, most importantly, recoverable.

Ah, that pesky word recoverable – even in the cloud, data recoverability is critical.

Traditionally backup has been slow, painful, expensive and often unreliable. How much has your company spent on backups over the past two, five, 10, 15 years?

How confident are you that you can restore all your data from your last backup? Who would describe backup as simple, trivial and a ‘no-brainer’: something that just works, with no need to manage anything about the process ever again?

Or are you, like the vast majority of companies, struggling with backup complexity and the reliability of restores? What would happen if a catastrophic event occurred right now? How long could you afford to have your systems down?

How much data are you prepared to lose? Who is really happy to take a complex process that few have mastered and let another company (a cloud provider) take full responsibility for the protection and recoverability of their data?

Cloud crash – your problem!

What happens if a private or public cloud crashes? Are customers concerned that their data is lost forever? What will happen to their business?

What has your organisation done to protect its most important asset – its data? If, and when, the big crash does come, will employees, shareholders and banks say, “Oh well, XYZ cloud provider said they backed up, but due to circumstances beyond their control all our data is unrecoverable…”

Moving infrastructure to the cloud delivers substantial benefits for most. However, an organisation still needs to take responsibility for recoverability. Speaking to many customers and resellers, one of the concerns they have about the cloud is “What if the worst DOES happen and the cloud crashes?”

How diligent have they been in testing recoverability and the ability to restore operations quickly and efficiently?

Migrating real-time

Before worrying about cloud protection and recovery, consider how to get company data into the cloud. Traditional physical and virtual migration products are expensive, painful and highly disruptive for both physical and virtual servers.

Enter the new breed of real-time recovery solutions that deliver near-zero-impact migrations. These protect data in real time, taking a backup every 15 minutes, even for complex databases.

In most businesses, only a relatively small amount of data changes at the sector level (the smallest unit of measure on disk) every 15 minutes.

These small real-time incremental sector-based backups are replicated to the remote site – the cloud provider or data centre.
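As a rough illustration of the sector-based idea above, the following minimal sketch compares two same-sized disk images and keeps only the sectors that changed. It is not how any particular product works: the 512-byte sector size and the function names `changed_sectors` and `apply_incremental` are assumptions made for the example.

```python
from typing import Dict

SECTOR_SIZE = 512  # bytes; an assumed sector size, for illustration only


def changed_sectors(previous: bytes, current: bytes,
                    sector_size: int = SECTOR_SIZE) -> Dict[int, bytes]:
    """Return {sector_index: new_contents} for every sector that differs."""
    if len(previous) != len(current):
        raise ValueError("images must be the same size")
    delta: Dict[int, bytes] = {}
    for offset in range(0, len(current), sector_size):
        new = current[offset:offset + sector_size]
        if new != previous[offset:offset + sector_size]:
            delta[offset // sector_size] = new
    return delta


def apply_incremental(base: bytes, delta: Dict[int, bytes],
                      sector_size: int = SECTOR_SIZE) -> bytes:
    """Rebuild the newer image at the remote site from a base plus one incremental."""
    image = bytearray(base)
    for index, data in delta.items():
        start = index * sector_size
        image[start:start + len(data)] = data
    return bytes(image)
```

Only the entries in the returned dictionary would need to cross the network, which is why the incrementals stay small when few sectors change between backups.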

Often the first or base backup is sent via ‘sneaker-net’ and USB/NAS devices, with the incrementals being replicated in real time.

Once the cloud site has caught up with production virtual and/or physical servers, a business can cut over to the cloud. At, say, 7pm, they force everyone to log off.

They then create a last incremental, replicate it to the cloud and finalise the job at the cloud site, which means spinning up the customer’s production server in a virtual environment (typically Hyper-V or VMware) and re-instating connections. When users log on again, they are running from the cloud with data from a few minutes ago.

From a technical perspective there are a few more steps. Equally compelling is the roll back process if something unexpected happens – simply turn on the production servers at the customer’s site, and everything works.
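The cutover and roll-back sequence described above can be sketched as an ordered list of steps with a single fallback. This is a simplified outline under stated assumptions: the `cut_over` function and the step names are hypothetical, and each action stands in for whatever the migration tooling really does.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], None]]


def cut_over(steps: List[Step], roll_back: Callable[[], None]) -> List[str]:
    """Run the cutover steps in order; on any failure, roll back and re-raise.

    Returns the names of the completed steps when every step succeeds.
    """
    completed: List[str] = []
    try:
        for name, action in steps:
            action()
            completed.append(name)
    except Exception:
        # Roll back as described above: turn the on-site production servers back on.
        roll_back()
        raise
    return completed


# Hypothetical usage: each lambda is a placeholder for a real operation.
log: List[str] = []
cut_over(
    [
        ("log off users", lambda: log.append("users logged off")),
        ("final incremental", lambda: log.append("last incremental created")),
        ("replicate", lambda: log.append("incremental replicated to cloud")),
        ("finalise", lambda: log.append("server spun up in cloud")),
        ("reconnect", lambda: log.append("connections re-instated")),
    ],
    roll_back=lambda: log.append("on-site production servers turned on"),
)
```

Keeping the roll-back as one simple action reflects the point made above: the production servers are left untouched, so reverting is just a matter of switching them back on.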

Recovery options

Every customer looks at its own specific requirements for data protection and recovery from the cloud. In general, the same three principles apply in the cloud as at a local site.

First, the RTO (recovery time objective) – how long a business is prepared to be down. Critical applications may have a shorter RTO than non-critical applications.

Secondly, the RPO (recovery point objective) – how far back in time you need to go to perform a ‘clean’ data restore.

This might be the point of the last backup, provided that backup worked correctly and is recoverable. The third principle is just as critical, but often neglected – the TRO (test recovery objective), which should be the point in time to which a business is completely confident of restoring data.
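To make the gap between the last backup and the last tested backup concrete, here is a small sketch that measures both for a list of backups. The `Backup` record and the `data_at_risk` helper are hypothetical names invented for this example, and "verified" is assumed to mean that a test restore of that backup succeeded.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Backup:
    taken_at: datetime
    verified: bool  # assumed to mean: a test restore of this backup succeeded


def data_at_risk(backups: List[Backup], now: datetime,
                 verified_only: bool) -> Optional[timedelta]:
    """Age of the newest backup, or of the newest verified backup.

    Returns None when no qualifying backup exists at all.
    """
    times = [b.taken_at for b in backups if b.verified or not verified_only]
    return None if not times else now - max(times)
```

With `verified_only=False` the result is the data lost if the latest backup restores cleanly; with `verified_only=True` it is the data lost if only tested backups can be trusted, which is the exposure the TRO is meant to cap.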

Testing only monthly or quarterly runs the risk of substantial data loss, because a faulty backup can go unnoticed until the next test. The latest real-time recovery solutions allow for non-intrusive, automated daily recovery tests to help ensure data recoverability from all backups, reducing risk substantially.

Location, location, location

Most businesses are advised to store their data in a minimum of two, if not three, locations with at least one being geographically remote and, where possible, perform regular recovery tests across all servers and sites to help maximise chances of recoverability.

If the process is automated, this should have zero or minimal impact on support staff. In the cloud, this should become a simple process. Perform a local backup (within your cloud provider), then test its recoverability by automatically and regularly running Microsoft Checkdisk (which helps to ensure data quality) on the backup volume.

Next, replicate these backups to a different cloud provider, where the automated recovery testing tool re-tests recoverability. Then, for critical databases, replicate into a cloud that delivers real-time disaster recovery of critical servers, with the ability to spin up those servers in minutes and have them restored to a point no more than 15 minutes old.
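The advice on locations and testing above can be expressed as a simple checklist. The sketch below is illustrative only: the `Copy` record, its fields and the `policy_gaps` function are assumptions made for the example, not part of any backup product.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Copy:
    location: str           # e.g. "primary cloud", "second cloud", "old production site"
    remote: bool            # True if geographically remote from the primary site
    last_test_passed: bool  # result of the most recent automated recovery test


def policy_gaps(copies: List[Copy], min_locations: int = 2) -> List[str]:
    """List the ways a set of backup copies falls short of the advice above."""
    gaps: List[str] = []
    locations = {c.location for c in copies}
    if len(locations) < min_locations:
        gaps.append(f"only {len(locations)} location(s); at least {min_locations} advised")
    if not any(c.remote for c in copies):
        gaps.append("no geographically remote copy")
    for c in copies:
        if not c.last_test_passed:
            gaps.append(f"recovery test not passing at {c.location}")
    return gaps
```

An empty list means the copies meet the stated minimum: two or more locations, at least one of them remote, and a passing recovery test at every site.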

We are starting to see more and more companies looking at replicating from the cloud back to their (old) production site, so this site becomes one of their disaster recovery locations.

After all, they have the infrastructure already in place.

Finally, I cannot stress this enough, especially with the cloud: test, test and then test again to ensure all data, databases and applications are recoverable quickly and reliably.

If a current backup product doesn’t offer daily testing, find one that does.

Protection, with the ability to recover data in the cloud, is your responsibility.

By Greg Wyman, vice president Asia Pacific, StorageCraft