You’ve seen podcasts, blogs, whitepapers, and magazine articles all promoting backups: the concept, the benefits they provide, and the career-crushing disasters that can happen when they’re ignored. Bottom line: you do it, and you do it now. In fact, do more than one; the more the merrier. Run at the right time (meaning low user load) and within a well-maintained hierarchical storage process, backups can only do good. A solid backup strategy can even earn you business insurance discounts in some cases.
But backing up in-house servers is a relatively simple and time-tested process. Most operating systems have one or more backup utilities built in, and third-party options take the concept to a whole new level. Backup in or from the cloud, however, is a newer concept with new variables – data security, multi-tenant segregation, virtualized services, and especially cloud and internet bandwidth. That’s got to be difficult at the enterprise computing level, right? Wrong! Not for AWS customers.
For a while, cloud storage was mainly a backup solution on its own. But Amazon has matured its backup and data redundancy offerings over the years to a point where you’ll find a range of sophisticated options at your disposal, depending on your needs. Obviously, we start with S3.
S3 has been and still is a basic backup solution for many companies – some rely on it exclusively, since data in S3 gets additional redundancy and backup features inside AWS’ datacenters, so customers can be fairly confident their data will be there when they need it. S3’s storage architecture is pleasantly simple to understand: it revolves primarily around two key nouns, buckets and objects. Buckets are simply ways to organize your objects, and objects represent the individual files that have been backed up. Every object has its own secure URL, which allows easy organization and management of data using a number of homegrown tools like basic access controls or Amazon-prescribed bucket policies. But you’re also able to choose from newer third-party solutions with even easier interfaces, like Symantec for enterprises or Dropbox for small businesses.
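To make the bucket/object idea concrete, here’s a minimal sketch of how an object’s virtual-hosted-style URL is composed from its bucket and key. The bucket name and key below are hypothetical, and the sketch assumes the classic `s3.amazonaws.com` endpoint; some regions use region-specific endpoints.

```python
def s3_object_url(bucket, key):
    """Build the virtual-hosted-style URL for an S3 object.

    Assumes the classic s3.amazonaws.com endpoint; region-specific
    endpoints follow the same bucket-then-key pattern.
    """
    return "https://{0}.s3.amazonaws.com/{1}".format(bucket, key)

# A nightly database dump, organized by date inside a single bucket
# (bucket name and key are made up for illustration):
url = s3_object_url("acme-backups", "2013/05/01/db-dump.sql.gz")
print(url)  # https://acme-backups.s3.amazonaws.com/2013/05/01/db-dump.sql.gz
```

Because keys can contain slashes, a single bucket can mimic a whole directory tree of backups with no extra machinery.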
Recently, Amazon has fleshed out its backup offering even more with the introduction of Glacier, which has less to do with melting polar ice caps than with slow data retrieval times. Glacier’s mission is to provide a much lower-cost backup solution (as little as a penny per gigabyte in some situations). The tradeoff is that, because of its low cost, it’s significantly slower than S3. But for long-term backup of important files, Glacier removes the need for repetitive backup operations, capacity planning, hardware provisioning, and more. These are all time-consuming tasks that add to the hidden costs of secure in-house backup. Glacier takes all that off your plate for very little money. Bottom line: use S3 if you’re storing data you’ll access often or that requires high-speed access; use Glacier for less frequently accessed data for which slow retrieval times (sometimes several hours) are acceptable.
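To put that tradeoff in rough numbers, here’s a back-of-the-envelope sketch. The Glacier rate is the penny-per-gigabyte figure mentioned above; the S3 rate is an illustrative assumption, not a quoted price, and both ignore request and retrieval fees.

```python
# Illustrative monthly storage rates (assumptions, not current AWS pricing):
GLACIER_PER_GB = 0.01   # the "penny per gigabyte" scenario
S3_PER_GB = 0.095       # assumed ballpark for S3 standard storage

def monthly_cost(gigabytes, rate_per_gb):
    """Flat monthly storage cost; ignores request/retrieval charges."""
    return gigabytes * rate_per_gb

archive_gb = 500  # hypothetical long-term archive
print("Glacier: $%.2f" % monthly_cost(archive_gb, GLACIER_PER_GB))  # $5.00
print("S3:      $%.2f" % monthly_cost(archive_gb, S3_PER_GB))       # $47.50
```

For data you rarely touch, that order-of-magnitude gap is exactly why a tiered S3-plus-Glacier strategy makes sense.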
That’s only a short glimpse into the S3 portion of Amazon’s backup capabilities. If we went into detail about all of AWS’ data protection features, we could publish this as a book. We’ll hit each of them in detail in future posts, but here’s a quick list of Amazon’s data protection solutions across their service portfolio:
- Multi-zone and multi-region redundancy
- AWS Relational Database Service (RDS) bundled, automated backups
- RDS backups you can control manually, like snapshots
- AWS Import/Export
- EC2 and S3
- S3 and Elastic Block Store (EBS)
- Route 53
- Elastic Load Balancing (ELB)
- Third-party backup platforms you can use to construct your own AWS storage hierarchy
Data safety has been a hot-button issue for folks still on the fence about adopting the cloud. But AWS has done an excellent job developing sophisticated data redundancy solutions that can make backing up to the cloud safer than backing up in your own datacenter. Check it out.
-Travis Greenstreet, Senior Cloud Systems Engineer