Cloud Outages and . . . The Hitchhiker’s Guide to the Galaxy?
The content of this article was originally written by Yuval Lubowich on the Xeround blog.
Yup, it happened again.
A little over a week ago, Amazon’s EC2 East coast data center suffered another outage affecting well-known sites such as Heroku, Pinterest, Quora, and HipChat.
Once again, we’re confronted by the fact that the notion of the cloud as being always available and never going down is just plain wrong (as we’ve seen several times before).
In the dynamic environment of the cloud, availability issues should be treated as a given. You know that they will eventually happen. So you could be afraid, be very afraid! :) Or you could be prepared. Cloud users must be ready to minimize the risk and handle availability issues effectively, but they shouldn’t panic…
As with many other things in life, cloud developers could be categorized roughly into three groups:
- Those who know that clouds are dynamic environments and bad things happen from time to time.
- Those who, upon hearing such news, go “Wow, hold your horses there; how can a cloud go down?”
- Those who don’t really care, or more accurately, those who can wait it out until the service interruption has passed (clearly, this won’t work for mission-critical applications).
Developers who are aware of the dynamic nature of clouds take extra care to protect their assets (applications and, more importantly, data) and try to minimize business interruption as much as possible. A cloud high-availability / auto-failover solution should not require the application to be aware of it, should respond to any cloud failure immediately, and should do all of this as painlessly as possible, so you do not need to struggle to maintain service.
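To make the "application shouldn't need to be aware of it" idea a bit more concrete, here is a minimal sketch of failover hidden behind a single function call. It is an illustration only, not how Xeround or any particular cloud service implements failover: the endpoint names, `call_with_failover`, and `fake_query` are all hypothetical, and a real setup would also need health checks, timeouts, and data replication between regions.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


class AllEndpointsFailed(Exception):
    """Raised when every configured endpoint has failed."""


def call_with_failover(endpoints: Sequence[str], operation: Callable[[str], T]) -> T:
    """Run `operation` against each endpoint in order; return the first success.

    The caller only sees a result or a single exception, so the application
    code itself does not need to know which endpoint answered.
    """
    errors: List[Tuple[str, Exception]] = []
    for endpoint in endpoints:
        try:
            return operation(endpoint)
        except ConnectionError as exc:
            # Remember the failure and move on to the next endpoint.
            errors.append((endpoint, exc))
    raise AllEndpointsFailed(errors)


def fake_query(endpoint: str) -> str:
    """Hypothetical stand-in for a database call; simulates one region being down."""
    if endpoint == "db.us-east.example.com":
        raise ConnectionError("region unavailable")
    return f"ok from {endpoint}"


result = call_with_failover(
    ["db.us-east.example.com", "db.us-west.example.com"],
    fake_query,
)
```

In this sketch the first endpoint raises `ConnectionError`, so the call quietly falls through to the second one. The point is only that the retry logic lives in one place instead of being scattered through the application.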
Developers might start with low-end solutions like backups, Elastic Load Balancers, or manual redistribution of apps across different geographies, all of which are far from ideal. In fact, they add management overhead and configuration changes whenever you need to re-deploy servers, sync data, etc.
Alternatively, you can opt for higher-end solutions such as Xeround’s Database as a Service, which was engineered for exactly such eventualities and ensures high availability in a zero-management, 1-click service.
So, needless to say, we hope your app and its availability in the cloud matter to you, and that you give us a try before the next cloud failure.
As for the developers who were not aware of the cloud’s inherent availability problem, they should use this blog as a wake-up call and make sure they are protected.
Those who don’t really care, or who think ensuring high availability is about as much fun as reading Vogon poetry (the third worst poetry in the universe…), should take heed. You never know who might want to build an intergalactic highway through your cloud / data center just when it matters most…
Anyway, I’ve laid out towels for everyone.
So long and thanks for all the fish.