Towards the end of February, AWS suffered a serious S3 outage. S3 is a file storage service intimately connected to other Amazon services. Many websites use S3 to store static assets like images and scripts. When S3 became unavailable, the sites that depended on those assets stopped working, including Amazon’s own status page, which used S3 to store its icons.
It’s unfortunate that AWS suffered an outage, but it’s not surprising. Any complex infrastructure hosting platform will experience trouble from time to time. It’s the nature of complex systems to have unpredictable events, and sometimes those events cause outages. In an ideal world, systems would be built with enough redundancy that failures don’t negatively impact availability, but even that doesn’t guarantee 100% availability for all time.
The problem is not that AWS suffered an outage, but that so much of the web depends on a single platform and vendor. Most infrastructure and operations people know that putting all your eggs in one basket isn’t wise, and yet we often see outages of major online applications and services caused by companies doing exactly that.
There’s a real dollar cost to these outages. The largest online publishers and eCommerce stores stand to lose millions of dollars for every day their sites are unavailable. The cost of implementing redundant infrastructure that doesn’t depend on a single vendor or platform is trivial compared to the potential losses.
Outages can’t be blamed on the cloud itself, but it does make sense to have a non-cloud backup in case of a serious outage on a major cloud platform. Because one or two major platforms dominate the cloud infrastructure hosting landscape, there’s a tangled web of dependencies that can be hard to work out. To be truly independent of that web, businesses should implement redundant systems over which they have complete control.
Colocation is a great option for this type of redundancy. With colocation, companies know which hardware they have in the field, where it is, and who to call when something goes wrong.
Colocation doesn’t suffer from the opacity of more complex virtual infrastructure hosting platforms. If cloud storage goes down, there’s really nothing its users can do about it. All they can do is wait, twiddling their thumbs until it starts working again.
But if you own the servers, know what’s installed on them, and understand how they’re supposed to work, you’re in a much better position to deal with any problems immediately and in a way that meets the specific requirements of your business.
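To make the idea of a redundant backup concrete, here is a minimal Python sketch of a site trying its cloud-hosted copy of a static asset first and falling back to a copy on servers it controls. The two hostnames and the function names are hypothetical, chosen only for illustration, and a real deployment might just as well handle failover at the DNS or load balancer level rather than in application code.

```python
from typing import Callable, Sequence
from urllib.request import urlopen

# Hypothetical origins: a cloud storage bucket first, then a colocated
# server the business controls. Neither hostname is real.
ORIGINS = (
    "https://assets-cloud.example.com",
    "https://assets-colo.example.com",
)


def http_get(url: str, timeout: float = 3.0) -> bytes:
    """Fetch a URL and return the raw response body."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def fetch_with_fallback(
    path: str,
    origins: Sequence[str] = ORIGINS,
    fetch: Callable[[str], bytes] = http_get,
) -> bytes:
    """Try each origin in order and return the first successful response."""
    errors = []
    for origin in origins:
        url = f"{origin.rstrip('/')}/{path.lstrip('/')}"
        try:
            return fetch(url)
        except OSError as exc:
            # urllib's URLError and HTTPError, and socket timeouts, are all
            # subclasses of OSError, so this covers an unreachable origin.
            errors.append(f"{url}: {exc}")
    raise RuntimeError("all origins failed: " + "; ".join(errors))
```

Calling `fetch_with_fallback("/img/icon.png")` would request the cloud copy first and only contact the second origin if the first request fails. The point of the sketch is simply that the second origin sits outside the cloud platform's web of dependencies.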
Infrastructure diversity is essential for a healthy internet. The internet was designed to be resilient and to be able to route around failures. If everyone puts all their eggs in one basket without redundant backups they control, we’re asking for trouble.

