A massive internet outage stemming from errors in Amazon's cloud services on Monday morning dramatically demonstrated just how many people rely on the corporate behemoth's computational infrastructure every day—and laid bare the vulnerabilities of an increasingly concentrated system.
Despite its critical omnipresence, however, most users do not truly understand what "the cloud" is or where it resides. Here is an explanation of the data centers in Northern Virginia where the outage originated, and what the malfunction reveals about this rapidly evolving industry.
Renting internet infrastructure
Cloud computing is a technology that allows companies to remotely access massive computing equipment and services without having to purchase and maintain their own physical infrastructure.
Businesses ranging from social media platforms like Snapchat to food giants like McDonald's essentially rent Amazon's physical infrastructure, which is located in facilities around the world. Instead of building expensive computing systems in-house, companies rely on Amazon Web Services (AWS) to store data, develop and test software, and deliver applications.
Amazon is the leading provider of cloud infrastructure and platform services, controlling over 41 percent of the market, according to market research group Gartner. Google and Microsoft are its next biggest competitors.
The biggest and oldest hub
Although "the cloud" sounds like an abstract, formless entity, its physical location is critically important: Proximity to cloud data centers directly determines how quickly users can access internet platforms.
AWS maintains just four primary cloud computing hubs in the U.S., spread across California, Ohio, Virginia, and Oregon to deliver fast service across the country. A user's distance from a hub affects speed, said Amro Al-Said Ahmad, a lecturer in computer science at Keele University in England: “If you're waiting a minute to use an application, you're not going to use it again.”
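As a rough illustration of why distance matters, the sketch below estimates the minimum round-trip time a signal needs to travel to a data center and back. It assumes signals move through optical fiber at about 200,000 km per second (roughly two-thirds of the speed of light in a vacuum) and ignores routing, queuing, and server processing, so real delays are higher. The function name and the example distances are hypothetical and do not come from AWS.

```python
# Approximate signal speed in optical fiber (an assumption for this sketch).
SPEED_IN_FIBER_KM_PER_S = 200_000


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time, in milliseconds, over a fiber path."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    # Out and back (factor of 2), converted from seconds to milliseconds.
    return 2 * distance_km * 1000 / SPEED_IN_FIBER_KM_PER_S


if __name__ == "__main__":
    # Hypothetical path lengths, chosen only for illustration.
    for km in (100, 4_000, 12_000):
        print(f"{km:>6} km -> at least {min_round_trip_ms(km):.1f} ms")
```

Each individual delay is small, but loading an application usually takes many sequential round trips, which is how distance adds up to a noticeable wait.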
The region in Northern Virginia where Monday's problems originated, known as the US-East-1 region, is the biggest and oldest cloud hub in the country. Doug Madory, director of internet analysis at Kentik, explains that this Virginia cluster processes "orders of magnitude" more data than its nearest competitor in Ohio or its large West Coast hubs.
In theory, organizations that use a big cloud provider like Amazon can split their workloads across multiple regions so that the failure of one is manageable. However, Madory said, "the reality is it's all very concentrated."
“For a lot of people, if you're going to use AWS, you're going to use US-East-1 regardless of where you are on Planet Earth,” Madory explained. “We have this incredible concentration of IT services that are hosted out of one region by one cloud provider, for the world, and that presents a fragility for modern society and the modern economy.”
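The idea of spreading a workload across regions can be sketched in a few lines. The example below is a simplified illustration, not AWS's own tooling: it tries a list of regions in order and falls back to the next one when a request fails. The region codes are real AWS identifiers (us-east-1 is Northern Virginia, us-east-2 is Ohio, us-west-2 is Oregon), but `call_with_failover`, `AllRegionsFailed`, and the simulated `fake_request` are hypothetical names invented for this sketch.

```python
from typing import Callable, Dict, Sequence


class AllRegionsFailed(Exception):
    """Raised when no region in the list could serve the request."""


def call_with_failover(regions: Sequence[str],
                       request: Callable[[str], str]) -> str:
    """Try each region in order and return the first successful response."""
    errors: Dict[str, Exception] = {}
    for region in regions:
        try:
            return request(region)
        except ConnectionError as exc:
            # Record the failure and move on to the next region.
            errors[region] = exc
    raise AllRegionsFailed(f"no region answered: {sorted(errors)}")


def fake_request(region: str) -> str:
    """Stand-in for a real service call; simulates a us-east-1 outage."""
    if region == "us-east-1":
        raise ConnectionError("simulated outage")
    return f"served from {region}"


if __name__ == "__main__":
    print(call_with_failover(["us-east-1", "us-east-2", "us-west-2"],
                             fake_request))
```

A fallback like this helps only if the data and services an application depends on are actually available in the other regions, which is the step Madory suggests many organizations skip.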
More than 100 warehouses
The servers supporting this massive operation are not located in just one building. Gartner analyst Lydia Leong states that Amazon operates "well over 100" of these sprawling computing warehouses in Virginia, mostly in the exurbs at the edge of the Washington metropolitan area.
Leong said one reason why it is Amazon's "single-most popular region" is that, in addition to being one of the oldest, it is increasingly becoming a hub for handling artificial intelligence workloads. The growing usage of chatbots, image generators, and other generative AI tools has spiked demand for computing power, leading to a construction boom of new data center complexes across the U.S. and the world.
A report released Monday from TD Cowen noted that leading cloud computing providers leased a "staggering" amount of U.S. data center capacity in the third fiscal quarter of this year, equivalent to more than 7.4 gigawatts of power, more than in all of last year combined.