Amazon Web Services (AWS) recently experienced a widespread outage that affected big-name websites and multiple services. The outage appeared to affect the eastern United States on Tuesday and lasted for several hours. According to downdetector.es, problem reports started appearing around 8pm BST and continued past 11pm BST.
The AWS outage reportedly affected various services and websites, including McDonald’s, Taco Bell, the Verge, the New York Metropolitan Transportation Authority, Southwest Airlines, the Boston Globe, and the US securities regulator’s EDGAR system. However, it is almost impossible to count how many services and websites were actually affected.
Other Amazon services like Amazon Music and Alexa were also affected. Several hours after Downdetector began showing outage reports, Amazon said that “the issue has been resolved and all AWS services are operating as normal.”
According to the AWS Health Dashboard, degradation across multiple services in the US-EAST-1 region began around 3pm ET (8pm BST). Amazon blamed the outage on “a subsystem responsible for AWS Lambda capacity management.” AWS Lambda is a service that allows customers to run computer programs without having to manage any underlying servers.
“Between 11:49 am PDT and 3:37 pm PDT, we experienced higher error rates and latencies for various AWS services in the US-EAST-1 Region. Our engineering teams immediately got involved and began to investigate,” Amazon said. “We quickly narrowed down the root cause to be an issue with a subsystem responsible for AWS Lambda capacity management… At 3:37 pm, the backlog was fully processed. The problem has been resolved and all AWS services are working normally.”
This isn’t Amazon’s first major outage. The previous one came in December 2021, when disruptions to its cloud services temporarily took down streaming platforms Netflix and Disney+, Robinhood, and Amazon’s own e-commerce website shortly before Christmas. AWS had suffered outages several times before that, and experts say these incidents demonstrate the need for organizations to build redundancies into their operations.
Reportedly, this outage was shorter and narrower in scope than the one the company suffered in 2017 in its data hosting service, Amazon S3, which represents the bread and butter of its cloud business. The outage also appeared to extend to AWS’s own web page describing the disruptions to its operations, which at one point failed to load on Tuesday.
All things considered, while AWS’s recent outage was not as severe as some of its previous blackouts, it still highlights the importance of building redundancies into organizational operations. As more companies rely on cloud services for their day-to-day operations, it is crucial to have a backup plan in case an outage occurs.
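For readers wondering what such a backup plan might look like in practice, here is a minimal sketch in Python of one common idea: try a primary cloud region first and fall back to a secondary one if the request fails. It illustrates the general approach only and is not a description of how any of the affected companies operate. The function name `call_with_failover`, the exception `AllRegionsFailed`, and the choice of `us-west-2` as a backup region are assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple


class AllRegionsFailed(Exception):
    """Raised when every configured region fails (hypothetical helper)."""


def call_with_failover(
    regions: Sequence[str],
    request: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each region in order and return (region, response) from the
    first one that succeeds.

    `request` is any caller-supplied function that takes a region name
    and either returns a response or raises an exception.
    """
    errors: List[Tuple[str, Exception]] = []
    for region in regions:
        try:
            return region, request(region)
        except Exception as exc:  # a failure here means: try the next region
            errors.append((region, exc))
    # No region answered, so report every failure to the caller.
    raise AllRegionsFailed(errors)


def simulated_request(region: str) -> str:
    """Stand-in for a real service call: pretends us-east-1 is down."""
    if region == "us-east-1":
        raise ConnectionError("region unavailable")
    return "ok"


if __name__ == "__main__":
    region, response = call_with_failover(
        ["us-east-1", "us-west-2"], simulated_request
    )
    print(f"served from {region}: {response}")
```

In a real deployment the backup region would also need its own copy of the data and services, which is where most of the cost and effort of redundancy lies.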