On Tuesday, a large-scale internet outage briefly knocked major websites offline across the globe, causing popular sites including Amazon, Reddit, Twitch, Spotify, and even the British government’s homepage to display “503 error” messages.
The outage’s roots were traced back to Fastly, a cloud computing platform that operates a network of servers strategically placed worldwide. Those servers let its back-end clients move and store content close to front-end users, conserving bandwidth and enabling swifter web browsing. According to the company, engineers identified the cause of the problem 40 minutes after the disruption was detected, and within 49 minutes, 95% of its network was operating normally.
A spokesperson told Fast Company shortly thereafter that the issue was “a service configuration that triggered disruptions across our POPs globally,” but the company is now offering more color on exactly what happened. As it turns out, a single Fastly customer inadvertently toppled the tentpoles of the internet, impacting websites including Fast Company and its sister publication, Inc.
In a blog post late Tuesday, Nick Rockwell, senior vice president of engineering and infrastructure at Fastly, wrote, “On May 12, we began a software deployment that introduced a bug that could be triggered by a specific customer configuration under specific circumstances. Early June 8, a customer pushed a valid configuration change that included the specific circumstances that triggered the bug, which caused 85% of our network to return errors.”
Fastly says it was previously unaware of the software bug, which lay dormant until it suddenly ate through vast swaths of the World Wide Web. After restoring the affected webpages, the company created a permanent fix and is now investigating how the bug avoided detection during testing.
It also apologized for the disruption of its “mission critical services.”
The issue has raised concerns about how much of the internet, which is definitely “mission critical,” relies on just a handful of infrastructure providers. If one customer switching its settings could knock out our means of shopping, connecting, and consuming news, is that perhaps a recipe for disaster?
Apparently not for Fastly, whose stock was up 12% at one point Tuesday following the outage. Some speculate that the jump could be related to the company’s quick and effective recovery, or to the fact that many people were just alerted to the company’s existence and its vital role in propping up the internet. Fastly has only a few competitors, including Akamai, Cloudflare, and Amazon’s CloudFront.
However, things aren’t as swell for its clients. According to a calculation from SEO group Reboot, Amazon alone could have lost $32 million in sales during the service glitch.