The digital economy was caught off guard once again on October 20 when Amazon Web Services suffered its second major outage of the year, knocking out exchange platforms such as Coinbase and Robinhood, as well as the analytics service CoinMarketCap. A second, smaller outage occurred just 10 days later.
According to Amazon’s initial report, the Oct. 20 outage was caused by a malfunction in one of the internal subsystems that manage the domain name service, which led to connectivity issues across multiple services. The malfunction stemmed from a flaw in an update that ended up breaking Amazon’s critical US-East-1 region, a massive hub of servers that powers many of the country’s top internet services. For two hours, users around the world lost access to numerous trading platforms, streaming services, payment providers, and gaming networks.
Make no mistake, Amazon’s engineers worked overtime to resolve the outage, and it’s to the company’s credit that the majority of services that reported issues were back online within hours. However, this incident, which comes just months after a similar outage in Amazon’s EU-north-1 region, once again highlights the dangers of relying on centralized infrastructure. Going offline is painful for almost any business, but for the cryptocurrency industry, where billions of dollars in value are traded on an hourly basis, such an event is unacceptable.
Incalculable loss for traders
It’s very rare for a centralized cloud platform like Amazon to go down, but it does happen from time to time. And when it does, the impact is often huge, affecting millions, if not billions, of people around the world. Case in point, Amazon suffered a similar disruption just six months ago, shutting down Binance and KuCoin, two of the world’s largest cryptocurrency platforms, for several hours. Nor is the problem limited to Amazon: rival clouds like Google and Microsoft Azure have also suffered catastrophic failures of their own. In fact, reports suggested that Azure was down for several hours on October 29, with numerous websites and online services taken offline.
The problem with centralized infrastructure is that it’s centralized. These platforms rely on critical components that each act as a single point of failure: if one of them is taken offline, it can crash the entire system. It could be something as simple as a server or database containing important configuration settings, or a single network connection with no redundancy. These vulnerabilities exist in all clouds and always pose a risk, no matter how diligent the operator is.
Coinbase was one of the first services to report issues following the Amazon incident and quickly responded to reassure users that their funds were safe. However, this reassurance does not solve the fundamental problem of frozen trades and delayed market orders, which occur when the system goes offline without any warning. The longer the delay, the more the asset’s price may move while traders are unable to act on it. A trader may even incur a loss if the asset’s price falls soon after they enter a position and they are unable to sell.
Although it is impossible to calculate the exact impact, the paralysis inflicted on traders may have caused distress and financial losses.
The era of decentralization has come
One possible way to prevent such paralysis is for cryptocurrency exchanges to switch, at least partially, to more resilient decentralized infrastructure that eliminates these single points of failure. By running several key modules of the trading system on a distributed network of servers, exchanges can largely eliminate the possibility of such disasters.
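To make the idea concrete, here is a minimal sketch of order submission with failover across independently hosted backends. It is an illustration only, not a description of how any exchange actually works: the backend names, the `submit_with_failover` function, and the simulated outage are all hypothetical.

```python
from typing import Callable, Sequence


class AllBackendsDown(Exception):
    """Raised when every configured backend has failed."""


def submit_with_failover(
    order: dict,
    backends: Sequence[tuple[str, Callable[[dict], str]]],
) -> tuple[str, str]:
    """Try each backend in order; return (backend name, receipt) from the first that responds."""
    errors = []
    for name, send in backends:
        try:
            return name, send(order)
        except ConnectionError as exc:
            # This backend is unreachable; note the error and try the next one.
            errors.append(f"{name}: {exc}")
    raise AllBackendsDown("; ".join(errors))


# Hypothetical backends: the first simulates a regional cloud outage,
# the second stands in for infrastructure hosted somewhere else.
def cloud_region_a(order: dict) -> str:
    raise ConnectionError("region unavailable")


def independent_provider_b(order: dict) -> str:
    return f"accepted {order['side']} {order['qty']} {order['symbol']}"


BACKENDS = [
    ("cloud-region-a", cloud_region_a),
    ("independent-provider-b", independent_provider_b),
]

if __name__ == "__main__":
    order = {"side": "sell", "qty": 1, "symbol": "BTC-USD"}
    print(submit_with_failover(order, BACKENDS))
```

The sketch only helps if the backends do not share a failure mode. Two servers in the same cloud region would both fail in an outage like the one on October 20, which is why the argument here is for hosting on separate infrastructure rather than simply adding more copies.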
For an industry that prides itself on decentralization and constantly extols its benefits, it seems hypocritical to rely on a fragile, centralized cloud platform for its infrastructure. While blockchain networks are distributed across hundreds of nodes, few exchange platforms can say the same, choosing instead to host all their infrastructure with a single cloud provider.
Fortunately, Monday’s outage was not as severe as previous incidents, as Amazon had most of its services back up and running within hours, but it should still serve as a wake-up call for the crypto industry to take action. Distributed cloud infrastructures still have issues with latency, network coordination, and scalability, but they are maturing rapidly and can at least support hybrid cloud strategies. By distributing data and systems across a vast network, exchanges can become virtually immune to the complete blackouts caused by this type of outage.
Centralized clouds are here to stay because of their massive scale, high performance, enterprise-grade security, and specialized services that decentralized alternatives cannot match. They will likely remain the backbone of the internet for years to come, but they will never be able to replicate the resilience of a distributed network. Cryptocurrency exchanges manage billions of dollars of customer funds in a market where every second counts, so they need to step up and make sure an outage like this cannot freeze their platforms again.
