Coinbase Says Outage ‘Unacceptable’ as CEO Weighs Speed-Resilience Tradeoffs


Armstrong says resilience trade-offs will be revisited

Cryptocurrency exchange Coinbase (Nasdaq: COIN) explained how a cooling failure at an AWS data center triggered a service outage that disrupted trading, exchange access and customer account data on the platform. Coinbase CEO Brian Armstrong addressed the incident on X, while Head of Engineering Rob Witoff detailed the recovery process and customer impact.

“We experienced an outage at Coinbase last night, which is never acceptable,” Armstrong wrote on May 8. He added that most Coinbase systems are designed to withstand the loss of an AWS Availability Zone (AZ), but the centralized exchange did not hold up the same way during the outage. “It is possible to make exchanges resilient to AZ outages, but this can introduce latency delays that are not desirable as well as breaking client co-location,” Armstrong said, adding:

“Given this incident, we will revisit these compromises to ensure we offer you the best possible place to trade. At a minimum, the duration of an outage should be able to be significantly reduced when an AZ move is necessary.”

Armstrong noted that Coinbase would look at how it balances exchange speed, customer co-location, and recovery time after an infrastructure outage. His comments focused on reducing the impact and duration of future outages affecting customer access and business activity.

How Coinbase restored trading and balance updates

Rob Witoff, Coinbase’s head of engineering, posted on X that the disruption began late on May 7, when internal systems started to fail and emergency teams opened an investigation. The outage affected the spot, Prime, International and derivatives exchanges. Customers also had trouble accessing exchange services, making transactions, and viewing account balances.

Witoff explained that trading was halted because trading systems could not continue operating safely during the infrastructure disruption. He also noted that internal messaging systems fell behind, causing some account information to lag until the recovery process caught up. He acknowledged:

“Losing access to your account, even temporarily, is unacceptable.”

Recovery was handled in stages rather than all at once. Coinbase moved affected workloads away from the troubled area, restored the systems needed to process transactions, and allowed delayed customer data to catch up. Markets reopened cautiously, starting with cancel-only mode, followed by product checks, auction mode, and then restoration of trading on Coinbase Exchange.


