According to Odaily, Conduit has published a post-mortem of the recent Degen Chain outage. On May 10, Conduit raised the batch size for Degen and Proof of Play Apex to 10MB to reduce costs, which delayed batch posting from these networks to their parent chain. On May 12, at around 1 PM, the configuration change was reverted to get batches posting again. Because those batches were then posted after the 24-hour force-inclusion window had passed, both networks underwent reorganizations.
In this situation, Arbitrum Nitro inserts any inbox messages ahead of the transactions in the batch and replays those transactions with new timestamps. After the reorganization, nodes came back up with corrupted databases because geth did not handle a reorganization of that depth well. As a result, each data directory had to be resynchronized from genesis. Synchronization took more than 40 hours per network, at a replay rate of about 100M gas/s.
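For a sense of scale, the two figures above can be combined with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the replay rate stayed constant at the reported 100M gas/s for the whole resync, which the post-mortem does not state.

```python
# Back-of-the-envelope estimate using the figures reported in the post-mortem.
# Assumption: the replay rate is constant for the entire resync.

SECONDS_PER_HOUR = 3600


def gas_replayed(hours: int, gas_per_second: int) -> int:
    """Total gas replayed in `hours` at a constant `gas_per_second`."""
    return hours * SECONDS_PER_HOUR * gas_per_second


def resync_hours(total_gas: int, gas_per_second: int) -> float:
    """Hours needed to replay `total_gas` at a constant `gas_per_second`."""
    return total_gas / gas_per_second / SECONDS_PER_HOUR


if __name__ == "__main__":
    # 40 hours at 100M gas/s, the lower bound given in the report.
    total = gas_replayed(40, 100_000_000)
    print(f"gas replayed in 40 h: {total:,}")
    print(f"hours to replay that much gas: {resync_hours(total, 100_000_000)}")
```

Under that assumption, a resync of more than 40 hours corresponds to a history of at least roughly 1.44 × 10^13 gas per network.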
Once the nodes had resynchronized, Conduit tried several transaction replay scenarios, but not all transactions could be recovered because some depended on exact timestamps. In consultation with each rollup team, Conduit discussed and concurrently pursued several strategies to bring the networks back online and restore their pre-reorganization state.
Degen Chain came back online on May 14 at 7:30 PM, approximately 54 hours after the outage began. Proof of Play's Apex chain was restored at around the same time, but it did not reopen to the public until 4 PM on May 15, after a further recovery plan had been implemented.
In response, Conduit said it has improved its alerting and monitoring for Orbit chains to cover this scenario and is committed to working with Offchain Labs to improve observability for all Orbit chain operators. The team will continue to invest in and research ways to better simulate mainnet conditions and transaction load in test environments. The Degen Chain Explorer is now showing the latest state of Degen Chain as normal.
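The kind of alert described could, in principle, compare the time since a chain's last posted batch against the 24-hour window mentioned in the report. The following is a minimal sketch, not Conduit's actual monitoring: the function name, the severity labels, and the 25% and 50% thresholds are all assumptions made for illustration, and the timestamp of the last posted batch is assumed to come from some external source.

```python
from datetime import datetime, timedelta, timezone

# The 24-hour window comes from the post-mortem; the thresholds below are
# hypothetical values chosen only for this sketch.
FORCE_INCLUSION_WINDOW = timedelta(hours=24)
WARN_FRACTION = 0.25
CRITICAL_FRACTION = 0.5


def batch_delay_status(
    last_batch_posted_at: datetime,
    now: datetime,
    window: timedelta = FORCE_INCLUSION_WINDOW,
) -> str:
    """Classify how much of the window has elapsed since the last batch."""
    # Dividing one timedelta by another yields a plain float ratio.
    fraction = (now - last_batch_posted_at) / window
    if fraction >= 1.0:
        return "breached"
    if fraction >= CRITICAL_FRACTION:
        return "critical"
    if fraction >= WARN_FRACTION:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Illustrative timestamps only, not taken from the incident.
    last_batch = datetime(2000, 1, 1, 0, 0, tzinfo=timezone.utc)
    for hours in (1, 6, 13, 25):
        status = batch_delay_status(last_batch, last_batch + timedelta(hours=hours))
        print(f"{hours:>2} h since last batch: {status}")
```

Alerting well before the window closes leaves operators time to act while a configuration change can still be undone without triggering a reorganization.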