How Solana Intends to Mitigate its Outage Problem

2022-05-23, 07:50


Overnight between the 30th of April and the 1st of May, the Solana blockchain network experienced an outage that lasted 7 hours.

Starting at 20:30 UTC on April 30, the network’s Mainnet Beta cluster was unable to reach consensus, which halted the production of new blocks. Once alerted to the problem, the network’s validator operators tracked down the source and resolved it by restarting the cluster at about 3:30 AM UTC.

The network was up and running once more, and later reports revealed that Solana had seen a massive influx of transactions around that time. The network was congested by roughly 6 million transaction requests per second, with over 100 gigabits per second of traffic hitting individual nodes.

Solana acknowledged the outage in a tweet and later announced that the network was back online.

Source: @SolanaStatus

The platform has since published a more detailed report noting the cause of the issue and its plans for long-term resolution. Here’s a comprehensive take on different parts of the outage problem that has plagued the Solana network and its users for quite a while now. Sit back and dig in.


Other Times Network Outages Have Occurred


This incident is the seventh of its kind that Solana has encountered this year. Solana recorded several incidents in January. The first set took place over about six days and resulted in 8-18 hours of partial outages and degraded network performance. The second occurred in late January and racked up over 29 hours of partial outages and network instability.

Solana attributed the first case to an increase in high-compute operations, which caused network capacity to plummet from its purported 50,000 transactions per second (TPS) to a few thousand. For the second incident, the platform explained that a rise in duplicate transactions had brought on the congestion and outages.

In early December, the network also went down after it suffered a distributed denial-of-service (DDoS) attack. A Solana-based NFT marketplace was the first to point out the network’s lagging token distribution; Solana did not confirm this.

Although displeasing, the aforementioned incidents do not compare to the outage users witnessed in September 2021. In the longest outage to date, Solana was offline for 17 hours due to a DDoS hit in which bots flooded it with transactions after an initial DEX offering on DeFi protocol Raydium went live.

The 400,000 transactions passing through the mainnet per second made it freeze and then cease to function. Alongside more than 1000 validators, the network engineers proposed a hard fork and got the green light from the majority of the stakeholders. The native SOL token nosedived by 35% but later bounced back.

Before we move on to the latest outage and what caused it, let’s have a brief overview of the Solana network and how its mainnet beta works for a better understanding later on.


The Solana Network


Solana is a major Ethereum competitor, one of the well-known “Ethereum killers.” Since software engineer Anatoly Yakovenko proposed the project in 2017 and its Mainnet Beta launched in 2020, it has held its own and achieved remarkable success, earning that title.

One of Solana’s most significant selling points, and its intended edge over Ethereum, is its high speed. The network boasts swift and, importantly, low-cost operations, which attracted tons of investors and digital asset users. While this is good news, the resulting demand left the network congested, leading to outages.

Solana’s Mainnet Beta went live in March 2020, yet the network is still tied up in certain problems. It’s important to note that this is still the beta version of its mainnet.

Solana provides users with fast, affordable transactions through a combination of the Proof-of-Stake (PoS) and Proof-of-History (PoH) consensus mechanisms. Unfortunately, this system is vulnerable to exploitation by bots, and that brings up the question:


What Caused the Outage at the Start of this Month?


As stated earlier, the network began receiving far more transactions than usual, and the figures soared from Solana’s average of 2,700 TPS into the millions. Blockchain explorers show the network’s peak to be just over 710,000 TPS, but that figure more than quadrupled on the night of the outage.

Source: TPS history on Solana Explorer

According to Solana’s official diagnostic report, bots swarmed Candy Machine, a Solana-based minting app that several creators use to launch NFT collections. The bots aimed to participate in a new NFT mint that featured a fixed price instead of an auction. By flooding the network with transactions, the bots were trying to increase their chances of winning the mint.

This spam caused Solana’s validators to crash as they struggled to process transactions and used up their memory. Developers have disclosed that most of the congestion problems resulted from bot activity centred around project mints. One could infer that as Solana’s place in the NFT industry becomes more defined, it could attract more of this.


How the Outages Impact the Network


Following the latest crash, Solana saw its token lose about 7% of its value. The token had been trading at about $90 but fell to $84. Post-recovery it settled at $88, a decline not as drastic as September’s but still notable. The drop could point toward a change in trader sentiment; without a more permanent solution, Solana could see user trust gradually disappear.

Source: Coinmarketcap, Solana price activity

Many of the platform’s users have lost funds due to these outages; however, this isn’t exactly an odd phenomenon in the DeFi space. Consistent security breaches could pose a real problem. As it stands, some have taken the view that Solana’s benefits outweigh its risks. This does not invalidate the fact that Solana has to provide a conclusive answer to its issues.

Over the past few months, the network has come under fire as users and prominent figures within the DeFi space have called out Solana’s failure to address the issue. At one point, Yakovenko’s seeming nonchalance in describing the problems as simply growing pains angered several of them. Thankfully, Solana has shared its plans to mitigate the network issues in its new report.


Mitigation


Solana has three major mitigation strategies.

QUIC
The dev team will reimplement core network components on QUIC, a protocol built by Google and designed for fast, asynchronous communication. It will carry the flow of transaction data between RPC nodes and the current leader. Solana presently uses a UDP-based protocol for this; however, UDP is connectionless and lacks features such as flow control and receipt acknowledgement, which leaves the network unable to curb abuse.

QUIC provides a slew of options to optimize data flow, and through these, Solana will take the reins of network traffic control.
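
Flow control, one of the features the UDP-based path lacks, can be illustrated with a simple credit window: the receiver advertises how many bytes it is willing to accept, and the sender must stop once that credit is spent. The sketch below is only a loose illustration of that idea, not QUIC itself and not Solana's code. The class name `CreditWindow` and its methods are hypothetical.

```python
class CreditWindow:
    """Toy credit-based flow control (hypothetical; loosely inspired by
    receiver-advertised data limits, not a real QUIC implementation)."""

    def __init__(self, initial_limit: int) -> None:
        self.limit = initial_limit  # total bytes the receiver has allowed so far
        self.sent = 0               # total bytes the sender has used so far

    def try_send(self, size: int) -> bool:
        """Send `size` bytes only if the receiver's credit allows it."""
        if size < 0:
            raise ValueError("size must be non-negative")
        if self.sent + size > self.limit:
            return False  # sender must wait; a raw UDP sender would just keep sending
        self.sent += size
        return True

    def grant(self, extra: int) -> None:
        """Receiver raises the limit once it has processed earlier data."""
        self.limit += extra


window = CreditWindow(initial_limit=1000)
first = window.try_send(600)    # accepted: 600 of 1000 bytes used
second = window.try_send(600)   # refused: would exceed the advertised limit
window.grant(500)               # receiver frees up capacity
third = window.try_send(600)    # accepted after the new grant
```

The point of the design is that the receiver, not the sender, decides how much traffic it takes on, which is the kind of control a connectionless protocol cannot offer on its own.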

Stake-weighted transaction QoS
The leader’s network bandwidth has a fixed capacity, and to ensure it is used efficiently, the network must prioritise certain transactions. So far, transactions have been processed on a first-come-first-served basis, but Solana will now consider the source of these proposed operations.

Under the new model, a node with a 0.5% stake has the right to send at least 0.5% of the packets to the leader, and other nodes, or any combination of the remaining stake, will not be able to crowd it out completely.
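
The 0.5% example can be sketched as a per-sender quota that reserves each node a share of the leader's packet capacity equal to its share of stake. This is a minimal illustration under stated assumptions, not Solana's implementation: the function names `packet_quotas` and `admit`, the node labels, and the fixed per-interval capacity are all invented for the example.

```python
from typing import Dict, List


def packet_quotas(stakes: Dict[str, int], leader_capacity: int) -> Dict[str, int]:
    """Reserve each node a packet quota proportional to its stake (hypothetical)."""
    total = sum(stakes.values())
    if total <= 0:
        raise ValueError("total stake must be positive")
    # Integer division rounds down, so quotas never exceed the capacity in total.
    return {node: (leader_capacity * stake) // total for node, stake in stakes.items()}


def admit(packets: List[str], quotas: Dict[str, int]) -> List[str]:
    """Accept packets in arrival order until each sender's quota is used up."""
    used: Dict[str, int] = {}
    accepted: List[str] = []
    for sender in packets:
        if used.get(sender, 0) < quotas.get(sender, 0):
            used[sender] = used.get(sender, 0) + 1
            accepted.append(sender)
    return accepted


# A node holding 0.5% of the stake keeps 0.5% of a 1,000-packet interval,
# even when a much larger sender floods the leader first.
stakes = {"small": 5, "whale": 995}
quotas = packet_quotas(stakes, leader_capacity=1000)
incoming = ["whale"] * 2000 + ["small"] * 10
accepted = admit(incoming, quotas)
small_accepted = accepted.count("small")  # 5 packets
whale_accepted = accepted.count("whale")  # 995 packets
```

Under first-come-first-served handling, the 2,000 packets that arrived first would have filled the whole interval and the small node would have been shut out.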

Fee-based Execution Priority
This strategy governs how transactions are prioritised once they have entered the network. Until now, users have been unable to express the urgency of their transactions, as the network does not discriminate between submissions.

Solana is introducing a new instruction to the Compute Budget program. With this instruction, users can request that the network collect an additional fee once the transaction is executed and added to a block. The network will then weigh this fee against the compute units the transaction requests and settle on its priority. Solana will treat additional fees similarly to today’s base fees.
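
The weighing of the additional fee against compute units can be sketched as a fee-per-compute-unit ratio used to order pending transactions. This is a hedged illustration only: the report does not spell out the exact formula here, so the ratio, the `PendingTx` type, the field names, and the use of lamports as the fee unit are all assumptions.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PendingTx:
    tx_id: str
    additional_fee: int  # assumed unit: lamports offered on top of the base fee
    compute_units: int   # compute units the transaction requests


def fee_per_compute_unit(tx: PendingTx) -> float:
    """Assumed priority metric: extra fee offered per requested compute unit."""
    if tx.compute_units <= 0:
        raise ValueError("compute_units must be positive")
    return tx.additional_fee / tx.compute_units


def order_by_priority(txs: List[PendingTx]) -> List[PendingTx]:
    """Highest fee per compute unit first; ties keep their arrival order."""
    return sorted(txs, key=fee_per_compute_unit, reverse=True)


pending = [
    PendingTx("a", additional_fee=1000, compute_units=200_000),  # 0.005 per unit
    PendingTx("b", additional_fee=500, compute_units=50_000),    # 0.01 per unit
    PendingTx("c", additional_fee=0, compute_units=1_000),       # no extra fee
]
ordered_ids = [tx.tx_id for tx in order_by_priority(pending)]  # ["b", "a", "c"]
```

Note that transaction "b" outranks "a" despite offering a smaller total fee, because it asks for far less compute. Dividing by compute units keeps a large, expensive transaction from buying priority cheaply.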

It is worth noting that Metaplex, the company behind Candy Machine, took on a portion of the blame for the latest outage. Metaplex confirmed on Twitter that traffic from bots on its app contributed to the crash. It has since revealed a plan to help combat the issues and improve network stability. Metaplex will introduce a botting penalty: wallets that attempt to complete invalid operations will be charged 0.01 SOL.

Growing pains are a typical part of any project; however, Solana seems to be maturing, having shared defined mitigation plans. With a loyal user base and continued improvement, things will likely turn out alright.



Author: Gate.io Observer: M. Olatunji
* This article represents only the views of the observers and does not constitute any investment suggestions.
*Gate.io reserves all rights to this article. Reposting of the article will be permitted provided Gate.io is referenced. In all other cases, legal action will be taken due to copyright infringement.