Walrus: Sui’s New Approach to Decentralized Storage

Intermediate | Sep 30, 2024
Discover Mysten Labs' decentralized storage network, Walrus, and how it innovates through the RedStuff coding algorithm. This article dives into the synergy between Walrus and Sui, compares competitors, addresses storage challenges, and highlights the core technological innovations.

Forwarded from the original title: “Decoding Walrus: Sui’s New Take on Decentralized Storage”

Arweave, a decentralized storage network, launched its computing layer AO, which successfully boosted the AR token price, ecosystem activity, and popularity—turning things around for the project. Now, Sui, a general-purpose computing blockchain, has launched the decentralized storage network Walrus. What kind of impact will it have?

Background Overview

Team:

Solana is developed by Solana Labs and Aptos by Aptos Labs, while Sui is developed by Mysten Labs, the only one of the three not named after its chain. Many of the founders and employees at Mysten Labs previously worked on Diem, the blockchain project at Facebook (now Meta), before it was disbanded.

Walrus is the newest product from Mysten Labs, categorized as both a “protocol” and a “platform,” and it serves as a decentralized storage network. The name “Walrus” refers to the animal, and its official website promotes slogans like “Strong as a Walrus” and “Adaptable like a Walrus,” emphasizing the reliability and flexibility of the protocol as a storage system.

Connection with Sui:

Walrus is built on the Sui network and uses Sui to manage metadata and the sale of storage space. However, developers do not need to build their apps or products on Sui in order to use Walrus. Walrus will also have its own token, WAL, which serves as its governance and utility token in place of SUI.

Competitor Comparison

Decentralized storage protocols are usually divided into two main types. The first type is fully replicated systems, with prominent examples like Filecoin and Arweave. The key advantage of this approach is that files are fully available on each storage node, meaning that even if a node goes offline, the file can still be easily accessed and moved. This setup supports a permissionless environment since storage nodes don’t rely on one another to recover files.

The reliability of these systems depends heavily on the stability of the chosen storage nodes. In the classic one-third static adversary model, and assuming an infinite pool of potential storage nodes, achieving “twelve nines” security (a probability of file loss of less than 10^-12) requires more than 25 copies of the file to be stored across the network. This leads to a 25-times increase in storage costs. Additionally, there is a risk of Sybil attacks, where malicious actors can fake multiple copies of a file, reducing the system’s overall integrity.
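The 25-copy figure can be checked with a short calculation. The sketch below assumes a simplified reading of the model: each copy independently lands on an adversarial node with probability 1/3, and the file is lost only if every copy does. The function name `min_copies` is ours and is purely illustrative.

```python
from fractions import Fraction


def min_copies(adversary_share: Fraction, max_loss: Fraction) -> int:
    """Return the smallest k such that adversary_share**k < max_loss.

    Assumption: copies are placed independently, and the file is lost
    only when every copy sits on an adversarial node.
    """
    copies = 1
    while adversary_share ** copies >= max_loss:
        copies += 1
    return copies


if __name__ == "__main__":
    # One-third adversary, "twelve nines" target (loss probability < 10^-12).
    print(min_copies(Fraction(1, 3), Fraction(1, 10**12)))
```

Under these assumptions, 25 copies are not quite enough (3^25 is still below 10^12), which matches the statement that more than 25 copies are required.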

The second type of decentralized storage service uses Reed-Solomon (RS) coding. RS coding splits a file into smaller parts, called slices, each a fraction of the size of the original file. The file can be decoded from any subset of slices whose combined size is at least the size of the original file. However, RS coding has some downsides. Encoding and decoding involve finite-field operations, polynomial evaluation, and interpolation, all of which are computationally intensive. These operations are practical only when the number of slices and the field size are small, which limits both the size of files and the number of storage nodes. As those numbers grow, encoding costs rise, which makes the system less decentralized. Another challenge is that when a storage node goes offline and needs to be replaced, the data cannot simply be copied over as it can in fully replicated systems. Instead, all other storage nodes must send their slices to the replacement node, which then reconstructs the missing data. This can transfer an amount of data on the order of the whole file across the network (O(|blob|)), and frequent recovery operations can eat into the storage savings gained by reducing replication.
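To make the "any sufficient subset of slices" property concrete, here is a toy Reed-Solomon-style sketch. It is a teaching example under stated assumptions, not the code any storage network runs: it works over the small prime field GF(257) instead of the binary fields production libraries typically use, it stores one symbol per slice, and the names `rs_encode`, `rs_decode`, and `lagrange_eval` are ours.

```python
P = 257  # small prime modulus (assumption for this toy); byte values 0..255 fit


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points)
    that passes through the given (x_i, y_i) points, modulo P."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+).
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def rs_encode(data, n):
    """Encode k data symbols into n >= k slices (x, y).
    The first k slices carry the data itself (systematic encoding)."""
    k = len(data)
    if not k <= n <= P:
        raise ValueError("need len(data) <= n <= P")
    points = list(enumerate(data))
    return [(x, lagrange_eval(points, x)) for x in range(n)]


def rs_decode(slices, k):
    """Recover the k data symbols from any k distinct slices."""
    if len(slices) < k:
        raise ValueError("not enough slices to decode")
    points = slices[:k]
    return [lagrange_eval(points, x) for x in range(k)]


if __name__ == "__main__":
    data = list(b"walrus")                    # k = 6 symbols
    slices = rs_encode(data, 10)              # n = 10 slices
    survivors = [slices[i] for i in (9, 2, 7, 4, 5, 8)]  # any 6 of the 10
    print(bytes(rs_decode(survivors, len(data))))
```

Note that both encoding and decoding here cost a quadratic number of field multiplications per symbol, which hints at why the interpolation step becomes expensive as the number of slices grows.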

Storage Challenges

No matter which replication protocol is used, all current decentralized storage systems face two additional key challenges:

  1. Continuous verification is necessary to ensure that storage nodes are retaining the data and not discarding it. This is critical for open decentralized systems that offer payment for storage, but the current approach limits scalability because each file requires its own individual verification challenge.
  2. Coordination among storage nodes is required: Nodes need to know who is participating in the system, which files have had storage fees paid, how to incentivize participation, and how to handle verification challenges and prevent abuse. For this reason, many of these systems have implemented custom blockchains to process transactions and introduced cryptocurrencies beyond the basic storage protocol.
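The scalability problem in the first point can be illustrated with a naive per-file challenge. This is a generic sketch, not the scheme of any particular network: the verifier is assumed to hold a reference copy of every file (real systems avoid this), and the names `respond` and `audit` are hypothetical.

```python
import hashlib
import secrets


def respond(data: bytes, nonce: bytes) -> bytes:
    """Answer a challenge: only someone holding `data` can compute this."""
    return hashlib.sha256(nonce + data).digest()


def audit(node_store: dict, reference: dict):
    """Challenge the node once per file.

    Returns (number of challenges issued, names of files that failed).
    Assumption: the verifier keeps `reference`, a full copy of each file.
    """
    challenges = 0
    failed = []
    for name, data in reference.items():
        nonce = secrets.token_bytes(16)  # fresh nonce prevents replayed answers
        challenges += 1
        held = node_store.get(name)
        if held is None or respond(held, nonce) != respond(data, nonce):
            failed.append(name)
    return challenges, failed


if __name__ == "__main__":
    reference = {"a.txt": b"alpha", "b.txt": b"beta", "c.txt": b"gamma"}
    node_store = {"a.txt": b"alpha", "c.txt": b"gamma"}  # node dropped b.txt
    print(audit(node_store, reference))
```

The number of challenges equals the number of files, so the verification work grows linearly with the amount of stored data.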

Core Innovation

How does Walrus address the challenges of decentralized storage with a fresh solution?

In short:
Walrus uses an advanced erasure coding technology that efficiently encodes unstructured data blocks into smaller fragments, which are then distributed across a network of storage nodes. Even if up to two-thirds of these fragments are lost, the original data can still be quickly reconstructed from the remaining fragments. This is achieved with a replication factor of just 4 to 5 times, similar to current cloud services but with the added benefits of decentralization and increased fault tolerance.

In detail:
Walrus introduces RedStuff, a novel 2D coding algorithm designed for Byzantine Fault Tolerance (BFT). RedStuff is based on fountain codes and combines fast operations with high reliability.
RedStuff encodes data into primary and secondary fragments using simple operations (primarily XOR). These fragments are distributed across storage nodes, with each node holding a unique combination. RedStuff uses a different threshold for each encoding dimension. For the primary dimension, it uses a recovery threshold of f+1, which enables asynchronous writes, since only 2f+1 signatures are required to confirm that the data block is available. This results in a 3x replication factor.

The secondary dimension uses a recovery threshold of 2f+1. This design enables asynchronous storage proofs for the first time while adding only 1.5x of extra replication. The total replication factor is less than 5x. What’s more, thanks to the 2D encoding, the bandwidth needed to recover lost slices is proportional to the amount of data lost.
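The replication figures above can be reproduced with a little arithmetic. The sketch assumes a committee of n = 3f + 1 storage nodes (the usual BFT setting, not stated explicitly in this article) and assumes that each dimension expands the data by n divided by its recovery threshold. The function name `coding_overhead` is ours.

```python
from fractions import Fraction


def coding_overhead(f: int):
    """Return (primary, secondary, total) expansion factors.

    Assumptions: n = 3f + 1 nodes; a dimension with recovery
    threshold t expands the data by a factor of n / t.
    """
    n = 3 * f + 1
    primary = Fraction(n, f + 1)        # threshold f + 1   -> approaches 3
    secondary = Fraction(n, 2 * f + 1)  # threshold 2f + 1  -> approaches 1.5
    return primary, secondary, primary + secondary


if __name__ == "__main__":
    for f in (1, 10, 33, 333):
        primary, secondary, total = coding_overhead(f)
        print(f, float(primary), float(secondary), float(total))
```

Under these assumptions the total approaches 4.5 from below as f grows, which is consistent with a total replication factor of less than 5x.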

RedStuff’s benefits include faster encoding and decoding than RS coding, thanks to its simple XOR operations. Because storage overhead is low, the system can scale to hundreds of nodes, and it offers high resilience and fault tolerance, so data can be recovered even in the presence of Byzantine faults.

As a permissionless protocol, Walrus is equipped with an efficient committee reconfiguration protocol to cope with the natural loss of storage nodes and ensure continuous availability of data. When a new committee replaces the current committee between epochs, the reconfiguration protocol ensures that all data blocks that have exceeded the point of availability (PoA) remain available. RedStuff’s 2D encoding makes state migration more efficient, and even if some nodes are unavailable, other nodes can recover lost slices.


Figure: Nodes 1 and 3 assist Node 4 in recovering slice data.
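The intuition behind cheap recovery in a 2D layout can be shown with a deliberately simplified toy. This is not RedStuff: it uses plain XOR parity over the rows and columns of a small grid, and the names `encode_2d` and `recover_cell` are ours. It only illustrates that a lost symbol can be rebuilt from one row (or one column) instead of the whole blob.

```python
from functools import reduce
from operator import xor


def encode_2d(grid):
    """Return (row_parity, col_parity): the XOR of each row and each column."""
    row_parity = [reduce(xor, row) for row in grid]
    col_parity = [reduce(xor, col) for col in zip(*grid)]
    return row_parity, col_parity


def recover_cell(grid, row_parity, i, j):
    """Rebuild grid[i][j] from the other symbols in row i plus that row's parity.

    Only the symbols of a single row are read, not the whole grid.
    """
    others = [value for col, value in enumerate(grid[i]) if col != j]
    return reduce(xor, others, row_parity[i])


if __name__ == "__main__":
    grid = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
    row_parity, col_parity = encode_2d(grid)
    grid[1][2] = None                      # simulate a lost symbol
    print(recover_cell(grid, row_parity, 1, 2))
```

The same recovery works along a column using `col_parity`, so a lost symbol has two independent repair paths.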

Walrus has introduced an asynchronous challenge protocol to verify whether storage nodes are correctly holding data. This protocol allows for efficient proof of storage, ensuring data availability without relying on network assumptions, and the costs scale logarithmically with the number of stored files.
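To get a feel for what logarithmic scaling buys, the sketch below compares a cost that grows linearly with the number of files against one that grows with its base-2 logarithm. The unit costs and the choice of base 2 are assumptions made for illustration; the article does not give the constants of the actual protocol. The function name `compare_costs` is ours.

```python
def compare_costs(num_files: int):
    """Return (linear_cost, logarithmic_cost) in abstract units.

    linear_cost: one unit per file, as with per-file challenges.
    logarithmic_cost: ceil(log2(num_files)) units (illustrative constant).
    """
    if num_files < 1:
        raise ValueError("num_files must be positive")
    linear_cost = num_files
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using exact integers.
    logarithmic_cost = (num_files - 1).bit_length()
    return linear_cost, logarithmic_cost


if __name__ == "__main__":
    for n in (1_000, 1_000_000, 1_000_000_000):
        print(n, compare_costs(n))
```

Going from a thousand files to a billion multiplies the linear cost by a million, while the logarithmic cost only triples.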

Walrus’s economic model is built around staking and includes both rewards and penalties. Because the cost of its proof-of-storage mechanism scales logarithmically with the number of files, proving that data is stored becomes significantly cheaper.

In short, Walrus, with its RedStuff protocol at the core, offers a decentralized storage solution that is scalable, highly resilient, and cost-effective, delivering strong authenticity, integrity, auditability, and availability at an affordable price.

All of this is supported by Sui, which acts as the control layer for Walrus. With a scalable, programmable, and secure infrastructure as its coordination layer, Walrus can focus on solving the key challenges of decentralized storage.

Potential Airdrop

Walrus plans to introduce its own token, WAL, which will be used for staking, governance, and other utilities. How can you get in on the WAL airdrop? Based on how AO tokens were distributed, holding SUI might be one of the ways to qualify.
Walrus is expected to launch its testnet soon, though the mainnet launch date is still to be determined. In the meantime, you can check the official documentation to learn how to deploy your own website using Walrus.

Sources:
Walrus Whitepaper: https://docs.walrus.site/walrus.pdf
Walrus: Decentralized storage and DA protocol, capable of building L2 and large-scale storage on Sui: https://foresightnews.pro/article/detail/63040
Mysten Labs Researcher X thread: https://x.com/LefKok/status/1836868240666153293

Disclaimer:

  1. This article is reprinted from [ForesightNews]. Forwarded from the original title: “Decoding Walrus: Sui’s New Take on Decentralized Storage”. All copyrights belong to the original author [Alex Liu, Foresight News]. If there are objections to this reprint, please contact the Gate Learn team, and they will handle it promptly.
  2. Liability Disclaimer: The views and opinions expressed in this article are solely those of the author and do not constitute any investment advice.
  3. Translations of the article into other languages are done by the Gate Learn team. Unless mentioned, copying, distributing, or plagiarizing the translated articles is prohibited.