Grass: A Decentralized Data Network for AI

Intermediate
12/9/2024, 8:31:47 AM
This article details the technical architecture of Grass, including the roles of Validators, Routers, Nodes, and the ZK Processor, and the importance of the Grass Data Ledger. It also covers how Grass Nodes operate, the reputation scoring system, and the ways token holders can participate in the network. Through its incentive structure and security mechanisms, Grass aims to give users a fair and open AI data layer while keeping data transparent and secure.

Original title: Grass: Monetizing Internet Resources Through Decentralized Data Sharing

Grass’s Positioning and Usage Scenarios

Grass is a decentralized data aggregation network built on the Solana blockchain, leveraging AI, DePIN, and Solana technologies to establish a foundational data layer for AI. By tapping into unused internet bandwidth, Grass empowers companies and nonprofits to train AI models while redefining the monetization of internet resources. Through a browser extension, users can share idle bandwidth in exchange for Grass Points rewards. This incentivized system aims to democratize the value of the internet, ensuring that users benefit directly while retaining control. With over 2 million active nodes, Grass has already facilitated the collection of substantial datasets for AI training.

Technical Architecture

Grass’s infrastructure is supported by a specialized Data Rollup on Solana, designed to manage the complete lifecycle of data — sourcing, processing, validation, and dataset construction. The architecture revolves around the following components:

Validator: Validators receive, verify, and batch web transactions from Routers. They generate zero-knowledge (ZK) proofs to verify session data, which are stored on-chain for reference. Grass is transitioning from a centralized Validator model to a decentralized committee structure.

Router: Routers connect Grass Nodes with Validators, managing bandwidth routing and reporting metrics such as throughput, latency, and node activity. Routers incentivize node operators based on the validated bandwidth relayed.

Grass Nodes: Grass Nodes utilize users’ idle bandwidth to relay public web data (excluding personal data). Running a node is free, and operators are rewarded based on the volume of data relayed through their nodes.

ZK Processor: The ZK Processor aggregates session data proofs for web requests and submits them to Solana’s Layer 1 blockchain. This ensures every network action is transparently recorded, enabling full traceability of AI training data.

Grass Data Ledger: This immutable repository stores collected datasets and their corresponding ZK proofs. By linking datasets with their provenance metadata, the ledger ensures data integrity and lineage throughout the lifecycle.

Edge Embedding Models: These models process unstructured web data into structured formats suitable for AI. Tasks include data cleaning, normalization, and transformation to meet AI requirements.

Technical Features

Grass functions as an intermediary between clients and web servers. Clients submit web requests, which Validators process and route through Grass Nodes. Regardless of the website requested, the server responds, allowing its data to be collected, cleaned, and prepared for use in training AI models.

This process relies on two critical components: the Grass Data Ledger and the ZK Processor.

Grass Data Ledger acts as the permanent storage for all datasets collected by Grass. Each dataset is embedded with metadata to document its origin from the point of collection. Metadata proofs are stored on Solana’s settlement layer, while the data itself resides in the ledger.

The ZK Processor helps record the origin of every dataset scraped on the Grass network. When a node (i.e., a user with the Grass extension installed) sends a web request to a website, the website returns an encrypted response containing the requested data. This is the moment the dataset comes into existence, and it is the point of origin at which metadata must be recorded. The metadata includes fields such as session keys, the URL of the scraped website, the IP address of the target website, the transaction timestamp, and the data itself. With these details and a clearly documented source for each dataset, AI models can be trained on data whose provenance is known.
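The metadata fields listed above could be modeled roughly as follows. This is a minimal sketch, not Grass's actual schema: the `SessionMetadata` class, its field names, and the use of SHA-256 to fingerprint the response are all assumptions made for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SessionMetadata:
    """Hypothetical provenance record for one web request (field names are assumed)."""
    session_key: str      # identifier for the session that carried the request
    url: str              # website that was scraped
    target_ip: str        # IP address of the target web server
    timestamp: int        # Unix time of the request
    response_sha256: str  # fingerprint of the returned data


def record_session(session_key: str, url: str, target_ip: str,
                   timestamp: int, response: bytes) -> SessionMetadata:
    """Build the metadata record at the moment the response arrives."""
    digest = hashlib.sha256(response).hexdigest()
    return SessionMetadata(session_key, url, target_ip, timestamp, digest)


def metadata_digest(meta: SessionMetadata) -> str:
    """Deterministic hash of the record, usable as input to a later proof step."""
    canonical = json.dumps(asdict(meta), sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Storing a fingerprint of the response alongside the URL, IP address, and timestamp means any later change to the dataset would produce a different digest, which is the property a provenance record needs.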

The ZK Processor also allows data that requires on-chain settlement to remain hidden from Solana validators. In addition, the volume of web requests Grass expects to execute will exceed the L1’s throughput capacity: Grass plans to scale to tens of millions of web requests per minute, and the metadata from each request must be settled on-chain. Unless the ZK Processor first generates proofs and batches them, submitting these transactions to the L1 would be impossible. Rollups are therefore the only viable way to meet these goals.
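The batching idea can be illustrated with a simpler stand-in. The sketch below does not generate zero-knowledge proofs. It uses a plain Merkle root to show how many per-request metadata records can be folded into a single fixed-size commitment, so that one on-chain submission covers a whole batch. The function names and the batch size are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """SHA-256 helper returning raw bytes."""
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a batch of per-request metadata blobs into one 32-byte commitment."""
    if not leaves:
        raise ValueError("cannot commit to an empty batch")
    level = [_h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def batch_commitments(records: list[bytes], batch_size: int) -> list[bytes]:
    """Split records into batches and return one root per batch."""
    return [merkle_root(records[i:i + batch_size])
            for i in range(0, len(records), batch_size)]
```

With a batch size of 1,000, a run of 10,000 request records produces only 10 commitments, which illustrates how batching reduces the number of L1 submissions by orders of magnitude.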

In addition to recording the source website, Grass tracks the network path through which data is routed. Each Grass Node contributing bandwidth to web scraping is rewarded proportionally to its contributions. Higher contributions in terms of data volume or value lead to greater rewards. This approach incentivizes participation, especially in high-demand regions, expanding network capacity and data coverage. As the network grows, Grass’s data repository expands, creating more opportunities to provide diverse datasets for AI labs and further driving network growth.
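Proportional rewards of this kind can be sketched as a simple pro-rata split. The function below is illustrative only: the article does not give Grass's actual reward formula, and the names `split_rewards`, `pool`, and `relayed_bytes` are assumptions. It weights nodes by data volume alone and ignores the value, reputation, and regional-demand factors mentioned elsewhere.

```python
def split_rewards(pool: float, relayed_bytes: dict[str, int]) -> dict[str, float]:
    """Divide a reward pool among nodes in proportion to the data each relayed."""
    total = sum(relayed_bytes.values())
    if total == 0:
        # Nothing was relayed, so nobody earns anything.
        return {node: 0.0 for node in relayed_bytes}
    return {node: pool * amount / total for node, amount in relayed_bytes.items()}
```

For example, if node A relays three times as much data as node B, A receives three quarters of the pool and B receives one quarter.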

Grass Node Operation and Security

Grass nodes operate free of charge, serving as gateways that connect the network to the broader internet. Node operators (i.e., application users) are rewarded based on the volume of traffic relayed through their nodes. Their rewards also depend on reputation scores and regional demand for network traffic.

Grass nodes have two primary functions: relaying the traffic (i.e., web requests) that clients request and Validators direct, and returning encrypted web server responses to the designated Routers.

The figure above shows the systems that the node supports. Running a node is simple: create an account, download the Grass desktop application, and connect to the network.

Once connected, nodes automatically register on the network. Operators maintain network uptime, enabling nodes to forward requests to public web servers. Each request sent to a Grass Node is encrypted into a data packet, which only provides routing instructions to its destination. Network requests are authenticated via digital signatures from all involved parties, ensuring their legitimacy and verifying whether the request should be forwarded to the target web server. This encryption process prevents data tampering and allows Validators to accurately measure node reputation.

A node's reputation score is based mainly on the following factors:

Data Integrity: Evaluates whether the data is complete and whether the dataset contains all the data points required for the intended use case.

Consistency: Checks data consistency across different datasets or within the same dataset over time.

Timeliness: Measures whether data is up to date when needed.

Availability: Evaluates the uptime and reliability of node performance.
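One way to combine these four factors is a weighted average. The article does not say how Grass actually weights or computes them, so the equal weights, the 0-to-1 scale, and the names below are purely hypothetical.

```python
# Hypothetical equal weights; the real weighting is not given in the article.
WEIGHTS = {
    "integrity": 0.25,
    "consistency": 0.25,
    "timeliness": 0.25,
    "availability": 0.25,
}


def reputation_score(metrics: dict[str, float],
                     weights: dict[str, float] = WEIGHTS) -> float:
    """Weighted average of per-factor scores, each expected in the range [0, 1]."""
    for name in weights:
        value = metrics[name]
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return sum(weights[name] * metrics[name] for name in weights)
```

Under these assumed weights, a node with perfect integrity and availability but only half marks for consistency and timeliness would score 0.75.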

Grass ensures that user devices are not accessed or monitored and does not track user activity. Grass routes only public internet traffic through users’ IP addresses, with no access to personal data. All aggregated data originates solely from public web resources.

To protect users, Grass encrypts bandwidth and collaborates with AppEsteem, a leading cybersecurity compliance auditing firm. AppEsteem monitors Grass products 24/7 for vulnerabilities, leaks, backdoors, or malware. Achieving AppEsteem certification, which is highly respected in the industry, ensures Grass is whitelisted by top anti-malware programs, including Avast, Microsoft Defender, McAfee, and AVG.

Grass Token Functionality

Grass token holders can participate in the network via:

Transactions and Buybacks: After decentralization, Grass tokens will support network scraping transactions, dataset purchases, and Live Context Retrieval (LCR).

Staking and Rewards: Grass tokens can be staked in Routers to facilitate network traffic and earn rewards for contributing to network security.

Governance: Token holders can propose and vote on network improvements, partnerships, and incentive mechanisms for stakeholders.

According to Dune, Grass offers an annual staking yield of approximately 45%, with about 33% of Grass tokens (over 26 million) currently staked.

Router Staking and Rewards

Routers act as decentralized hubs connecting all network nodes and managing incoming and outgoing Validator web requests. Router operators are incentivized, with rewards proportional to the staked amount delegated to each Router. All traffic relayed through Routers is encrypted and metered to ensure security and performance.

The chart above shows current staking levels for the various Routers. Users can stake Grass tokens with the Router of their choice to earn rewards; each Router charges its own commission rate.

For example, DBunker has approximately 1.43 million Grass tokens staked, with a minimum staking period of 7 days and a commission rate of 10% (Data source: https://www.grassfoundation.io/stake/delegations). To stake with a Router, users visit the Grass website, click “STAKE,” and connect their wallets; they then begin earning Router staking rewards.
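As a rough illustration of how a commission rate affects a delegator's return, the sketch below combines the approximately 45% annual yield and the 10% commission cited in this article. It assumes simple (non-compounding) rewards, treats the quoted yield as gross, and deducts the commission from rewards rather than from the staked principal. None of these details is confirmed by the article, and `net_staking_reward` is a hypothetical helper.

```python
def net_staking_reward(stake: float, annual_yield: float,
                       commission: float, days: int) -> float:
    """Estimate rewards kept by a delegator after the Router's commission.

    Assumes simple interest and a commission charged on rewards only.
    """
    gross = stake * annual_yield * days / 365
    return gross * (1 - commission)
```

Under these assumptions, staking 1,000 tokens for a full year at a 45% yield earns 450 tokens gross, of which 405 remain after a 10% commission.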

Conclusion

Grass aims to build a fair, open, and decentralized data layer that addresses the ethical and quality challenges of internet data extraction and counters monopolistic control of data by large corporations. Its technical architecture, built around the Grass Data Rollup, records metadata about the origin of every dataset. ZK proofs of that metadata, stored on the L1 settlement layer, provide transparency, while node operators are rewarded in proportion to their contributions, which drives network expansion.

Positioned at the intersection of crypto and AI, Grass stands out as a decentralized alternative to centralized AI data providers. With its innovative architecture and strong focus on Web3, Grass offers promising growth potential as it builds a just and open data layer for AI protocols and companies.

Disclaimer:

  1. This article is reproduced from [Medium] under the original title “Grass: Monetizing Internet Resources Through Decentralized Data Sharing.” The copyright belongs to the original author [EbunkerChinese]. If you object to the reprint, please contact the Gate Learn team, and it will be handled promptly according to the relevant procedures.
  2. Disclaimer: The views and opinions expressed in this article represent only the author’s personal views and do not constitute any investment advice.
  3. The Gate Learn team translated the article into other languages. Copying, distributing, or plagiarizing the translated articles is prohibited unless otherwise stated.
