The relationship between AI and Crypto has ebbed and flowed. Ever since AlphaGo defeated professional human Go players in 2016, the crypto world has seen attempts to combine the two, such as the spontaneous emergence of projects like Fetch.AI. With the advent of GPT-4 in 2023, the AI + Crypto trend resurged, exemplified by the issuance of Worldcoin. Humanity seems poised to enter a utopian era in which AI is responsible for productivity and Crypto handles distribution.
This sentiment peaked after OpenAI launched Sora, its text-to-video model. However, emotions often carry irrational elements, and Li Yizhou, for instance, seems to belong to that misunderstood segment of the hype.
This article focuses on what Crypto can bring to AI, since current Crypto projects that emphasize AI applications are mostly marketing gimmicks and add little to the discussion.
For a long time, the focal point in discussions about AI has been whether the “emergence” of artificial intelligence will lead to the creation of sentient beings resembling those in “The Matrix” or a silicon-based civilization. Concerns about the interaction between humans and AI technologies have persisted, with recent examples such as the advent of Sora and earlier instances like GPT-4 (2023), AlphaGo (2016), and IBM’s Deep Blue defeating a chess world champion in 1997.
While such concerns have not materialized, let’s relax and briefly outline the mechanism behind AI.
Let’s start with linear regression, which is essentially a simple linear equation. Take the weight loss of Jia Ling, the well-known Chinese actress and director, as an example: we can generalize it as y = wx + b, where x is calorie intake and y is body weight. Eating more naturally leads to gaining weight, so if you want to lose weight, you should eat less.
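To make the analogy concrete, here is a minimal sketch of fitting such a straight line with ordinary least squares in Python; the calorie and weight numbers are invented purely for illustration and are not real measurements.

```python
import numpy as np

# Hypothetical daily calorie intake (kcal) and body weight (kg); numbers are invented.
calories = np.array([1600, 1800, 2000, 2200, 2400, 2600], dtype=float)
weight   = np.array([  55,   58,   62,   66,   71,   75], dtype=float)

# Fit y = w * x + b by ordinary least squares.
w, b = np.polyfit(calories, weight, deg=1)

print(f"weight ~ {w:.4f} * calories + {b:.2f}")
print("predicted weight at 1500 kcal/day:", round(w * 1500 + b, 1), "kg")
```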
However, this approach brings some issues. Firstly, there are physiological limits to human height and weight; we are unlikely to encounter 3-meter giants or thousand-kilogram ladies, so considering situations beyond these limits has no practical significance. Secondly, simply eating less and exercising more does not follow the science of weight loss and, in severe cases, can harm the body.
We can introduce the Body Mass Index (BMI), which captures the relationship between weight and height by dividing weight by the square of height. If we then use three factors (eating, sleeping, and exercising) to assess the relationship between height and weight, we need three inputs and two outputs. Linear regression is clearly no longer sufficient, which is where neural networks come in. As the name suggests, a neural network mimics the structure of the human brain, on the premise that more thinking leads to more rational conclusions. Increasing the frequency and depth of that thinking by stacking more layers is what we call deep learning (a somewhat loose analogy), enabling more thorough consideration before taking action.
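In the same loose spirit, the sketch below wires up a tiny neural network with three inputs (eating, sleeping, exercising) and two outputs (height, weight). The layer size is arbitrary and the weights are random and untrained, so it only illustrates the shape of the computation, not an actual model.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0, x)

# Three inputs (eating, sleeping, exercising), one hidden layer, two outputs (height, weight).
W1, b1 = rng.normal(size=(3, 8)), np.zeros(8)   # input -> hidden
W2, b2 = rng.normal(size=(8, 2)), np.zeros(2)   # hidden -> output

def forward(x):
    """One 'thought' per layer: stacking more layers is the 'deep' in deep learning."""
    h = relu(x @ W1 + b1)
    return h @ W2 + b2

x = np.array([2000.0, 7.5, 1.0])   # hypothetical: kcal/day, hours of sleep, hours of exercise
print(forward(x))                  # untrained output; training would fit W1, W2 to real data
```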
Brief Overview of the Development History of AI Algorithms
However, the number of layers cannot increase indefinitely; there is still a ceiling, and past a critical threshold effectiveness may even decline. It therefore becomes essential to understand the relationships within the existing information in a more sensible way: for example, a deeper grasp of the more nuanced relationship between height and weight, the discovery of previously unnoticed factors, or Jia Ling finding a top coach but hesitating to state her desire to lose weight directly.
In such a scenario, Jia Ling and the coach become the two ends of an encoding-and-decoding exchange, passing back and forth “meanings” that stand in for each party’s true intentions. Unlike the straightforward statement “I want to lose weight, here’s a gift for the coach,” the true intentions of both sides remain hidden behind the “meaning.”
We notice one fact: if the two parties go through enough rounds of exchange, the meaning behind each communication becomes easier to decipher.
If we extend this model, it represents what is colloquially known as a Large Language Model (LLM), examining contextual relationships between words and sentences. Currently, large models have expanded to delve into scenarios such as images and videos.
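As a toy illustration of “examining contextual relationships,” the following sketch computes scaled dot-product attention, the core scoring step inside a Transformer, over a few random token vectors. Real LLMs add learned query/key/value projections, many heads, and many layers; the embeddings here are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(1)
tokens = ["I", "want", "to", "lose", "weight"]
d = 16
X = rng.normal(size=(len(tokens), d))          # stand-in embeddings; real models learn these

# Scaled dot-product attention, with X reused as queries, keys, and values (no learned projections).
scores = X @ X.T / np.sqrt(d)
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)  # softmax: how strongly each token attends to the others
context = weights @ X                          # each token's new, context-aware representation

print(np.round(weights, 2))                    # row i = how token i weighs every other token
```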
Across the spectrum of AI, everything from simple linear regression to the extremely complex Transformer is an algorithm or a model. Beyond these, there are two other essential factors: computing power and data.
Description: Brief development history of AI, Source: https://ourworldindata.org/brief-history-of-ai
Simply put, AI is a machine that processes data, performs computations, and produces results. However, compared to physical entities like robots, AI is more virtual. In terms of computing power, data, and models, the current operational process in Web2 commercialization is as follows:
Work Process of AI
As mentioned earlier, AI applications have a wide range of domains, such as the code correction Vitalik mentioned, which has already been put into use. Looking from a different perspective, Crypto’s contribution to AI primarily focuses on non-technical areas, such as decentralized data markets, decentralized computing power platforms, etc. There have been some experiments with decentralized Large Language Models (LLMs). However, it’s crucial to note that analyzing Crypto code with AI and running large-scale AI models on the blockchain are fundamentally different. Incorporating some Crypto elements into AI models can hardly be considered a perfect integration.
Currently, Crypto excels in production and incentives. It is unnecessary to forcefully change AI’s production paradigm with Crypto; the rational choice is to integrate Crypto into AI workflows and empower AI with Crypto. Here are some potential integrations I have summarized:
These four aspects are the potential scenarios in which I think Crypto can empower AI. AI is a versatile tool, and the areas and projects of AI for Crypto are not discussed further here; you can explore them on your own.
It can be observed that Crypto currently plays a role mainly in encryption, privacy protection, and economic design; the only truly technical integration attempt is zkML. Let’s brainstorm a bit: if Solana’s TPS can truly reach 100,000+ in the future, and if the combination of Filecoin and Solana works perfectly, could we create an on-chain LLM environment? That could establish a real on-chain AI and alter the current lopsided relationship in which Crypto is merely bolted onto AI.
As everyone knows, the NVIDIA RTX 4090 is a prized commodity that is currently hard to obtain in a certain East Asian country. What’s worse, individuals, small companies, and academic institutions also face a graphics card shortage; after all, large commercial companies are the biggest spenders. If a third path could be opened beyond personal purchases and cloud providers, it would clearly have practical business value beyond pure speculation. The right logic for Web3-for-AI should be: “without Web3, the project cannot be sustained.”
AI Workflow from the Web3 Perspective
Data Sources: Grass and the DePIN Automotive Ecosystem
Grass, introduced by Wynd Network, is a marketplace for selling idle bandwidth. Rather than simply collecting and selling data, Grass serves as an open network for data acquisition and distribution, with functions for cleaning and validating data in an increasingly closed-off internet. Beyond that, Grass aims to interface directly with AI models and provide them with readily usable datasets, since AI datasets require professional handling, including extensive manual fine-tuning, to meet the specific needs of AI models.
Expanding on this, Grass addresses data sales, while Web3’s DePIN sector can produce the data AI needs, primarily around autonomous driving. Traditionally, autonomous driving required companies to accumulate the corresponding data themselves, but projects like DIMO and Hivemapper run directly on vehicles, collecting an ever-growing amount of driving information and road data.
In earlier autonomous-driving scenarios, vehicle-recognition technology and high-precision maps were essential. Data such as high-precision maps has long been accumulated by companies like NavInfo, creating industry barriers; if newcomers leverage Web3 data, they may have an opportunity to leapfrog the incumbents.
Data Preprocessing: Liberating Humans Enslaved by AI
Artificial intelligence can be split into two parts: manual annotation and intelligent algorithms. Manual annotation, the lowest point on the value curve, is handled by workers in third-world regions such as Kenya and the Philippines, while AI preprocessing companies in Europe and the United States take the lion’s share of the income before selling the results on to AI research and development enterprises.
With the advancement of AI, more companies are eyeing this business, and competition keeps pushing the unit price of data annotation down. The work mainly involves labeling data, much like solving captchas: there is no technical barrier to entry, and prices can fall to ultra-low levels such as 0.01 RMB.
Source: https://aim.baidu.com/product/0793f1f1-f1cb-4f9f-b3a7-ef31335bd7f0
In this scenario, Web3 data annotation platforms such as Public AI have a real business market: they connect AI enterprises with annotation workers, replacing a race-to-the-bottom pricing model with an incentive system. It’s essential to note, however, that mature companies like Scale AI can guarantee reliable annotation quality, so for decentralized annotation platforms, quality control and abuse prevention are absolute necessities. Essentially this is a C2B2B enterprise service, and sheer data scale and quantity alone will not convince enterprises.
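As a sketch of the kind of quality control such a platform needs, one common approach is to assign each item to several annotators and weight payouts by agreement with the consensus label. The scheme below is hypothetical and not Public AI’s actual mechanism; the labels and reward pool are made up.

```python
from collections import Counter

# Hypothetical redundant labels: item_id -> {annotator: label}.
labels = {
    "img_001": {"alice": "cat", "bob": "cat", "carol": "dog"},
    "img_002": {"alice": "dog", "bob": "dog", "carol": "dog"},
    "img_003": {"alice": "cat", "bob": "dog", "carol": "dog"},
}

def consensus(votes):
    """Majority label per item; a real platform would also use gold questions and reputation."""
    return Counter(votes.values()).most_common(1)[0][0]

agreement, total = Counter(), Counter()
for item, votes in labels.items():
    truth = consensus(votes)
    for worker, label in votes.items():
        total[worker] += 1
        agreement[worker] += (label == truth)

# Pay each worker from a fixed reward pool in proportion to their agreement rate.
pool = 100.0
scores = {w: agreement[w] / total[w] for w in total}
payouts = {w: pool * s / sum(scores.values()) for w, s in scores.items()}
print(payouts)
```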
Hardware Freedom: Render Network and Bittensor
It should be clarified that, unlike Bitcoin mining rigs, there is currently no dedicated Web3 AI hardware. Existing compute platforms are built from mature hardware with a Crypto incentive layer added on top, which essentially places them in the DePIN sector; but since they differ from data-source projects, they are treated here as part of the AI workflow.
For the definition of DePIN, please refer to the article I wrote before: DePIN before Helium: Exploring Bitcoin, Arweave and STEPN
Render Network is a long-established project that wasn’t initially designed for AI. It began operating in 2017 with a focus on rendering, as its name suggests. GPUs weren’t in high demand back then, but a market opportunity gradually emerged: in the GPU market, especially the high-end segment monopolized by NVIDIA, exorbitant prices kept rendering, AI, and metaverse users out. If a channel could be built between supply and demand, a sharing-economy model, much like shared bicycles, might have a chance to take hold.
Moreover, GPU resources don’t require physically transferring the hardware; they can be allocated through software. It’s worth mentioning that Render Network abandoned Polygon and moved to the Solana ecosystem in 2023, a decision made before Solana’s resurgence that has proven correct over time: for GPU usage and distribution, a high-speed network is a crucial requirement.
If Render Network can be considered an established project, Bittensor is currently gaining momentum.
Bittensor is built with Polkadot’s Substrate framework, with the goal of training AI models through economic incentives. Nodes compete to train models with minimal error or maximal efficiency, resembling the classic on-chain competition of Crypto projects. However, the actual training still requires NVIDIA GPUs and traditional platforms, making it similar to competition platforms like Kaggle.
zkML and UBI: Worldcoin’s Dual Aspects
Zero-knowledge machine learning (zkML) introduces zk technology into AI model training and inference to address issues such as data leaks, privacy failures, and model verification. The first two are easy to understand: models can still be trained on zk-protected data without leaking personal or private information.
Model verification refers to evaluating closed-source models. With zk technology, a target value can be set, allowing closed-source models to prove their capabilities through result verification without disclosing the calculation process.
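To make that commit-prove-verify flow concrete, here is a heavily simplified sketch of the interface. The proof and verification steps below are stubs standing in for a real zero-knowledge proving system; only the commitment is real, and a genuine zkML verifier would check a succinct proof against that commitment without ever seeing the weights or re-running the model.

```python
import hashlib
import numpy as np

# --- Model owner (prover) side -------------------------------------------------
weights = np.array([0.2, 0.5, 0.3])                          # secret, closed-source model (toy linear scorer)
commitment = hashlib.sha256(weights.tobytes()).hexdigest()   # published once; binds the owner to this model

def run_model(w, x):
    return float(w @ x)

def generate_proof(w, x, y):
    """Stub. In real zkML, a proving system emits a succinct zero-knowledge proof
    that y = f(w, x) AND hash(w) == commitment, without revealing w."""
    return {"input": x.tolist(), "output": y}                # placeholder object only

# --- Verifier side -------------------------------------------------------------
def verify(commitment, x, y, proof):
    """Stub. A real verifier checks the zk proof against the public commitment,
    input, and output; it never sees the weights and never re-runs the model."""
    return proof["output"] == y                              # placeholder check only

x = np.array([1.0, 2.0, 3.0])
y = run_model(weights, x)
proof = generate_proof(weights, x, y)
print("commitment:", commitment[:16], "| output:", y, "| accepted:", verify(commitment, x, y, proof))
```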
Worldcoin not only envisioned zkML early on but also advocates for Universal Basic Income (UBI). In its vision, AI productivity will eventually far exceed the limits of human demand, so the real challenge becomes distributing AI’s benefits fairly; its UBI concept is to share them globally through the $WLD token, with real-person biometric verification required to keep the distribution fair.
Of course, zkML and UBI are still in the early experimental stages, but they are intriguing developments that I will continue to follow closely.
The development of AI, represented by the Transformer and Large Language Models (LLMs), is gradually hitting bottlenecks, just as linear regression and neural networks did before it. It isn’t feasible to increase model parameters or data volume indefinitely, as the marginal returns will diminish.
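A rough way to see the diminishing returns: empirical scaling-law studies suggest loss falls roughly as a power law in parameter count, so each 10x increase in parameters buys a smaller absolute improvement. The constant and exponent below are illustrative placeholders, not measurements of any particular model.

```python
# Illustrative power-law scaling: loss(N) ~ a * N ** (-alpha).
# a and alpha are made-up placeholders chosen only to show the shape of the curve.
a, alpha = 17.5, 0.08

prev = None
for n_params in [1e8, 1e9, 1e10, 1e11, 1e12]:
    loss = a * n_params ** (-alpha)
    gain = "" if prev is None else f"  (improvement over 10x fewer params: {prev - loss:.3f})"
    print(f"{n_params:>8.0e} params -> loss {loss:.3f}{gain}")
    prev = loss
```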
AI might be the seeded player that eventually emerges with true intelligence, but for now the hallucination problem is severe. One could say that the belief that Crypto can change AI is itself a form of confidence, and a textbook hallucination. While adding Crypto may not technically solve the hallucination problem, it can at least change some things from the perspective of fairness and transparency.