Hack VC Partner: 8 Real Advantages of AI+Crypto

Author: Ed Roman, Managing Partner of Hack VC

Compiled by: 1912212.eth, Foresight News

AI + Crypto is one of the frontier areas attracting the most attention in the cryptocurrency market recently, spanning decentralized AI training, GPU DePINs, and censorship-resistant AI models.

Behind these dazzling developments, we can't help but ask: is this a real technological breakthrough, or just riding the hype wave? This article cuts through the noise: it analyzes the intersection of crypto and AI, discusses the real challenges and opportunities, and reveals which promises are hollow and which are practical.

Scenario #1: Decentralized AI Training

The problem with on-chain AI training is that training requires high-speed communication and coordination between GPUs, because neural networks rely on backpropagation during training. Nvidia has two innovations for this (NVLink and InfiniBand). These technologies make GPU communication extremely fast, but they are local-only: they apply to GPU clusters within a single data center (with 50+ gigabit-per-second links).

Introduce a decentralized network, and speed suddenly drops by several orders of magnitude due to network latency and reduced bandwidth. Compared with the throughput Nvidia's high-speed interconnects deliver inside a data center, this makes decentralized networks a non-starter for AI training use cases.
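To make the orders-of-magnitude gap concrete, here is a rough back-of-the-envelope sketch (the model size and bandwidth figures are illustrative assumptions, not benchmarks):

```python
# Rough estimate of gradient-sync time per training step for a data-parallel job.
# All figures below are illustrative assumptions, not measurements.

model_params = 70e9            # assume a 70B-parameter model
bytes_per_param = 2            # fp16 gradients
grad_bytes = model_params * bytes_per_param

datacenter_bw = 50e9 / 8       # 50 Gbit/s NVLink/InfiniBand-class link, in bytes/s
internet_bw = 100e6 / 8        # optimistic 100 Mbit/s home uplink, in bytes/s

print(f"gradient payload: {grad_bytes / 1e9:.0f} GB per sync")
print(f"in-datacenter sync: ~{grad_bytes / datacenter_bw:.0f} s")
print(f"over the internet:  ~{grad_bytes / internet_bw / 3600:.1f} h")
```

Under these assumptions, a single gradient synchronization that takes tens of seconds inside a data center stretches to hours over consumer internet links, before even counting latency.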

Please note that the following innovations may bring hope for the future:

  • Large-scale distributed training over InfiniBand: NVIDIA itself is supporting distributed, non-local training over InfiniBand through the NVIDIA Collective Communications Library. However, this is still in its infancy, so adoption metrics remain to be seen. The bottleneck imposed by the physics of distance still exists, so local training over InfiniBand remains significantly faster.
  • New research on decentralized training that reduces the communication-synchronization burden, which may make decentralized training more practical in the future.
  • Intelligent sharding and scheduling of model training can help improve performance. Likewise, new model architectures may be designed specifically for future distributed infrastructure (Gensyn is researching both areas).

The data side of training is also challenging. Any AI training process involves handling vast amounts of data. Typically, models are trained on centralized, secure data storage systems with high scalability and performance. This requires transferring and processing terabytes of data, and it is not a one-time cycle. Data is usually noisy and error-ridden, so before training the model it must be cleaned and converted into usable formats. This stage involves the repetitive work of standardization, filtering, and handling missing values. All of this is a serious challenge in a decentralized environment.
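As a minimal illustration of the standardization, filtering, and missing-value handling described above (the column names and cleaning rules are hypothetical):

```python
import pandas as pd

# Hypothetical raw training data with typical noise: missing values,
# duplicates, and pathological outliers.
df = pd.DataFrame({
    "text":  ["Hello", None, "Hello", "wörld", "spam" * 500],
    "label": [1, 0, 1, 1, 0],
})

df = df.dropna(subset=["text"])                   # drop rows with missing text
df = df.drop_duplicates(subset=["text"])          # remove exact duplicates
df = df[df["text"].str.len() < 1000]              # filter outlier-length rows
df["text"] = df["text"].str.strip().str.lower()   # normalize to a standard form

print(df)
```

Trivial on one machine, but every one of these passes must be re-run each time the data changes, which is exactly what is hard to coordinate across a decentralized network.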

Training is also iterative, which sits poorly with Web3. OpenAI went through thousands of iterations to achieve its results. The most basic workflow for a data scientist on an AI team includes defining goals, preparing data, and analyzing and structuring the data to extract important insights and make it suitable for modeling. Then a machine learning model is developed to solve the defined problem, and its performance is validated on a test dataset. The process is iterative: if the current model does not perform as expected, the expert returns to the data collection or model training phase to improve the results. Now imagine running this process in a decentralized environment, where the state-of-the-art frameworks and tools do not readily carry over to Web3.

Another problem with on-chain training is that, compared with inference, it is a far less interesting market. For now, training large AI language models consumes enormous GPU compute. But in the long run, inference will become the dominant use of GPUs. Consider: how many large language models need to be trained to satisfy global demand, versus how many customers will use those models? Which number is greater?

Scenario #2: Reaching Consensus through Excessively Redundant AI Inference

Another challenge at the intersection of crypto and AI is verifying the accuracy of AI inference: you cannot fully trust a single centralized party to perform the inference, and there is the potential risk of nodes misbehaving. This challenge does not exist in Web2 AI, because there is no decentralized consensus system to satisfy.

The proposed solution is redundant computation: multiple nodes repeat the same AI inference operation, so the system can operate trustlessly and avoid a single point of failure.
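A minimal sketch of how such redundant verification could work, assuming deterministic inference, with a majority vote over result digests (the node query function below is a hypothetical stand-in):

```python
import hashlib
from collections import Counter

def run_inference(node_id: int, prompt: str) -> str:
    # Hypothetical stand-in for querying one network node; a dishonest
    # node could return anything here (node 2 forges its result).
    return f"answer-to:{prompt}" if node_id != 2 else "forged result"

def digest(output: str) -> str:
    return hashlib.sha256(output.encode()).hexdigest()

def redundant_inference(prompt: str, nodes: list[int]) -> str:
    results = {n: run_inference(n, prompt) for n in nodes}
    votes = Counter(digest(r) for r in results.values())
    winning_digest, count = votes.most_common(1)[0]
    assert count > len(nodes) // 2, "no honest majority"
    return next(r for r in results.values() if digest(r) == winning_digest)

print(redundant_inference("2+2?", nodes=[1, 2, 3]))  # honest majority wins
```

Note that every node in the quorum performs the full inference, which is precisely where the cost problem below comes from.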

However, the problem with this approach is that high-end AI chips are extremely scarce. Multi-year wait times for high-end NVIDIA chips have driven prices up, and if you require the same AI inference to be executed redundantly across multiple nodes, those already-high costs multiply. That is not viable for many projects.

Scenario #3: Near-Term Web3-Specific AI Use Cases

Some people suggest that Web3 should have its own unique AI use cases built specifically for Web3 customers. This could be a Web3 protocol that uses AI to risk-score DeFi pools, a Web3 wallet that suggests new protocols to users based on their wallet history, or a Web3 game that uses AI to control non-player characters (NPCs).

At present (in the short term), this is an emerging market where use cases are still being explored. Some challenges include:

  • Because market demand is still nascent, Web3-native use cases require relatively few AI transactions today.
  • The customer base is smaller: Web3 customers are orders of magnitude fewer than Web2 customers, so the addressable market is far smaller.
  • The customers themselves are less stable, since they are early-stage companies with limited funding, and some will disappear over time. A Web3 AI service provider serving Web3 customers will likely have to keep re-acquiring part of its customer base to replace those that vanish, making it extremely challenging to scale the business.

In the long run, we are very optimistic about Web3-native AI use cases, especially as AI agents become more common. We envision a future in which every Web3 user has a swarm of AI agents helping them complete tasks.

Scenario #4: Consumer-grade GPU DePIN

Many decentralized AI compute networks rely on consumer-grade GPUs rather than data centers. Consumer GPUs are fine for low-end AI inference tasks, or for consumer use cases where latency, throughput, and reliability are flexible. But for serious enterprise use cases (which make up the majority of the market that matters), customers need a network that is more reliable than home machines, and more complex inference tasks typically demand higher-end GPUs. Data centers are better suited to these more valuable customer use cases.

Note that we believe consumer-grade GPUs are suitable for demos, and for individuals and startups that can tolerate lower reliability. But those customers are lower-value, so we believe DePINs tailored to Web2 enterprises will be more valuable in the long run. Accordingly, GPU DePIN projects have evolved from mostly consumer-grade hardware in their early days toward offering A100/H100-class chips with cluster-level availability.

Reality - Practical Use Cases of Cryptocurrency x AI

Now let's discuss the use cases that can provide real benefits. These are the real victories, where cryptocurrency x AI can add significant value.

Real Benefit #1: Serving Web2 Customers

McKinsey estimates that, across the 63 use cases it analyzed, generative AI could add the equivalent of $2.6 trillion to $4.4 trillion in annual value (for comparison, the UK's entire 2021 GDP was $3.1 trillion). That would increase the impact of all artificial intelligence by 15% to 40%. And if we include the effect of embedding generative AI into software currently used for tasks beyond those use cases, the estimate roughly doubles.

If you do the math on the above estimates, the total addressable market for all AI worldwide (generative AI and beyond) could reach into the trillions of dollars. By comparison, the total value of all cryptocurrencies today, Bitcoin and every altcoin included, is only around $2.7 trillion. So let's face it: the vast majority of customers who need AI in the short term will be Web2 customers; the Web3 customers who genuinely need AI are only a small slice of that $2.7 trillion market (especially considering that BTC accounts for much of it, and Bitcoin itself does not need or use AI).
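Back-of-the-envelope, the arithmetic looks roughly like this (illustrative only, using the McKinsey figures above):

```python
# McKinsey (2023): generative AI could add $2.6T-$4.4T per year across 63 use
# cases, roughly doubling if embedded into software used for other tasks.
gen_ai_low, gen_ai_high = 2.6e12, 4.4e12
total_low, total_high = 2 * gen_ai_low, 2 * gen_ai_high  # "roughly double"

crypto_market_cap = 2.7e12  # total crypto market cap at the time of writing

print(f"gen-AI impact incl. embedding: ${total_low/1e12:.1f}T-${total_high/1e12:.1f}T per year")
print(f"total crypto market cap:       ${crypto_market_cap/1e12:.1f}T")
```

Even the low end of the AI estimate dwarfs the entire crypto market cap, which is the point: the demand is overwhelmingly on the Web2 side.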

Web3 AI use cases are just beginning, and it is currently unclear how large the market size is. But one thing is certain - in the foreseeable future, it only accounts for a small part of the Web2 market. We believe that Web3 AI still has a bright future, but this only means that the most powerful application of Web3 AI at present is serving Web2 customers.

Examples of Web2 clients who can benefit from Web3 AI include:

  • Vertical-specific software companies built from the ground up around AI (such as Cedar.ai or Observe.ai)
  • Large enterprises (such as Netflix) that fine-tune models for their own purposes
  • Fast-growing AI providers (such as Anthropic)
  • Software companies weaving AI into existing products (such as Canva)

This is a relatively stable customer profile: these customers are typically large and valuable, they are unlikely to go out of business anytime soon, and they represent a huge pool of potential customers for AI services. Web3 AI services that serve Web2 customers will benefit from this stable customer base.

But why would a Web2 customer want to use a Web3 stack? The rest of this article makes that case.

Real Benefit #2: Reduce GPU Usage Costs through GPU DePIN

GPU DePINs aggregate underutilized GPU compute (primarily from data centers) and make it available for AI inference. A good analogy is "Airbnb for GPUs."

The reason we are excited about GPU DePINs is, as noted above, the shortage of NVIDIA chips: there are wasted GPU cycles sitting idle today that could serve AI inference. The hardware owners have already sunk the cost and are underutilizing their equipment, so this spare GPU capacity can be offered far more cheaply than the status quo, because for the hardware owners it is effectively "found money."

Examples include:

  • AWS machines. If you want to rent an H100 from AWS today, you must commit to a one-year lease because market supply is tight. That produces waste: you are unlikely to use the GPU every day of the year, or even every week.
  • Filecoin mining hardware. Filecoin is a network with a large subsidized supply but little real demand. Filecoin never found true product-market fit, so Filecoin miners risk going out of business. Their machines carry GPUs that can be repurposed for low-end AI inference tasks.
  • ETH mining hardware. When Ethereum transitioned from PoW to PoS, a large amount of hardware was immediately freed up that can be repurposed for AI inference.

Note that not all GPU hardware is suitable for AI inference. One obvious reason is that older GPUs lack the GPU memory that LLMs require, though there have been interesting innovations that help here. Exabits, for example, has technology that loads active neurons into GPU memory and inactive neurons into CPU memory, predicting which neurons need to be active or inactive. This lets low-end GPUs handle AI workloads even with limited GPU memory, effectively making low-end GPUs more useful for AI inference.
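A toy sketch of the hot/cold-neuron idea in PyTorch (our own illustration of the general technique, not Exabits' actual implementation): keep the rows of a weight matrix predicted to be active on the GPU, and leave the rest in CPU memory.

```python
import torch

gpu = torch.device("cuda" if torch.cuda.is_available() else "cpu")

hidden, n_neurons, n_hot = 512, 4096, 1024
W = torch.randn(n_neurons, hidden)          # full weight matrix stays in CPU RAM

# Assume a predictor has flagged which neurons are likely active (random here).
hot_ids = torch.randperm(n_neurons)[:n_hot].to(gpu)
W_hot = W[hot_ids.cpu()].to(gpu)            # only the hot rows occupy GPU memory

def forward(x: torch.Tensor) -> torch.Tensor:
    """Activations for hot neurons computed on GPU; cold neurons default to 0."""
    out = torch.zeros(x.shape[0], n_neurons, device=gpu)
    out[:, hot_ids] = x.to(gpu) @ W_hot.T   # dense compute only for hot rows
    # A real system would fetch mispredicted cold rows from CPU on demand.
    return out

print(forward(torch.randn(2, hidden)).shape)  # torch.Size([2, 4096])
```

The point of the design is that GPU memory only has to hold the hot subset of weights, at the cost of occasional CPU fetches when the predictor is wrong.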

Web3 AI DePINs will need to mature their offerings over time and provide enterprise-grade services such as single sign-on, SOC 2 compliance, service-level agreements (SLAs), and so on, similar to the services today's cloud providers offer Web2 customers.

Real Benefit #3: Censorship-Resistant Models to Avoid OpenAI's Self-Censorship

There has been a lot of discussion about AI censorship. Turkey, for example, temporarily banned OpenAI (and later reversed course once OpenAI improved its compliance). We believe this kind of country-level censorship is fundamentally uninteresting, because countries will need to embrace AI to stay competitive.

OpenAI also self-censors. For example, OpenAI will not handle NSFW content, nor will it predict the next presidential election. We believe there are AI use cases that are not only interesting but also command a huge market, ones that OpenAI will not touch for political reasons.

Open source is a great solution here, because a Github repository answers to no shareholders or board. Venice.ai is one example: it promises to protect privacy and operate in a censorship-resistant way. Web3 AI can take this to the next level by serving inference for these open-source software (OSS) models on low-cost GPU clusters. For these reasons, we believe OSS + Web3 is the ideal combination for paving the way to censorship-resistant AI.
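As a concrete illustration, serving an OSS model requires only the open weights and standard tooling. The snippet below uses the Hugging Face transformers pipeline with a small placeholder model; any open-weights checkpoint (Llama, Mistral, etc.) can be swapped in, hardware permitting:

```python
from transformers import pipeline

# "gpt2" is a small placeholder so the sketch runs on modest hardware;
# substitute any open-weights model from the Hugging Face hub.
generator = pipeline("text-generation", model="gpt2")

result = generator("Censorship-resistant AI means", max_new_tokens=40)
print(result[0]["generated_text"])
```

Because the weights are public, no central party can revoke access or dictate what the model may be asked.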

Real Benefit #4: Avoid Sending Personally Identifiable Information to OpenAI

Large enterprises have privacy concerns about their internal data. For these customers, it can be very hard to trust a third party like OpenAI with that data.

In Web3, the idea of their internal data suddenly living on a decentralized network may seem (on the surface) even more worrisome to these enterprises. However, there are innovations in privacy-enhancing technologies for AI:

  • Trusted execution environments (TEEs), such as Super Protocol
  • Fully homomorphic encryption (FHE), such as Fhenix.io (a Hack VC portfolio company) or Inco Network (supported by Zama.ai), as well as Bagel's PPML

These technologies are still maturing, and performance keeps improving with the upcoming zero-knowledge (ZK) and FHE ASICs. The long-term goal is to protect enterprise data while fine-tuning models. As these protocols emerge, Web3 may become a more attractive venue for privacy-preserving AI computation.
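To give a flavor of computing on encrypted data, here is a minimal sketch using Paillier additively homomorphic encryption via the open-source `phe` library. Paillier is a far simpler relative of the FHE schemes named above, but it illustrates the core property: an untrusted server can compute on ciphertexts without ever seeing the plaintexts.

```python
from phe import paillier  # pip install phe

public_key, private_key = paillier.generate_paillier_keypair()

# The enterprise encrypts its private values before sending them anywhere.
enc_a = public_key.encrypt(42)
enc_b = public_key.encrypt(8)

# An untrusted server can add ciphertexts and scale them by plaintext
# constants without access to the private key.
enc_sum = enc_a + enc_b
enc_scaled = enc_a * 3

# Only the key holder can decrypt the results.
print(private_key.decrypt(enc_sum))     # 50
print(private_key.decrypt(enc_scaled))  # 126
```

Full FHE extends this idea to arbitrary computation (including model inference), which is what makes it so costly today and why dedicated ASICs matter.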

Real Benefit #5: Harnessing the latest innovations of the open source model

Over the past few decades, open-source software has steadily eaten into proprietary software's market share. We view LLMs as a form of proprietary software that is ripe for OSS disruption. Notable challengers include Llama, RWKV, and Mistral.ai, and this list will undoubtedly keep growing over time (a more comprehensive list is available on Openrouter.ai). By leveraging Web3 AI (powered by OSS models), people can take full advantage of these new innovations.

We believe that over time, an open-source global developer community combined with crypto incentives can drive rapid innovation in open-source models, as well as in the agents and frameworks built on top of them. An example of an AI agent protocol is Theoriq, which uses OSS models to create a composable network of AI agents that can be assembled into higher-level AI solutions.

Our confidence here comes from history: most "developer software" innovation has, over time, been overtaken by OSS. Microsoft was once the archetypal proprietary software company, and today it is the top contributor to GitHub; there is a reason for that. And if you look at how Databricks, PostgreSQL, MongoDB, and others have disrupted proprietary databases, there is compelling precedent for OSS disrupting an entire industry.

However, there is a catch. One of the challenges for open-source LLMs is that OpenAI has started signing paid data-licensing agreements with organizations such as Reddit and The New York Times. If this trend continues, open-source LLMs may find it harder to compete because of the financial barrier to acquiring data. Nvidia may double down on confidential computing as an enabler of secure data sharing. Time will tell how this unfolds.

Real Benefit #6: Reaching Consensus through Random Sampling with Heavy Slashing, or through ZK Proofs

One of the challenges of Web3 AI inference is verification. Given that validators have an incentive to cheat on their results to earn fees, validating inference results is an important safeguard. Note that such cheating has not actually happened yet, because AI inference is still in its infancy, but it is inevitable unless measures are taken to disincentivize it.

The standard Web3 approach is for multiple validators to repeat the same operation and compare the results. As mentioned earlier, the glaring problem is that AI inference is very expensive thanks to the current shortage of high-end Nvidia chips. Given that Web3 can offer lower-cost inference via underutilized GPU DePINs, redundant computation would gut Web3's value proposition.

A more promising solution is to perform off-chain AI inference with ZK proofs. Here, a succinct ZK proof can be verified to determine whether a model was trained correctly, or whether an inference was run correctly (known as zkML). Examples include Modulus Labs and ZKonduit. Because ZK operations are computationally intensive, the performance of these solutions is still early. However, we expect this to improve as ZK hardware ASICs arrive.

Even more promising is a sampling-based "optimistic" AI inference approach. In this model, only a small fraction of the results validators produce is actually verified, but the slashing penalty is set high enough that a validator caught cheating loses far more than it could ever gain, creating a strong economic deterrent. This way, you avoid paying for redundant computation.
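A back-of-the-envelope view of why the deterrent works (all parameters are illustrative):

```python
# Expected value of cheating for a validator under optimistic sampling:
# with probability p the result is sampled and checked, and a caught cheater
# loses their slashed stake; otherwise they pocket the fee without doing work.
p_checked   = 0.05     # fraction of results that get verified (illustrative)
fee_saved   = 1.0      # value gained by skipping the real inference
stake_slash = 100.0    # stake lost if caught

ev_cheat = (1 - p_checked) * fee_saved - p_checked * stake_slash
print(f"expected value of cheating: {ev_cheat:+.2f}")  # -4.05: a losing bet
```

As long as the slashed stake times the sampling rate exceeds the fee saved by cheating, rational validators stay honest while the network only pays for verifying a small sample.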

Another promising idea is watermarking and fingerprinting, such as the solution proposed by Bagel Network. This is similar to the mechanism Amazon Alexa uses to assure the quality of the AI models running on its millions of devices.

Real Benefit #7: Cost Savings through OSS (OpenAI's Margins)

The next opportunity Web3 brings to AI is cost democratization. So far we have discussed saving on GPU costs through DePINs. But Web3 also offers the chance to save the profit margins charged by centralized Web2 AI services (such as OpenAI, which at the time of writing had annual revenue of over $1 billion). These savings come from using OSS models instead of proprietary ones, an extra layer of savings because the model creator is not trying to turn a profit.

Many OSS models will remain completely free, which gives customers the best economics. But some OSS models may also try the monetization methods discussed below. Consider that only 4% of all models on Hugging Face are trained by companies with budgets to subsidize them; the remaining 96% are trained by the community. That group (96% of Hugging Face) bears real underlying costs (including compute and data costs), so those models will need to be monetized somehow.

There are several proposals for tokenizing open-source models. One of the most interesting is the concept of an "Initial Model Issuance": tokenize the model itself, reserve a portion of the tokens for the team, and direct some of the model's future revenue to token holders, though there are certainly legal and regulatory hurdles here.
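A minimal sketch of the revenue-sharing mechanics such a tokenized model could encode (the supply and balances are entirely hypothetical):

```python
# Pro-rata distribution of a model's inference revenue to token holders,
# with a team reserve: a toy model of the "Initial Model Issuance" idea.
total_supply = 1_000_000
holders = {"team_reserve": 200_000, "alice": 500_000, "bob": 300_000}

def distribute(revenue: float) -> dict[str, float]:
    """Split revenue across holders in proportion to token balances."""
    return {h: revenue * bal / total_supply for h, bal in holders.items()}

print(distribute(10_000.0))
# {'team_reserve': 2000.0, 'alice': 5000.0, 'bob': 3000.0}
```

In practice this logic would live in a smart contract, which is where the legal and regulatory questions mentioned above come in.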

Other OSS models may attempt to monetize through similar mechanisms. Note that if this comes to pass, OSS models may start to resemble their Web2 profit-making counterparts more and more. Realistically, the market will split in two, with some models remaining completely free.

Real Benefit #8: Decentralized Data Source

One of AI's biggest challenges is sourcing the right data to train models. We mentioned earlier the challenges of decentralized AI training. But what about using a decentralized network to source data (which can then be used for training elsewhere, even in traditional Web2 venues)?

This is exactly what startups like Grass are doing. Grass is a decentralized network of "data scrapers": individuals contribute their machines' idle computing power to source data for training AI models. Hypothetically, at scale, this kind of data sourcing can outperform any single company's in-house efforts, thanks to the sheer force of a large incentivized node network. This means not just sourcing more data, but sourcing it more frequently so it stays relevant and up to date. It is also practically impossible to stop this decentralized scraping swarm, since the scrapers are inherently distributed and do not sit behind a single IP address. Grass also has a network that cleans and normalizes the data so it is useful once scraped.
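A toy version of what one such data-scraping node might do (the URL and cleaning rules below are placeholders, not Grass's actual pipeline): fetch a page, strip it down to clean text, and emit a normalized record.

```python
import json
import re

import requests

def fetch_and_clean(url: str) -> dict:
    """Fetch a page and reduce it to a normalized training-data record."""
    html = requests.get(url, timeout=10).text
    text = re.sub(r"<[^>]+>", " ", html)       # crude HTML tag stripping
    text = re.sub(r"\s+", " ", text).strip()   # collapse whitespace
    return {"url": url, "text": text[:1000]}   # truncate for the sketch

record = fetch_and_clean("https://example.com")
print(json.dumps(record)[:200])
```

Multiply this by thousands of incentivized nodes, each on its own residential IP, and you get the hard-to-block, always-fresh sourcing described above.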

Once the data is obtained, you also need somewhere to store it on-chain, along with the LLMs generated from that data.

Note that the role of data in Web3 AI may change in the future. Today, the status quo for LLMs is to pre-train a model on data and refine it over time with more data. But because data on the Internet changes constantly, these models are always slightly out of date, so LLM responses are slightly inaccurate.

The future may bring a new paradigm: "real-time" data. The idea is that when an LLM is asked a question, fresh data, collected from the Internet in real time, can be injected into it via the prompt. That way, the LLM works with the most current data. Grass is researching this as well.
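A minimal sketch of this "real-time data" pattern, with the data source and the LLM call as hypothetical stand-ins: fetch fresh context at question time and inject it into the prompt.

```python
import requests

def llm(prompt: str) -> str:
    # Stand-in for any LLM inference call (a local OSS model, an API, etc.).
    return f"[model response conditioned on {len(prompt)} chars of prompt]"

def fetch_live_context(query: str) -> str:
    # Hypothetical stand-in for a real-time data network; here we just pull
    # a public page. A real system would retrieve freshly crawled data
    # relevant to `query`.
    return requests.get("https://example.com", timeout=10).text[:500]

def answer_with_live_data(query: str) -> str:
    context = fetch_live_context(query)
    prompt = (
        "Use the following up-to-the-minute context to answer.\n"
        f"Context: {context}\n\nQuestion: {query}\nAnswer:"
    )
    return llm(prompt)

print(answer_with_live_data("What changed on this page today?"))
```

The model itself stays frozen; freshness comes entirely from what is injected into the prompt at query time.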

Special thanks to the following individuals for their feedback and assistance on this article: Albert Castellana, Jasper Zhang, Vassilis Tziokas, Bidhan Roy, Rezo, Vincent Weisser, Shashank Yadav, Ali Husain, Nukri Basharuli, Emad Mostaque, David Minarsch, Tommy Shaughnessy, Michael Heinrich, Keccak Wong, Marc Weinstein, Phillip Bonello, Jeff Amico, Ejaaz Ahamadeen, Evan Feng, JW Wang.
