Dialogue | 79 foundation models launched in three months: what kind of large model does China need?

Text: Wu Junyu Editor: Xie Lirong

Source: Caijing Eleven

Image source: Generated by Unbounded AI

Since ChatGPT was released at the end of last year, Chinese companies have released at least 79 foundation models. Most of them, however, are seen by outside observers as technically behind ChatGPT. With the commercialization of large models imminent, what kind of large model does China need?

At the end of November 2022, OpenAI, an AI start-up backed by Microsoft, launched the conversational AI ChatGPT. ChatGPT is built on the GPT large language model developed in-house by OpenAI, which contains nearly 180 billion parameters. In February this year, Nvidia CEO Jensen Huang (Huang Renxun) commented that "ChatGPT ushered in the iPhone moment for AI". Huang believes that large models are lowering the barrier to application development, and that every application is worth rebuilding with large models.

Huang is not alone in this view; everyone sees the opportunity. Since March this year, Chinese companies have been racing to release large-model products. They include leading companies, with Baidu's Wenxin model, Alibaba's Tongyi model, and Tencent's industry model; sector specialists such as iFlytek (Xunfei) and SenseTime; and a number of start-ups. In May, the Institute of Scientific and Technical Information of China, under the Ministry of Science and Technology, released the "Research Report on China's Artificial Intelligence Large Model Map". According to the report, as of May 28 at least 79 foundation models with more than 1 billion parameters had been released in China.

The number of parameters in a model matters. Leading companies such as Baidu and Alibaba state that the Wenxin and Tongyi models have parameter counts on the order of 100 billion; the Wenxin large model, for example, has 260 billion parameters. The large models of other enterprises and start-ups usually have parameter counts on the order of 10 billion or 1 billion.

**Although the AI models currently on the market are all called "large models", the parameter count is generally treated as one of the factors that separates large models from small ones.** Hou Zhenyu, vice president of Baidu Group, told the Caijing reporter that in 2022 a model with 1 billion parameters was called a large model, but today's large models often have hundreds of billions of parameters. The "intelligent emergence" effect appears above 100 billion parameters, giving the model generalization ability and general-purpose capability across scenarios. Models fine-tuned from such a large model perform better in industrial applications.

**The "intelligent emergence" effect means that once model scale and computing power exceed a certain parameter threshold, the AI's results are no longer a matter of random chance.** In general domains, the more parameters a model has, the more likely intelligence is to emerge and the higher the AI's accuracy. In specialized vertical domains, accurate results are easier to obtain after a large-parameter model has been pruned and optimized.

Although at least 79 large models have appeared in China, many industry professionals interviewed by Caijing believe that large models require accumulated computing power, algorithms, and data. Given the shortage of high-performance GPU chips, high hardware procurement costs, and high operating costs, very few companies in China have the capital reserves, strategic will, and practical capability to carry large models through to commercialization. In the "Hundred Models War", there is indeed a gap between most products and ChatGPT.

After the uproar, large-model mania is slowly coming back to reality. More rational thinking is emerging in large-model markets at home and abroad: a ChatGPT that cannot be commercialized can only be a toy, while a large model that can become an enterprise application has industrial value.

Companies such as Apple, Samsung, and JPMorgan Chase have banned employees from using ChatGPT due to security concerns. Meanwhile, ChatGPT's user growth and retention have hit a bottleneck. According to data from the website analytics tool SimilarWeb, ChatGPT's traffic growth rates from January to May were 131.6%, 62.5%, 55.8%, 12.6%, and 2.8%. In early June, a Morgan Stanley survey showed that only 19% of respondents said they had used ChatGPT, and only 4% said they relied on it.
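To see what the traffic figures imply taken together, the monthly rates can be compounded. The sketch below assumes the SimilarWeb percentages are month-over-month growth rates, which the article does not state explicitly; the function name `cumulative_multiple` is illustrative only.

```python
from math import prod

# Monthly traffic growth rates for January to May, as quoted from SimilarWeb.
# Assumption: each figure is growth relative to the previous month.
monthly_growth_pct = [131.6, 62.5, 55.8, 12.6, 2.8]


def cumulative_multiple(growth_pcts):
    """Compound a series of percentage growth rates into one overall multiple."""
    return prod(1 + pct / 100 for pct in growth_pcts)


if __name__ == "__main__":
    multiple = cumulative_multiple(monthly_growth_pct)
    print(f"Traffic in May relative to December: {multiple:.2f}x")
```

Under that assumption, traffic still grew several-fold over the five months, but nearly all of the increase came in the first three; April and May together add only about 16%.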

Hou Zhenyu said, "In March of this year, when customers first started talking to us about their large-model needs, they were all using their imaginations, and their requests leaned toward science fiction. But after April, as the limitations of large models became apparent, everyone slowed down, and more practical needs gradually came into view." Under the influence of both subjective and objective factors, foundation models worldwide are mainly oriented toward the To B industry market.

**Commercialization of large models on the consumer (To C) side is slow.** Providers currently face high computing costs, and the more users they have, the greater their losses. Outputting erroneous "noise" is also unavoidable, and there are also ethical and regulatory challenges such as information leakage and policy supervision. Even Microsoft deploys large models only in tool products (office suites, web browsers, and photo-editing tools such as Photoshop). Microsoft selling services to tool companies is, in essence, still To B commercialization.

**Deploying large models for enterprise (To B) customers is the pragmatic approach.** In industry markets, customer demand is strong and clear. Around the world, retail, finance, manufacturing, government, and other sectors are relying on large models for intelligent upgrades. The industry consensus is that a large model fine-tuned with industry knowledge will perform better than an unoptimized general-purpose large model.

According to data released by market research firm IDC in May this year, China's artificial intelligence market totaled US$12.2 billion in 2022, including US$8.13 billion for hardware, US$2.69 billion for software, and US$1.41 billion for services. IDC predicts that in 2026 China's artificial intelligence market will reach US$26.9 billion, including US$14.85 billion for hardware, US$7.69 billion for software, and US$3.89 billion for services. The compound annual growth rates of hardware, software, and services are 15.1%, 32.0%, and 28.5% respectively.
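For readers who want to check growth figures like these, a compound annual growth rate follows from a start value, an end value, and a number of years. The sketch below applies the standard formula to the 2022 and 2026 figures quoted above. The four-year span is an assumption: the article does not say which base year IDC's quoted rates use, so the results need not match them exactly.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Market sizes in billions of US dollars, as quoted from IDC in the article.
segments = {
    "hardware": (8.13, 14.85),
    "software": (2.69, 7.69),
    "services": (1.41, 3.89),
}

# Assumption: growth is measured over the four years from 2022 to 2026.
YEARS = 4

if __name__ == "__main__":
    for name, (size_2022, size_2026) in segments.items():
        print(f"{name}: {cagr(size_2022, size_2026, YEARS):.1%}")
```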

Mania always returns to reality. In June, Caijing held a dialogue with Hou Zhenyu, vice president of Baidu Group, and Zhu Yong, vice president of Baidu Smart Cloud, on the theme "What kind of model does China really need?" Hou Zhenyu and Zhu Yong were deeply involved in building the Baidu Wenxin Qianfan large-model platform and shaping its commercial ecosystem. The conversation covered three major questions: Is the large model a luxury game? What kind of large model do enterprises need? Is there a bubble in the large-model market?

Interlocutor Profile:

Hou Zhenyu, Vice President of Baidu Group (in charge of the cloud computing product and R&D team and the basic technology engineering team of Baidu Smart Cloud Business Group)

Zhu Yong, Vice President of Baidu Smart Cloud (in charge of Baidu Smart Cloud Application Product Center)

Host: Xie Lirong, Deputy Editor-in-Chief of Caijing Magazine

The following is a condensed version of the dialogue record:

**Is the large model a luxury game?**

**"Caijing" Xie Lirong: China has seen a wave of large-model start-ups. The barrier to entry for large models is said to be very high, yet judging by the speed and scale at which companies are entering the Chinese market, that does not seem to be the case?**

**Zhu Yong:** The barrier to entry for large models is relative, and there are different types of players. The first category, like Baidu, builds a foundation model from scratch. This places very high demands on computing power, algorithms, data, and talent.

Take data: a foundation model needs to be trained on massive amounts of data, including Internet data, specialist-domain data, news data, and high-quality professionally labeled data. Take computing power: a large model with hundreds of billions of parameters, such as ChatGPT, needs to be trained continuously for 100 days on NVIDIA's highest-end A100/H100 GPUs. Algorithms and talent are also key. Engineers have different training methods, just as different chefs produce dishes that taste different from the same ingredients. This takes long-term practical experience, so the barrier is very high.

The second category is industry large models, which require some fine-tuning and targeted customization on top of a foundation model's capabilities. This costs far less than labeling data and tuning algorithms from scratch, as was done in the past. The third category develops applications on top of the first two kinds of large model. Baidu, other companies, and even some open-source platforms provide development tools that lower the barrier to software development.

**"Caijing" Xie Lirong: Where do China's large models stand in the global market?**

**Hou Zhenyu:** Personally, I think Chinese large models are still at the forefront globally. Developing large models is actually similar to developing search engines: both require very deep technical accumulation. Worldwide, only a few countries have independently developed search engine technology. At present, China and the United States may be the only two countries that can develop large-model technology entirely independently.

**"Caijing" Xie Lirong: Is there an absolute sense in which large models are advanced or backward?**

**Hou Zhenyu:** Large models are not absolutely good or bad. They may differ somewhat across fields, but choosing one is like choosing a smartphone: some people use Apple, some use Android, and the one that suits you best is the best. When large models first launched, people often asked them tricky questions. But in a truly serious enterprise environment, such scenarios are rare. Enterprises need to choose the large model that best fits their business scenarios. Chinese companies in particular need products that understand Chinese better and suit the characteristics of Chinese companies.

**"Caijing" Xie Lirong: How much has Baidu invested in large models, in resources and talent?**

**Hou Zhenyu:** AI large models are Baidu's core strategy, which requires sustained, comprehensive, high-intensity investment. Take computing power: the GPUs we have accumulated over the years number in the tens of thousands, which is a huge investment. Baidu has also developed a complete tool chain over the years to train models faster and better.

In the past 10 years, Baidu has invested more than 100 billion yuan in AI. As a technology company, Baidu spends more than 20% of its revenue on R&D every year. (Note: Since 2019, Baidu Core's R&D spending has long exceeded 20% of revenue. In 2022, Baidu's R&D expense ratio was 24%, second only to Huawei's 25% among Chinese technology companies. Baidu Core refers to Baidu's own business excluding iQiyi.) But a large model is not as simple as investing a sum of money and producing a model. It requires computing power, data, and experienced AI engineers, accumulated over a long time on a good R&D platform.

**"Caijing" Xie Lirong: Besides money, GPU cards, and data, what challenges does a start-up face in building a foundation model?**

**Hou Zhenyu:** Money, GPU cards, and data are very challenging in themselves. For a start-up to build a foundation model, it needs not only a minimum level of computing power, sufficient high-quality data, and experienced AI R&D staff, but also an AI development platform that can manage models and computing power well. At present, large companies offer these platforms externally as cloud services. For example, Baidu Smart Cloud does so through the Wenxin Qianfan large-model platform. Even so, the barrier to training a foundation model from scratch remains very high, because training a large model once is not enough. It also requires continuous, agile iteration, and large companies are relatively more mature at this.

**"Caijing" Xie Lirong: Some companies are starting to build their own large models. Is it necessary to build one yourself? When the public cloud was just emerging in 2014, some customers worried about data security. Do they have the same worry when using large models?**

**Hou Zhenyu:** Every company needs to use large models, but does every enterprise need to build one itself? I don't think so. Building a foundation model from scratch is very expensive. Enterprises can fine-tune someone else's foundation model with their own data and still achieve very good results.

**Zhu Yong:** I think enterprises should think more about how to use large models and how to use them well. Every business can have its own model, but there is no need to start from scratch, because companies like Baidu already provide a good technical base. You can build customized products on top of Baidu's, which is the more cost-effective choice for customers. Data security is not a new problem created by large models. By analogy with cloud computing, there are public cloud, private cloud, hosted cloud, and so on. In our large-model business, we have fully considered the corresponding products and solutions.

**"Caijing" Xie Lirong: Smartphones and cloud computing became widespread because prices came down. When will China's large models reach the stage of widespread adoption?**

**Hou Zhenyu:** The large model itself brings substantial cost savings. In the past, when enterprises developed AI applications, they had to do data cleaning, labeling, model training, inference, and optimization for each application scenario. No matter how small the scenario, the whole process had to be carried out, at very high cost. With a large model as the base, that much data, time, resources, and manpower is no longer needed. I suggest that enterprises pay attention to and adopt large-model technology as soon as possible, because it can greatly lower the barrier to AI applications.

**What large models do Chinese companies need?**

**"Caijing" Xie Lirong: Baidu's Wenxin large model began internal testing in March. During the test, were enterprises able to state their needs clearly? Where are those needs concentrated?**

**Zhu Yong:** Since the internal test began in March, we have received access requests from more than 150,000 customers. At the same time, hundreds of partners are running R&D tests with us in real scenarios. These span industries such as the Internet, manufacturing, and finance, and many of the scenarios are of high value. To sum up, the high-frequency scenarios fall into several categories: knowledge management, content creation (including marketing copy and media information), intelligent customer service, code generation, and office productivity.

**"Caijing" Xie Lirong: A long-standing problem in the digital transformation market is that many customers do not know what they want. Does the same contradiction exist with large models?**

**Zhu Yong:** There are indeed differences between industries and between customers. After large models came out, the Internet industry followed the latest developments closely. Its technical understanding and product awareness are very advanced, so we can quickly run R&D tests together and produce demos and product innovations.

The digital foundations of some traditional industries are relatively weak, so Baidu assigns a large number of engineers to co-create with customers, combining AI capabilities with their industry pain points and producing many novel product concepts. Combining AI technology with an industry requires understanding technology and AI on the one hand and the industry on the other. That is why, when we work with customers and partners, both sides often need to create together.

**"Caijing" Xie Lirong: How does Baidu provide large-model services to different industries and different types of customers? How should cost-effectiveness be evaluated from the customer's point of view?**

**Zhu Yong:** On price, an enterprise that is just experimenting and is price-sensitive can use public cloud services. Billing is pay-as-you-go according to call volume, with no one-time infrastructure investment, which is one of the advantages of the public cloud. Other companies are willing to make large infrastructure investments and build their own intelligent applications. Baidu can provide a complete set of AI models and an AI base, on which companies can develop applications.

**"Caijing" Xie Lirong: How do companies choose a large model that suits them?**

**Hou Zhenyu:** First, model performance, which is the basis for choosing a large model. Enterprises need to evaluate the value a large model can deliver in their usage scenarios. Second, iteration speed. This depends not only on whether the foundation model itself has vitality, but also on whether the platform has a complete tool chain, supports convenient secondary development and model retraining, and helps the large model iterate better. Third, the actual deployment cost and delivery form of the large model. Enterprises can choose public cloud or private cloud delivery according to their needs.

**"Caijing" Xie Lirong: Wenxin Qianfan is positioned as a one-stop, enterprise-level large-model platform. How should "one-stop" and "enterprise-level" be understood?**

**Hou Zhenyu:** First, "one-stop". AI is a data-driven technology. From the outset, AI work requires collecting, cleaning, and labeling data, then training on top of existing models. After training, fine-tuning data and model versions must be managed, and finally the model is put into business use. It is an end-to-end process. Baidu provides all of these capabilities in an easy-to-use form, meeting customers' needs across the whole life cycle from AI R&D to application.

As for "enterprise-level": enterprise applications are not personal applications, and are not as simple as uploading photos. They are more refined and complex, and must account for factors such as scale, scalability, implementation cost, stability, and robustness.

**"Caijing" Xie Lirong: According to Baidu, the Wenxin Qianfan large-model platform has six characteristics: easy to use, safe, comprehensive, efficient, open, and integrated. Why does ease of use come first? Is it true that only technologies that are easy to use will become widespread?**

**Hou Zhenyu:** Ease of use is very important. A natural-language large model gives customers an easier-to-use interface, making it convenient for everyone to interact with machines. "Integration of cloud and intelligence, AI inclusiveness" is Baidu Smart Cloud's strategy, and "AI inclusiveness" has always been one of our ideals. AI cannot remain a technology perched in an ivory tower. The barriers to using AI must be lowered, including the barriers to using data, using resources, and people using AI, which is why ease of use matters so much.

**"Caijing" Xie Lirong: Over the past three months, the public has become well acquainted with large AI models. For industries of all kinds, has the commercial opportunity for large models arrived? What should a good pace of commercialization look like?**

**Zhu Yong:** Large AI models bring very clear changes to the R&D and application paradigms. The sooner you embrace and understand large models, the greater their impact on your business. This is not a yes-or-no question. As for pace, different companies embrace large models in different ways. Some can start with a single-point trial application and call services on the public cloud, so they can validate ideas and build demos quickly at low cost.

On the other hand, whether an enterprise is large or small, it needs to cultivate AI-native thinking. For example, some applications can be transformed and upgraded gradually. Another approach is refactoring. As Baidu puts it internally, all products will in future be rebuilt on top of large models.

**Is there a bubble in the large-model market?**

**"Caijing" Xie Lirong: Does the enterprise-facing (To B) market really need so many large models?**

**Hou Zhenyu:** My personal view is that we do not need so many foundation models. Of course, that is judging from the end state. Early in the development of any industry, the market becomes both prosperous and frothy. From the perspective of industrial development, we should tolerate some bubbles now, and we should face up to them. But I still believe that once the waves have washed away the sand, only a few companies will ultimately provide foundation-model services.

**Zhu Yong:** In foundation models, although there are many players now, it is genuinely difficult to maintain rapid iteration, keep developing a more comprehensive and complete tool chain, and keep improving product capabilities based on customer feedback. So although large models may be very hot right now, this is a long-distance race. In the end it will resemble today's cloud computing landscape, and the market will gradually converge.

**"Caijing" Xie Lirong: Many server hardware makers also want to build industry large models. Baidu used to be their customer, and now you compete with each other. How should you coexist?**

**Hou Zhenyu:** I don't think we can speak of direct competition. Ours is first of all a cooperative relationship. The two sides do offer similar services and serve similar industries, but we and traditional hardware manufacturers are more complementary than not. Baidu is an AI company with Internet genes. It has accumulated a large amount of general-purpose data and a general-purpose large model, and its strengths lie in AI, software, and technology. Traditional hardware manufacturers have accumulated industry data and developed know-how in vertical fields such as traditional government and enterprise sectors. The two sides bring different strengths to building large models. Companies such as Baidu and H3C are not only partners in the purchase of servers and switches, but also build large models together.

**"Caijing" Xie Lirong: What aspects of competitors' large-model progress does Baidu usually pay attention to?**

**Zhu Yong:** First, technology and overall performance. Second, supporting tools. Third, the business model. Three or four years ago, the AI market still seemed relatively distant, but today deep learning technology, product commercialization, investment, and the open-source ecosystem are all accelerating.

**"Caijing" Xie Lirong: Over the next few years, will large models be the key direction for Baidu's core business? Why?**

**Hou Zhenyu:** Large models will be Baidu's core focus. Baidu is an AI company, and large models are an important direction for AI. On both the To C side and the To B side, they will bring huge changes to Baidu's products and services. For Baidu, large models are very exciting, both an opportunity and a challenge, and Baidu will continue to invest in them. I believe large models will speed cloud computing into the AI era and reshape the cloud computing landscape. MaaS (Model as a Service) will become more and more important, and it will also accelerate the realization of the "integration of cloud and intelligence" strategy and the ideal of "AI inclusiveness" proposed by Baidu Smart Cloud.

**"Caijing" Xie Lirong: The last round of AI commercialization, which began in 2016, ran into problems: AI companies had to take on many tedious, detailed customization projects. How can large models avoid the problems encountered in that round?**

**Hou Zhenyu:** This round of large-model deployment in industry is different from the deep-learning-driven AI industry of ten years ago. It is a new paradigm for AI research and development, and it differs from earlier efforts. Before large models emerged, the most criticized aspect of AI, and the hardest part of deploying it, was that real industrial environments are fragmented. For example, face recognition at access gates differs from face recognition for payment. Because lighting and environment differ, each application had to be handled separately, with training done from scratch on the customer's accumulated data and then adapted to the scenario. This kind of customized delivery is very cumbersome.

But with a foundation model, very good results can be obtained without much fine-tuning data or many rounds of training. Foundation models handle many scenarios far more easily than before, and their generalization ability is much stronger. This is what distinguishes this round from the last round of AI deployment. Last year, a model with 1 billion parameters was called a large model, but now models often have hundreds of billions of parameters. Above 100 billion parameters, intelligence emerges, bringing stronger generalization and general-purpose capabilities across scenarios.

**"Caijing" Xie Lirong: When many people pour into an industry, bubbles may be inevitable. What suggestions do you have for the healthy development of large models?**

**Hou Zhenyu:** My advice to large-model practitioners is to act within your means. You don't have to do everything yourself. Instead, think about the commercialization of AI and find the scenarios and links in the chain that best fit your capabilities. We hope that some bubbles are tolerated while the industry develops rapidly in its early stage. At the same time, we hope consensus can be reached on policy for supervising technology applications and on industry standards for evaluating technology quality. With standards and rules to follow, the industry can develop in a healthy way.

**Zhu Yong:** We also need to change our way of thinking. The large model is a watershed technology, a disruptive technology. Keep an open mind and keep learning.
