
DeepSeek’s east wind blows through China’s AI industry chain

At present, more than a dozen domestic chip makers, including Muxi, Tianshu Zhixin, Moore Threads, and Biren Technology, have announced that they have completed adapting and launching the DeepSeek models.


Photo source: Visual China

Blue Whale News, February 10 (Reporter Zhu Junxi) — The stone DeepSeek dropped has stirred up a thousand waves: the company has not only been pushed into the global spotlight, but has also ushered in many unexpected opportunities.

Less than a month after launch, the DeepSeek app has become the fastest-growing AI application in the world, its daily active users tracing a steep growth curve. According to statistics from the AI Product Rankings, as of January 31 the DeepSeek app's global daily active users had exceeded 20 million, surpassing ByteDance's Doubao and reaching 41.6% of ChatGPT's.

But when users try to hold frequent, in-depth conversations with DeepSeek, they often hit a wall: "The server is busy. Please try again later." Some users joked that DeepSeek named the model R1 because it can only run once (R1) a day.

On February 6, DeepSeek stated that due to server resource constraints, API service recharges had been suspended. As of press time, the recharge service has not been restored. Some AI practitioners told Blue Whale News that their team had originally built an AI search feature on the DeepSeek model, but after DeepSeek exploded in popularity, the API service became congested and responses timed out, leaving the feature unable to generate search results. They had to work overtime during the Spring Festival to migrate the service to a fallback model, GPT-4o.
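The workaround those practitioners describe is a common pattern: try the primary model's API first, and switch to a backup model when the call times out. Below is a minimal Python sketch of that pattern, with a simulated client standing in for any real SDK; `call_with_fallback` and `fake_call` are hypothetical names for illustration, not the team's actual code.

```python
def call_with_fallback(prompt, backends, call_model):
    """Try each backend in order; return (backend, answer) from the first success."""
    last_error = None
    for name in backends:
        try:
            return name, call_model(name, prompt)
        except TimeoutError as exc:  # busy server / timed-out response
            last_error = exc
    raise RuntimeError("all backends failed") from last_error

# Simulated client: the primary model is overloaded, the backup responds.
def fake_call(name, prompt):
    if name == "deepseek-r1":
        raise TimeoutError("server is busy, please try again later")
    return f"{name}: {prompt}"

used, answer = call_with_fallback("hello", ["deepseek-r1", "gpt-4o"], fake_call)
print(used)  # → gpt-4o
```

In a real deployment the `call_model` argument would wrap the actual HTTP client with a timeout, but the routing logic stays the same.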

Midstream and upstream: a rush for the beachhead

The moment DeepSeek broke AI into the mainstream, business opportunities multiplied up and down the industry chain. Cloud vendors and chip makers with large reserves of computing power moved quickly.

The first to act were cloud vendors at home and abroad. Cloud giants such as Microsoft and Amazon connected the DeepSeek-R1 model to their cloud platforms at the start of the Spring Festival. Since February 1, major domestic cloud vendors such as Huawei Cloud, Alibaba Cloud, Baidu AI Cloud, ByteDance's Volcano Engine, and Tencent Cloud have also announced the launch of DeepSeek models, offering model deployment services to developers and enterprise customers.

Major domestic chip makers followed close behind. At present, more than a dozen of them, including Muxi, Tianshu Zhixin, Moore Threads, and Biren Technology, have announced completed adaptation and launch of the DeepSeek models. These chip makers either rely on their own computing power platforms or join forces with downstream AI Infra platforms to support deployment of the DeepSeek models.

A practitioner explained to Blue Whale News that the cloud vendors' agile response comes down to their lower cost of hosting DeepSeek. The DeepSeek models were trained on NVIDIA GPUs, and cloud vendors usually hold large stocks of such chips, so they can deploy directly and quickly. Domestic chip makers use different instruction sets in their hardware, so they must do extra adaptation and porting work, with correspondingly greater workload and cost.

Both cloud vendors and chip makers hope to ride this wave of DeepSeek's popularity: while DeepSeek's official API service is unstable, they can attract users to jump to their own platforms and serve the DeepSeek models from existing computing resources. After initial trials, some users said the price and inference speed of certain platforms meet their needs, and that they will consider building DeepSeek-R1-based AI applications through third-party platforms.

Many promotional messages for third-party platforms have also appeared on social media, claiming to bypass the congestion on DeepSeek's official site and provide a smooth, stable experience. Some of these platforms also hang out the banner of "domestic chips + domestic large models." For example, SiliconFlow teamed up with Huawei Cloud to launch DeepSeek models running on Huawei Cloud's Ascend cloud service on its large-model cloud platform. Huawei also integrated DeepSeek-R1 into the HarmonyOS NEXT version of its Xiaoyi Assistant app.

Yuan Jinhui, founder and CEO of SiliconFlow, revealed on social media that before the release of the DeepSeek-V3 model, DeepSeek founder Liang Wenfeng had suggested deploying it on SiliconFlow's platform, which would have required at least 20 NVIDIA H800 servers; considering the cost, they had no choice but to pass.

After DeepSeek took off, the SiliconFlow team decided to adapt the model to domestic chips, so it struck a partnership with Huawei. Over the Spring Festival holiday the teams worked overtime, discussing problems at all hours and holding meetings late into the night. On February 1, the DeepSeek model service running on domestic chips was officially launched.

A good opportunity for domestic computing power

To understand how the DeepSeek models are matched with domestic chips, one must first distinguish the training and inference stages of a large model. In the training stage, the model is still learning: it ingests huge amounts of data and continuously adjusts its internal parameters to discover patterns. Inference is the practical application of the model after training is complete.
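The two stages can be sketched with a toy model that fits y = 2x using a single weight; this is an illustrative simplification only, nothing like the scale or methods of a real large model.

```python
# Toy illustration of the two stages: training adjusts a parameter by
# gradient descent; inference applies the trained parameter unchanged.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # samples of y = 2x

# Training stage: start from a guess and repeatedly correct the weight.
w = 0.0
lr = 0.05
for _ in range(200):
    for x, y in data:
        pred = w * x
        grad = 2 * (pred - y) * x  # derivative of squared error w.r.t. w
        w -= lr * grad

# Inference stage: the weight is frozen; we only compute outputs.
def infer(x):
    return w * x

print(round(w, 3))           # converges to 2.0
print(round(infer(5.0), 2))  # applies the learned weight: about 10.0
```

Training is the expensive loop of repeated corrections; inference is the cheap forward pass, which is why, as the article notes, it places far lower demands on hardware.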

A former AI engineer at a major tech company further explained to Blue Whale News that models demand more computing power and bandwidth in the training stage. At the same time, large-model makers need to experiment with different model structures and operators, so most prefer NVIDIA GPUs and NVIDIA's CUDA development toolkit. The inference stage places lower demands on software and hardware, so it has become the main scenario for many domestic chips, which are made compatible with, and optimized for, already-trained models.

A domestic chip maker told Blue Whale News that although DeepSeek makes micro-innovations in its architecture, it is still a large language model. The adaptation work is entirely at the inference stage, so it is not difficult and can be completed quickly.

After DeepSeek sparked heated discussion over its low cost, it briefly sent NVIDIA's share price plummeting, wiping out a record single-day loss of market value for a U.S. stock. A widely circulated claim is that DeepSeek bypassed NVIDIA's CUDA framework during model development, thereby reducing its reliance on NVIDIA. The source is a passage in DeepSeek's V3 technical report: "we employ customized PTX (Parallel Thread Execution) instructions and auto-tune the communication chunk size, which significantly reduces the use of the L2 cache and the interference to other SMs."

Does using the PTX programming language mean DeepSeek has broken NVIDIA's CUDA monopoly? Some practitioners say this claim is simply wrong: PTX is part of CUDA, not a way around it.

The practitioner explained that CUDA is a software suite that includes a high-level development language, a rich library of API tools, compilation tools, and more, provided so that developers can program GPUs. PTX is CUDA's intermediate assembly language, closer to the hardware level, and usually not exposed directly to developers. CUDA-based development is higher-level, which makes fine-grained control of the GPU difficult; using PTX, a lower-level language, allows more flexible control of the underlying hardware and finer performance optimization. "This is one of the innovations that lets DeepSeek get by with less computing power."

Although the DeepSeek models are still trained on NVIDIA GPUs, both their efficient use of computing resources and the resulting wave of domestic chip adaptation are major boons to the domestic chip industry.

Some practitioners said that domestic large-model companies had previously used domestic chips for some model inference or trial training, but on a limited scale that never reached this level. Driven by DeepSeek, the utilization rate of domestic chips will improve greatly.

Is the year of AI application implementation really here?

The waves set off upstream and midstream will eventually propagate downstream. As the DeepSeek craze spreads, the AI application layer has also begun to move at scale. In recent days, industries from smart hardware to automobiles to finance have been actively connecting to the DeepSeek models, hoping to use their capabilities to upgrade services.

Last week, China Literature (Yuewen Group) announced that its writer-assist product, Writer's Assistant, has integrated the DeepSeek-R1 model, calling it the first application of DeepSeek in the field of online literature. The company told Blue Whale News that in the smart Q&A feature, which helps writers check facts and find inspiration, DeepSeek shows a strong ability to understand and infer a writer's intent, grasping subtext and implication.

At the same time, the ultra-long chain of thought displayed by the R1 model is highly inspiring for online writers. "Online writers, especially mature ones, often complain that AI content is formulaic and repetitive. What they need is inspiration and ideas," China Literature said. With DeepSeek connected, when a writer asks the AI to produce an outline for an online novel built around a site's trending elements, the AI not only provides the generated answer but also, in its visible thinking process, clearly lists the specific elements and cites corresponding trending titles, helping the writer obtain the professional content he needs.

Under competitive pressure from DeepSeek, OpenAI announced last week that it would also expose the chain of thought of its latest o3-mini series models. But its researchers said that although these thinking summaries come very close, they are not the raw chains of thought. Some developers previously told Blue Whale News that OpenAI's move may reflect considerations of user experience, privacy protection, output quality, technical cost, and trade secrets: it can offer a useful thinking process without negative side effects.

In May last year, DeepSeek's low pricing detonated a "price war" among domestic large models. The industry generally believes that price cuts for large models help drive application adoption. As for DeepSeek's two recently released models: although the promotional pricing period for V3 ended on February 9, its API price is still about one-tenth of GPT-4o's, and the inference model DeepSeek-R1 is priced at roughly 1/27 to 1/55 of its benchmark, OpenAI's o1.

Silicon-based Intelligence is a company focused on AI digital humans, silicon-based smart screens, and related services. Its founder, chairman and CEO Sima Huapeng told Blue Whale News that the falling cost of large-model foundations and of AI infrastructure "is a very big boost to the industry's development. There will be a big explosion in AI applications, and more super apps will emerge."

DeepSeek's open-sourcing of its models and disclosure of their chains of thought let Silicon-based Intelligence see the possibility of upgrading its AI digital human capabilities and services. During the Spring Festival, the team moved quickly to connect the DeepSeek model, improving the natural language understanding and emotion recognition of its digital human product line.

On February 10, Silicon-based Intelligence teamed up with computing power company Huakun Zhenyu to release a new solution that integrates its self-developed AI digital human engine and relies on Kunpeng and Ascend clusters as a domestic high-performance computing base, giving the DeepSeek large model excellent response speed and stability under heavy data loads.

For the domestic AI industry chain, this Year of the Snake Spring Festival was bound to be an eventful one. The ripples set off by DeepSeek may need more time yet to gather into a larger wave.

Wu Jingjing also contributed to this article
