
The development history of China's domestic large models: the competition enters the "post-brute-force compute era"

Article source: Slow playback

Image source: Generated by AI

The AI arena is quietly witnessing a profound transfer of technological power.

The shock triggered by DeepSeek has yet to subside. Competition among large models has entered the "post-brute-force compute era", in which the importance of efficiency has leapt to the fore. The balance of AI power is also being restructured, and OpenAI's near-monopoly is under constant challenge.

The newcomers are evolving fiercely while the incumbents fight their way forward; in this constant changing of the banner atop the city wall, no winner has yet been decided. The key to victory lies in winning ecosystem support through open source while monetizing through closed source.

01. Riding policy tailwinds, China's AI projects see explosive growth

Domestic AI had long been quietly gathering momentum, and industry insiders regard 2023 as a watershed in the development of artificial intelligence.

AI scientist Fei-Fei Li once said: "2023 will likely be remembered in history for its profound technological changes and public awakening."

Long before that, explorations and innovations in artificial intelligence were already plentiful.

In 1956, John McCarthy first proposed the concept of “Artificial Intelligence” at the Dartmouth Conference, and AI was officially born as a discipline.

But in 1973, due to bottlenecks in AI research, investment in AI shrank significantly, and development entered a “cold winter”.

It was not until 1986, when Geoffrey Hinton, the "godfather of AI", helped popularize the backpropagation algorithm, that the revival of neural networks brought a new dawn for the field. In 2017, Google proposed the self-attention mechanism of the Transformer, which replaced RNNs/LSTMs and became the core architecture of subsequent large language models (LLMs)…

Looking back at AI's development in China, 2023 is likewise regarded as "year one" of the domestic AI era.

According to Tianyancha, in the first half of 2023 alone there were more than 20 financing events directly related to large models, and the number of large models of various kinds released in China exceeded 100. By July 2024, close to 200 generative AI large models had been completed and brought online.

To this day, only a dozen or so companies still have a chance of advancing to the finals. The consulting firm Frost & Sullivan notes that China's field of contenders in general-purpose foundation models has shrunk to twenty-odd players, dominated mainly by Internet companies, cloud computing giants and AI startups.

Everyone has witnessed this war without gunsmoke. Looking back from the start of 2025, it was perhaps only after weathering the waves of 2024's "war of a hundred models" that DeepSeek could open 2025 with a thunderclap, one that pushed domestic AI to a critical leap and gave it a firm foothold.

Companies with the capacity for continuous innovation are gradually coming to dominate the market. From image generation to video to multilingual ad creation, the range of AI applications is expanding rapidly.

At the same time, large models and agent technology have entered a stage of accelerated development. Whether in consumer-facing (C-side) user-experience optimization or enterprise (B-side) solutions, agents and large models are redefining how technology connects with society.

Three forces remain in the finals: first, the Internet giants and cloud service providers represented by Alibaba and ByteDance, which have all entered the large-model race; second, the "national AI team" represented by iFlytek, which links government, business and consumer (G/B/C) markets and offers both solutions and hardware products; third, AI startups such as Zhipu AI and DeepSeek, a handful of which still insist on innovating at the foundation-model level.

The fortunes of upstream and downstream players in the industry chain have diverged, and so have the paths of the model makers; even the "six AI tigers" have split in direction. Baichuan Intelligence, for example, has pivoted to industry-specific large models in fields such as healthcare; 01.AI has handed training of its largest models over to Alibaba; Moonshot AI and MiniMax focus on consumer-facing applications and products.

Industry insiders generally believe that, compared with upstream and downstream players, midstream model makers face the greatest difficulty turning a profit. In 2025, both the number of finalists and the number of companies still able to innovate at the foundation-model level will shrink further.

02. From "faith in burning cash" to an "efficiency revolution"

If "cost, AI agents and multimodality" are the three keywords of today's AI industry and captured the direction of large-model evolution in 2024, they may also mark the turning point at which large models move toward industrial deployment.

First, cost is unquestionably a matter of life and death for companies. Training and deploying large AI models demands enormous computing resources, forcing companies to shoulder heavy compute and operations costs.

DeepSeek-R1 speaks directly to these pain points of efficiency and cost control, achieving performance comparable to, or even surpassing, the leading models with relatively modest compute investment.

Traditional AI development has often followed a "scale first" logic, pursuing ever-larger models and ever-larger compute clusters. DeepSeek-R1's lightweight models and open-source strategy have lowered the barrier to AI adoption and promoted the spread of mid-range compute facilities and distributed data centers.
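To make the "lowered barrier" concrete, the sketch below loads one of the small open-weight distilled checkpoints published alongside R1 on a single consumer GPU or even a CPU; the checkpoint name and generation settings are illustrative assumptions, not details from this article.

```python
# Minimal sketch (illustrative, not from the article): running a small
# open-weight DeepSeek-R1 distilled checkpoint locally with Hugging Face
# Transformers. Requires `transformers`, `torch` and `accelerate`.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick bf16/fp16 automatically when supported
    device_map="auto",    # place layers on GPU if available, else CPU
)

prompt = "In one sentence, why does cheaper inference widen AI adoption?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```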

Upstream in the industry chain, Nvidia has begun to feel pressure from the shift in demand structure brought about by DeepSeek's emergence.

ASIC makers, meanwhile, have found new opportunities. Because ASICs provide hardware acceleration for specific AI workloads, they hold clear advantages in energy efficiency and cost control and are better suited to the trend toward distributed computing power.

On the server side, regional data centers, with their low latency and proximity to application scenarios, have begun to take on latency-sensitive workloads such as intelligent quality inspection and financial risk control.

Cloud giants such as AWS and Alibaba Cloud have adjusted construction plans for some large data centers and increased investment in edge computing and distributed compute.

The application side stands to benefit from falling compute costs, accelerating AI's penetration into manufacturing, finance, healthcare and other sectors.

On the code-hosting platform GitHub, a flood of projects built on the DeepSeek model has appeared (catalogued in lists such as awesome-deepseek-integration), forming a virtuous cycle in which demand drives supply and "compute + industry" empower each other.
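Most of these integrations reduce to a few lines of client code. The sketch below shows the typical pattern, using an OpenAI-compatible client pointed at DeepSeek's API; the base URL, model name and placeholder key are assumptions drawn from DeepSeek's public documentation rather than from this article.

```python
# Minimal sketch of a typical DeepSeek integration via an OpenAI-compatible
# client. Endpoint, model name and key are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",      # placeholder, not a real key
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                # assumed model identifier
    messages=[
        {"role": "system", "content": "You are an assistant for factory quality inspection."},
        {"role": "user", "content": "Summarize today's defect reports in three bullet points."},
    ],
)
print(response.choices[0].message.content)
```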

Artificial intelligence technology will accelerate its penetration into all walks of life and become an important engine for promoting industrial upgrading and economic development.

It is worth noting, however, that even as DeepSeek-R1's breakthrough lowers the barrier to AI adoption, it may also trigger the "Jevons paradox".

The Jevons paradox was proposed by the 19th-century economist William Stanley Jevons, who observed that as coal came to be used more efficiently, total coal consumption rose rather than fell. The paradox reveals a deeper economic law: gains in efficiency do not necessarily reduce resource consumption; by cutting costs and widening the range of applications, they can stimulate demand and ultimately increase total consumption.
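A back-of-the-envelope formulation (my illustration, not from the article) makes the condition explicit: write efficiency as η (useful output per unit of resource) and the resource price as c, so the effective price per unit of output is p = c/η.

```latex
% Total resource consumption as a function of efficiency \eta,
% with demand Q(p) and price elasticity \varepsilon:
\[
  R(\eta) = \frac{Q(c/\eta)}{\eta},
  \qquad
  \frac{dR}{d\eta} = \frac{Q(p)}{\eta^{2}}\,(\varepsilon - 1),
  \qquad
  \varepsilon = -\frac{dQ}{dp}\cdot\frac{p}{Q}.
\]
% R rises as efficiency rises exactly when demand is price-elastic
% (\varepsilon > 1): cheaper compute per query induces more than
% proportionally more queries, so total compute consumed goes up.
```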

Microsoft CEO Satya Nadella hit the nail on the head when he cited the Jevons paradox to explain DeepSeek-R1's possible impact.

Nadella believes that cheaper, more accessible AI will drive a surge in demand through faster adoption and broader application. As the barrier to entry falls, a wave of new use cases will emerge in areas where cost once ruled AI out, such as small and medium-sized enterprises and edge-computing scenarios, leading to an exponential rise in the density of compute calls.

The explosion of new application scenarios will also fragment compute demand. The need for real-time compute in frontier fields such as autonomous driving and embodied robotics is enormous, growing far faster than DeepSeek-style optimizations can offset. Even if per-task efficiency improves severalfold, the concurrency demands of millions of smart terminals will still form a vast black hole of compute consumption.

03. Collaboration between “open source” and “closed source”

With the popularity of DeepSeek, an open-source large model, keywords such as "open source" and "free" now appear everywhere.

Before DeepSeek, domestic large-model companies were still divided over the "open source" and "closed source" paths; now, calls for open source, open ecosystems and a wider circle of partners appear to have become the mainstream.

Stirred by the catfish effect of DeepSeek, domestic large-model companies have adopted a more "open" posture, hoping to accelerate the building of their own developer and application ecosystems.

The key differences between open-source and closed-source models can be observed along three dimensions: basic resources, technical approach and commercialization.

In terms of basic resources, open-source models draw on public datasets and community-contributed data, with distributed, developer-owned GPU clusters providing the compute. They give developers, researchers and enterprises equal access, promoting technological innovation and sharing.

Closed-source models are developed by individual companies or teams and draw on proprietary data such as user behavior logs and private databases, alongside cleaned public data. Users can access them only through the interfaces or platforms the company provides.

In terms of monetization, open-source models do not generate revenue directly; profit usually comes from surrounding services such as cloud hosting, technical support, training and custom development. By selling such value-added services, companies can build a sustainable revenue stream on top of their open-source models.

The commercialization path for closed-source models is more direct: companies profit through licensing, subscription services, platform fees and the like. Because customers must pay for access and service, closed-source models can deliver high margins.

Open source and closed source are not irreconcilable; the two are likely to interact and coexist. Open source accelerates the spread of and innovation in AI technology, while closed source ensures that the technology can be commercialized quickly and kept stable.

The future winners will be generalists who master both, gaining ecosystem momentum through open source while capturing value through closed source.

As Nadella put it, "There will be no winner-take-all in ultra-large AI, and open-source models will act as a check and balance on closed source."


DeepSeek will play a role in the current AI era much like the one Android played in the mobile Internet revolution.

This restructuring of the industrial ecosystem will set off a chain reaction, accelerating both upper-layer application development and lower-layer system consolidation. It will mobilize forces across software and hardware, upstream and downstream, encouraging all parties to invest more in the joint optimization and vertical integration of "model-chip-system", further eroding CUDA's ecosystem advantage and creating opportunities for the domestic AI industry.

Through technological innovation, DeepSeek has reduced its reliance on high-end imported chips during model training. This demonstrates a feasible technical path for domestic companies and greatly strengthens their confidence in developing compute chips independently.

This game is not merely a technical choice between open source and closed source; it is a contest over discourse power, market dominance and the allocation of compute in AI's development. And that battle for AI power has already begun.
