Text: Industrialist | Author: Doudou | Editor: Peet
As things stand, DeepSeek has loosened some long-standing constraints on computing power and on parts of the model pipeline, but many problems remain open: the targeted distillation of models, the construction of data systems, and the balancing of overlapping interests among the parties in the ecosystem. This has long been more than a technical question; it is an industrial one.
What is certain, however, is that in 2025 the industrial wave of Chinese AI models will surge and will not be stopped.
The emergence of DeepSeek seems to be gradually sketching a clear blueprint for putting AI into practice.
Over the past few years, the entry threshold for large AI models has been defined by trillion-scale parameter counts, massive computing power, and large volumes of high-quality data, all of which meant a high price of admission.
During the 2025 Spring Festival, DeepSeek emerged as a dark horse and upended the established rules of the AI model race in China and abroad.
The team, which originated at a quantitative firm, cut the large model’s parameter count to 1/10 of the original. With the help of reinforcement learning and model distillation, a small model was able to outperform GPT-4o on mathematical problems. DeepSeek also open-sourced its code and opened its API, offering capabilities comparable to OpenAI’s at a very low price, which left users in China and abroad marveling at this “mysterious Eastern power.”
To some degree, these visible results have certainly shaken the AI industry. But the questions worth considering on a longer timeline are the same core propositions that hung over the industry in 2024: how far away are industrial models? And across the three layers of data, computing power, and models, and in AI applications, on which a near consensus formed last year, what impact will the DeepSeek phenomenon have?
In 2025, the curtain on industrial digital intelligence has quietly risen.
1. The technological paradigm has changed: models enter an era of low prices and good quality
Traditional AI models face many obstacles to widespread deployment. Chief among them is burning money with no return in sight.
Take GPT-4 as an example. Its training data reportedly amounts to 13 trillion tokens covering all areas of the Internet, and annotating data at that scale is costly, slow, and laborious. Its demand for computing power is equally large: training reportedly relied on clusters of tens of thousands of A100 GPUs, with a single training run costing more than US$100 million. Costs and resource demands of this kind make the technology hard to deploy.
This is why DeepSeek is so highly regarded: it can evolve on its own through pure reinforcement learning (RL), which gives it a significant advantage in data preparation.
In other words, it does not require annotated data. This greatly reduces the cost and difficulty of data preparation, saves developers a great deal of time and energy, and lets them focus on model training and optimization.
At the same time, DeepSeek’s reward design is minimalist, using only answer correctness and format compliance as reward signals. This simple mechanism avoids the reward cheating that complex reward models can invite, making training more efficient and stable.
The minimalist design also steers the model in the right direction more reliably, improving training results and avoiding unexpected behavior that could cause training to drift.
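A reward built only from answer correctness and format compliance can be sketched in a few lines. This is a minimal illustration, not DeepSeek's implementation: the tag template, the exact-match comparison, the equal weighting of the two signals, and all function names are assumptions made for the example.

```python
import re

# Assumed output template (hypothetical): reasoning inside <think>...</think>,
# followed by the final answer inside <answer>...</answer>.
FORMAT_RE = re.compile(
    r"^<think>.*?</think>\s*<answer>(.*?)</answer>$", re.DOTALL
)


def format_reward(completion: str) -> float:
    """1.0 if the completion follows the assumed template, else 0.0."""
    return 1.0 if FORMAT_RE.match(completion.strip()) else 0.0


def accuracy_reward(completion: str, reference: str) -> float:
    """1.0 if the extracted answer exactly matches the reference, else 0.0."""
    match = FORMAT_RE.match(completion.strip())
    if match is None:
        return 0.0  # no parseable answer, so no correctness credit
    return 1.0 if match.group(1).strip() == reference.strip() else 0.0


def total_reward(completion: str, reference: str) -> float:
    """Sum of the two rule-based signals (equal weighting is an assumption)."""
    return accuracy_reward(completion, reference) + format_reward(completion)
```

Because both signals are fixed rules rather than a learned reward model, there are no reward-model weights for the policy to exploit, which is the property the paragraph above describes.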
In addition, DeepSeek uses the GRPO algorithm, which replaces the traditional critic model with group scores. This reduces computing power consumption by more than 30% and further lowers the need for hardware resources, or, as it is commonly put, the dependence on GPU cards.
Notably, the model’s capabilities have not been greatly compromised by the reduction in computing power.
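The idea of replacing a critic with group scores can be sketched as follows. For each prompt, several answers are sampled and scored, and each answer's advantage is its reward relative to the group mean, scaled by the group's standard deviation. This is a minimal sketch of that one step, not DeepSeek's training code; the function name, the use of the population standard deviation, and the `eps` term are assumptions made for the example.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the other samples for the same prompt.

    The group mean acts as the baseline, so no separate critic network
    has to be trained or evaluated to estimate a value function.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption for this sketch
    # eps avoids division by zero when every sample gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt, scored by a rule-based reward
advantages = group_relative_advantages([2.0, 0.0, 1.0, 1.0])
```

Answers that score above the group average receive a positive advantage and are reinforced; those below receive a negative one. When all samples in a group score the same, every advantage is zero and the group contributes no gradient signal.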
According to data in the paper DeepSeek published, DeepSeek-R1 achieved a Pass@1 score of 79.8% on AIME 2024, slightly higher than OpenAI-o1-1217. On MATH-500 it scored 97.3%, on par with OpenAI-o1-1217 and significantly better than other models.
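For readers unfamiliar with the metric, Pass@1 can be read as the share of sampled answers that are correct, averaged over the problems in a benchmark. The sketch below assumes that reading; the function names and the toy data are illustrative and do not reproduce any published evaluation harness.

```python
from typing import List


def pass_at_1(correct_flags: List[bool]) -> float:
    """Average correctness over the k answers sampled for one problem."""
    return sum(1.0 for flag in correct_flags if flag) / len(correct_flags)


def benchmark_pass_at_1(per_problem_flags: List[List[bool]]) -> float:
    """Mean of the per-problem Pass@1 values across a benchmark."""
    scores = [pass_at_1(flags) for flags in per_problem_flags]
    return sum(scores) / len(scores)


# Toy data: two problems, four sampled answers each
toy_results = [
    [True, True, False, False],  # half of the samples are correct
    [True, True, True, True],    # every sample is correct
]
score = benchmark_pass_at_1(toy_results)  # (0.5 + 1.0) / 2 = 0.75
```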
With DeepSeek, the outside world seems to be discovering that computing power and parameter count are no longer the entry threshold for AI. More precisely, it has seen a low-threshold, low-cost approach that is better suited to putting AI into practice, one that starts from the cost side.
From an industry perspective, the main beneficiaries of this change are medium and large enterprises. Over the past two years, large state-owned enterprises, universities, and public-service departments have all put large-model projects out to public tender. A large share of these were pre-training projects, whose individual contract values often exceeded tens of millions or even hundreds of millions, as targeted investment by enterprises.
After DeepSeek, however, the scope of this year’s large-model tenders can be expected to change significantly. Medium and large enterprises, and even central state-owned enterprises, can deploy large-model projects at lower cost, or shift more of their attention to data governance to further improve the final model’s performance.
Small technology companies benefit as well. In the past, financial and technical constraints may have kept them out of the AI field. DeepSeek now gives them an opening: at relatively low cost, they can build AI applications on top of DeepSeek that fit their own business needs and drive business growth and innovation.
Overall, the shift in the reinforcement learning (RL) paradigm not only lowers the threshold and deployment cost of large AI models, but also gives more companies and developers a chance to take part in AI innovation. This helps advance AI technology and adds new momentum to the digital-intelligence transformation and upgrading of various industries.
2. Open source acceleration: the era of small vertical models has arrived
Beyond the change in the RL paradigm, the paper published by DeepSeek has another highlight: the construction of a cross-dimensional knowledge distillation system.
According to one set of figures, DeepSeek-R1-Distill-Qwen-7B scored 55.5% on the AIME 2024 evaluation, surpassing the original QwQ-32B-Preview: a 23% improvement in performance with 81% fewer parameters. The 32B distilled version reached an accuracy of 94.3% on MATH-500, nearly 40 percentage points higher than traditional training methods.
By breaking down the reasoning logic of the 32B large model into a transferable cognitive pattern and injecting it into the 7B small model through a dynamic weight allocation mechanism, the approach transfers a “way of thinking” rather than merely “memorized knowledge.”
Under this technical path, the small model not only inherits the large model’s problem-solving ability, but also learns the meta-abilities of breaking problems down and reasoning through them logically. It also means that the reasoning patterns of larger models can be distilled into smaller models, with better results than training the small model directly with reinforcement learning.
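One common way to pass reasoning patterns from a large model to a small one is to have the large model write out its reasoning and then fine-tune the small model on those traces. The sketch below shows only the data-building step under that assumption; it does not implement the dynamic weight allocation mechanism mentioned above, and `teacher_generate`, `toy_teacher`, and the tag template are hypothetical stand-ins.

```python
from typing import Callable, Dict, List


def build_distillation_set(
    prompts: List[str],
    references: List[str],
    teacher_generate: Callable[[str], Dict[str, str]],
) -> List[Dict[str, str]]:
    """Collect teacher reasoning traces that end in the correct answer."""
    dataset = []
    for prompt, reference in zip(prompts, references):
        out = teacher_generate(prompt)  # {"reasoning": ..., "answer": ...}
        if out["answer"].strip() != reference.strip():
            continue  # drop traces whose final answer is wrong
        # The student is trained to reproduce the reasoning, not just the answer
        target = f"<think>{out['reasoning']}</think><answer>{out['answer']}</answer>"
        dataset.append({"prompt": prompt, "target": target})
    return dataset


def toy_teacher(prompt: str) -> Dict[str, str]:
    """Hypothetical stand-in for a call to a large teacher model."""
    canned = {
        "2+3": {"reasoning": "2 plus 3 is 5", "answer": "5"},
        "4*4": {"reasoning": "4 times 4 is 12", "answer": "12"},  # wrong on purpose
    }
    return canned[prompt]


sft_data = build_distillation_set(["2+3", "4*4"], ["5", "16"], toy_teacher)
```

The resulting prompt and target pairs would then be used for ordinary supervised fine-tuning of the small model, which is why the student picks up the step-by-step pattern as well as the final answers.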
In the field of artificial intelligence, the perception that “the bigger the model, the stronger the performance” has long dominated. The evolutionary trajectory from GPT-3 to GPT-4 seems to confirm the law that “parameter size determines model capabilities.”
With the emergence of this “distillation + reinforcement learning” compound training method, the era of small models seems to have finally arrived.
For many enterprises, especially small and medium-sized ones and specialists in vertical sectors, the pursuit of model performance is often constrained by the heavy computing cost of large models.
Now that DeepSeek has shown that small models can also play a big role, these companies can cut the cost of buying or leasing hardware (such as high-performance servers and GPUs) and reduce their energy costs.
For example, a company focused on medical image analysis might previously have had to build an expensive computing cluster in order to process image data with a large model. Now, with an optimized small model, it can complete the same tasks on ordinary computing equipment, greatly reducing costs.
As attention shifts to how well models work in practice, companies that know their industry have a deep understanding of their own business processes and data characteristics, and can often integrate models into existing business systems more quickly.
In a highly competitive market, this advantage can allow some companies to overtake rivals quickly in AI and become the rule-makers and leaders of their vertical sectors.
3. Efficiency and scenario breakthroughs: the boom in on-device applications is here
As we all know, in practical applications, especially in scenarios such as edge computing and real-time decision-making, traditional AI models face many limitations.
In edge computing scenarios, devices such as mobile phones and smart glasses have limited resources and struggle to run large AI models, which limits the application of AI in these areas.
In real-time decision-making scenarios, such as financial trading and industrial production, the inference speed and accuracy of traditional AI models often fail to meet requirements.
The emergence of DeepSeek offers a new approach. Its breakthroughs in model compression, inference efficiency, and training cost optimization strongly support deployment across multiple scenarios, bringing major gains in both efficiency and reach.
DeepSeek uses model compression technology so that its optimized models are better suited to resource-limited hardware, including edge computing devices such as smart glasses. This gives edge devices stronger AI capabilities and gives users a more convenient and intelligent experience.
In smart glasses, for example, DeepSeek could enable faster and more accurate image recognition and voice interaction. Users could obtain information, navigate, and identify objects more efficiently, greatly improving the practicality of smart glasses and widening their application scenarios.
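To give a sense of what model compression means in practice, the sketch below shows symmetric 8-bit weight quantization, one widely used technique. The article does not say which compression methods DeepSeek uses, so this is a generic illustration under that assumption, and the function names are made up for the example.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map float weights to integers in [-127, 127] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integer codes."""
    return [q * scale for q in quantized]


weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
# Each restored weight differs from the original by at most about scale / 2
```

Storing one byte per weight instead of four cuts memory use to roughly a quarter, at the price of a small rounding error, which is the kind of trade-off that makes models fit on phones and glasses.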
In real-time decision-making scenarios, DeepSeek’s efficient inference also plays an important role.
Take financial trading as an example. Financial institutions need to analyze and process large amounts of market data in a very short time to make accurate investment decisions. DeepSeek can quickly analyze data and make predictions, providing real-time decision support and helping financial institutions improve trading efficiency and profitability.
In industrial production, real-time quality inspection and fault diagnosis are equally crucial. DeepSeek can quickly analyze data from the production process and detect quality problems and equipment failures in time, improving production efficiency and product quality and reducing production costs.
It can be said that in 2025 the emergence of DeepSeek may trigger a new wave of on-device applications. Its application breakthroughs across multiple scenarios not only demonstrate its technical strengths, but also offer new solutions for digital transformation and upgrading in various industries.
4. Ecological transformation: large companies refine models, small and medium-sized companies build applications
DeepSeek also brings changes to the AI ecosystem, and these changes open up more possibilities for putting AI into practice across industries.
The fact is that the current AI industry has a pyramid structure. Giants such as OpenAI and Google control the foundation models. Mid-tier enterprises rely on API calls and end up with little data of their own. Small and medium-sized developers at the bottom lack customization capabilities and become dependents of the ecosystem.
By open-sourcing its core model and opening up API customization, DeepSeek breaks the pyramid ecosystem dominated by giants such as OpenAI.
Under the new ecosystem model, large companies can focus on refining models, using their technical strength and resources to keep improving model performance and capabilities.
For example, platforms such as Alibaba Cloud and Tencent Cloud can become model supermarkets, offering hundreds of small models for vertical fields to meet the needs of different industries and users. Through continued research and innovation, these major vendors can launch more advanced model architectures and algorithms and push AI technology forward.
Small and medium-sized companies, on the other hand, can focus on applications and quickly build dedicated AI tools on open source models without relying on the giants for black-box capabilities. This gives them more room and opportunity to grow, letting them make full use of their flexibility and inventiveness to build AI applications that are closer to user needs and industry characteristics.
For example, small and medium-sized companies can address specific needs such as industrial quality inspection and supply chain forecasting, using APIs to fine-tune models on demand and deliver efficient, accurate AI applications as customized solutions. This change in the ecosystem also brings benefits such as technological democratization, a positive ecosystem cycle, and scenario customization.
Technological democratization allows non-technology sectors such as manufacturing and agriculture to take part in applying and innovating with AI, promoting digital transformation and upgrading across industries. In the positive ecosystem cycle, developers contribute industry data to improve models and share in the resulting model revenue, forming a data-model-application collaboration network that supports the sustainable development of the AI industry.
It can be said that the ecosystem changes brought by DeepSeek not only create new opportunities for the AI industry, but also add new momentum to digital transformation and upgrading in various industries. As DeepSeek’s technology continues to develop and mature, its potential to reshape the ecosystem will be further released.
5. 2025: the new trend in AI
In 2025, the direction for putting AI into practice in industry will become clearer.
The development of AI will gradually shift from the simple worship of technology toward commercial and pragmatic applications. This shift shows up in technology research and development, in commercialization paths, and in the building of ecosystem alliances.
In technology research and development, companies have gradually realized that blindly piling up model parameters is not wise. Hundred-billion-parameter models are not a master key, and the success of DeepSeek-R1 strongly suggests that ten-billion-parameter models can rival larger ones through algorithmic optimization.
Future R&D investment will therefore focus more on reinforcement learning (RL) and model distillation.
Compared with simply expanding the amount of data, RL’s capacity for self-evolution and the ecosystem value of distillation show greater potential in commercial applications. With these techniques, enterprises can reduce costs while improving model performance and broaden their application scenarios, taking a cost-effective path toward integrating AI with their business.
On the commercialization path, the B-end (enterprise) market has become the priority.
Companies can work with leading players in each industry, such as automakers, hospitals, and banks, to jointly build industry-specific models under a pay-for-performance arrangement. This binds vendor and customer closely together and lets both sides cooperate in creating value.
At the same time, companies should not ignore the potential demand from small and medium-sized customers. Open source models and low-code platforms give these customers a convenient vehicle for AI capabilities, effectively reducing customization costs and meeting the diverse needs of the long-tail market, so that the whole market is covered.
Building ecosystem alliances is also crucial to the development of enterprises.
On the one hand, open-sourcing core frameworks, such as DeepSeek’s open RL training tool chain, can attract developers to take an active part in building the ecosystem, pooling expertise and resources from all sides into a strong technical synergy.
On the other hand, cross-industry alliances are indispensable. Joining forces with chip manufacturers (such as Huawei), cloud service providers (such as Alibaba Cloud), and specialist enterprises in vertical fields forms an “iron triangle” of computing, model, and scenario, which can promote collaborative innovation up and down the industrial chain and create a win-win industrial ecosystem.
Judging from the current state of the industry, although Chinese AI models will find it difficult for now to fully surpass OpenAI in general capabilities, they have every opportunity to achieve differentiated breakthroughs through deep work in vertical scenarios and open cooperation across the ecosystem.
Looking ahead to 2025, the goal for China’s AI industry is to create a number of small but excellent industry models. These models can build local advantages over large, all-purpose Western models in specific fields, and gradually extend into general intelligence through in-depth application and optimization in specific industries.
This development path can make full use of China’s industrial strengths in specific fields, while also offering the global AI industry an innovative model and solution with Chinese characteristics and promoting the diversified development and application of AI worldwide.
Closing thoughts:
DeepSeek’s technological innovation and ecosystem openness are turning AI from a game for giants into something everyone can help create. As digitalization and AI catalyze each other, a flywheel has formed: the more widespread the technology, the richer the data, and the smarter the models.
However, we should be more cautious about putting AI into practice in industry. Although DeepSeek has loosened some long-standing constraints on computing power and on parts of the model pipeline, many problems remain open, such as the targeted distillation of models, the construction of data systems, and the balancing of overlapping interests among the parties in the ecosystem. This has long been more than a technical question; it is an industrial one.
What is certain, however, is that in 2025 the industrial wave of Chinese AI models will surge and will not be stopped.