
Who will benefit from the rise of DeepSeek?

Text | Xinzhai Business Review, Author | Doubao

In 2023, OpenAI stood at center stage of the AI industry with its disruptive ChatGPT. How big was the gap between China's leading companies and OpenAI at the time? Some said 2-3 months, some said 3-5 years, and some even thought 10 years. Whether the gap was a few months or a decade, the position of Chinese companies in the large-model field did not change: they were always playing catch-up.

In 2025, the situation has changed: DeepSeek now stands at center stage of the AI industry with its high-value-for-money models. Around January 11, DeepSeek launched its app globally. According to Sensor Tower data, DeepSeek accumulated 16 million downloads within 18 days of release, versus 9 million for ChatGPT over the equivalent period after its own launch. As of February 5, DeepSeek had close to 40 million global downloads, while ChatGPT had 41 million. In terms of daily active users, DeepSeek reached 22.15 million on January 31, equivalent to 41.6% of ChatGPT's.

Although DeepSeek still trails ChatGPT in total users and daily active users, its growth rate is enough to jolt every major model maker, OpenAI included. At the same time, everyone is asking the same question: how did DeepSeek do it?

In addition, today Shen Dou, executive vice president of Baidu Group and president of Baidu's AI Cloud business group, said at an all-staff meeting that DeepSeek will have an impact on Baidu in the short term, but that in the long run the benefits outweigh the disadvantages. In his view, the one hit hardest by DeepSeek's onslaught is ByteDance's Doubao, because Doubao's training and inference costs are both high. So who will benefit from the rise of DeepSeek? And who will be hurt the most?

1. The truth and rumors about DeepSeek

Unlike OpenAI, DeepSeek is a young company. Its operating entity, Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd., was established on July 17, 2023. If you count High-Flyer Quant, the quantitative fund closely tied to DeepSeek, its history is longer than OpenAI's.

In the technology industry, however, corporate age is not what determines a company's technical capability. DeepSeek has proved this with two models.

At the end of 2024, DeepSeek released V3, a new-generation large language model. Test results at the time showed that V3 exceeded some mainstream open-source models on multiple benchmarks while holding a cost advantage. On January 24 this year, DeepSeek released R1, the main reason it has attracted global attention. According to DeepSeek, R1 achieved an important technical breakthrough: it uses pure reinforcement learning to let reasoning capabilities emerge spontaneously, and on tasks such as mathematics, code, and natural-language reasoning its performance is comparable to the official version of OpenAI's o1 model.

More importantly, R1 continues V3's cost-effectiveness: its training cost was only about US$6 million, while companies such as OpenAI and Google invest hundreds of millions or even billions of dollars.

Strong performance plus lower cost: these two advantages have won DeepSeek attention around the world, and controversy along with it. The first and biggest controversy: is the cost really that low?

Before DeepSeek, the industry's recipe for stronger models was large-scale stacking of computing power and data. Under this logic, large models were regarded as a game for giants, and as the giants poured in enormous sums, the logic was only reinforced. DeepSeek has broken it.

The widely circulated cost figure is US$6 million. Strictly speaking, it covers only the GPU cost of the pre-training run, which is just one part of the total cost. As is well known, Nvidia is the main supplier of GPUs for training large models. To comply with export regulations, Nvidia has launched cut-down versions of the H100 (such as the H800 and H20); currently, Chinese companies can only buy the H20. DeepSeek's main GPUs are therefore believed to be H20s, supplemented by H800s and H100s.

According to estimates by SemiAnalysis, a well-known semiconductor research firm, DeepSeek has roughly 10,000 H800s and 10,000 H100s, plus a larger number of H20s. Its server capital expenditure is approximately US$1.6 billion, of which the cost of operating these clusters runs as high as US$944 million. In other words, DeepSeek's total investment is also in the hundreds of millions; even so, it remains significantly lower than OpenAI's or Google's. As for how many GPUs were actually used, DeepSeek's own figure is that R1 can be trained on 2,048 GPUs, again fewer than OpenAI uses.
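As a point of reference for where the US$6 million figure comes from: DeepSeek's V3 technical report derived its headline cost by multiplying total GPU-hours by an assumed rental rate. The sketch below redoes that back-of-the-envelope arithmetic in Python; the GPU-hour breakdown and the US$2/hour rate are the figures reported for V3, while the wall-clock estimate is an illustration layered on top.

```python
# Back-of-the-envelope training-cost arithmetic in the style of
# DeepSeek's V3 technical report (GPU-hours and rate from that report;
# the wall-clock estimate is an illustrative addition).

H800_RATE_USD_PER_HOUR = 2.0  # rental rate assumed in the report

# GPU-hours reported for DeepSeek-V3, by training stage
gpu_hours = {
    "pre_training": 2_664_000,
    "context_extension": 119_000,
    "post_training": 5_000,
}

total_gpu_hours = sum(gpu_hours.values())  # ~2.788M H800 GPU-hours
training_cost = total_gpu_hours * H800_RATE_USD_PER_HOUR

print(f"Total GPU-hours: {total_gpu_hours:,}")
print(f"Estimated training cost: ${training_cost:,.0f}")  # ~$5.58M, the widely cited ~$6M

# The same hours on a 2,048-GPU cluster imply roughly this much wall-clock time:
cluster_size = 2_048  # GPUs, per DeepSeek's own figure
days = total_gpu_hours / cluster_size / 24
print(f"Wall-clock time on {cluster_size} GPUs: ~{days:.0f} days")
```

Note what the arithmetic covers and what it omits: GPU rental for the final training run only, not the capital expenditure, research staff, or failed experiments that SemiAnalysis folds into its much larger estimate.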

Query pricing also points to DeepSeek's cost advantage. Currently, querying the DeepSeek R1 model costs US$0.14 per million tokens (the token being the basic unit of text that large models process and bill by), versus US$7.50 for OpenAI.
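To make the per-token gap concrete, here is a quick sketch of what those two prices imply at application scale. The prices are the ones quoted above; the monthly traffic volume is an arbitrary assumption for illustration.

```python
# Illustrative comparison of monthly API bills at the per-million-token
# prices quoted above. Traffic volume is an assumed example, not real data.

PRICE_PER_MILLION_TOKENS = {
    "DeepSeek-R1": 0.14,  # USD, figure quoted in the article
    "OpenAI":      7.50,  # USD, figure quoted in the article
}

monthly_tokens = 5_000_000_000  # assumption: an app processing 5B tokens/month

for model, price in PRICE_PER_MILLION_TOKENS.items():
    bill = monthly_tokens / 1_000_000 * price
    print(f"{model:12s}: ${bill:,.0f}/month")

# At these prices the same workload costs roughly 50x more on the
# closed-source API, which is the gap the article calls a cost advantage.
```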

The second controversy is whether DeepSeek used OpenAI's data. Both OpenAI and Microsoft have raised the question.

On January 31, OpenAI said it had found evidence that DeepSeek used its models for training, in suspected violation of its intellectual-property terms. Specifically, OpenAI said it found signs that DeepSeek had distilled its models: using the outputs of a larger model to improve the performance of a smaller one, thereby achieving similar results on specific tasks at far lower cost. Microsoft said it was investigating whether DeepSeek had used OpenAI's APIs.
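For readers unfamiliar with the term, the sketch below shows the basic mechanics of knowledge distillation in PyTorch: a small "student" model is trained to match the soft output distribution of a larger "teacher". This is the generic textbook formulation, not a description of anything DeepSeek is confirmed to have done; the model sizes and temperature are placeholders.

```python
# Minimal knowledge-distillation sketch (generic technique, illustrative only).
# A small student learns from a large teacher's soft output distribution.
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(128, 512), nn.ReLU(), nn.Linear(512, 10))  # placeholder "large" model
student = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))    # placeholder "small" model
optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)

T = 2.0  # temperature: softens the teacher's distribution so the student sees more signal

x = torch.randn(32, 128)  # stand-in batch of inputs

with torch.no_grad():  # teacher outputs are fixed training targets
    teacher_probs = F.softmax(teacher(x) / T, dim=-1)

student_log_probs = F.log_softmax(student(x) / T, dim=-1)

# KL divergence pulls the student's output distribution toward the teacher's.
loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * T**2

optimizer.zero_grad()
loss.backward()
optimizer.step()
print(f"distillation loss: {loss.item():.4f}")
```

The accusation against DeepSeek is essentially that it used OpenAI API outputs as the "teacher" signal in a pipeline of this shape.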

On this point, although the two companies' accusations have some basis, they run against what has become common practice in the industry.

OpenAI's terms of service stipulate that anyone can register to use its API, but may not use the outputs to train models that pose a competitive threat to OpenAI. In other words, DeepSeek may call OpenAI's API, but may not use the results to train large models. However, many consider this rule a double standard: OpenAI itself trains on vast amounts of data, some of it without the owners' authorization, and using distilled data is common practice in the industry.

Microsoft's own behavior is a better test of whether the accusation holds up: just a few hours after accusing DeepSeek of alleged infringement, it made DeepSeek available on its own AI platform.

2. What is so special about DeepSeek?

Ultra-high performance at ultra-low cost is the biggest shock DeepSeek has delivered to the AI industry. Looking back at how Chinese companies have developed in other industries, they have always excelled at competing on value for money, so it was perhaps inevitable that a DeepSeek would emerge.

As mentioned earlier, the large-model industry used to run on a faith in computing power: whoever wanted stronger products had no choice but to stack up compute and data. This strategy did open the era of large models, and OpenAI overseas as well as Baidu, ByteDance, and others at home benefited from it. It still works, but its marginal returns may be diminishing.

Take OpenAI as an example. From 2012 to 2018, the computing power consumed by the largest AI training runs doubled every 3.4 months on average, a roughly 300,000-fold increase. Sam Altman, CEO of OpenAI, has said in public interviews that GPT-4 has 20 times as many parameters as GPT-3 and requires 10 times as much compute; GPT-5, expected between late 2024 and 2025, would have 100 times as many parameters as GPT-3 and require 200-400 times as much compute.
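As a quick arithmetic check on how fast a 3.4-month doubling time compounds (a generic calculation using only the figures quoted above, not any additional data):

```python
# How a 3.4-month doubling time compounds (the figure cited for 2012-2018).
import math

DOUBLING_MONTHS = 3.4

def growth_factor(months: float) -> float:
    """Multiplicative growth after `months`, given the doubling time."""
    return 2 ** (months / DOUBLING_MONTHS)

# Years of sustained doubling -> total growth factor:
for years in (1, 3, 5):
    print(f"{years} year(s): ~{growth_factor(years * 12):,.0f}x")

# Months of doubling needed to reach the cited ~300,000x increase:
months_needed = math.log2(300_000) * DOUBLING_MONTHS
print(f"~300,000x requires about {months_needed / 12:.1f} years")  # ~5.2 years
```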

If each generation delivered a big performance leap, the high cost would be acceptable. The problem is that if GPT-5 cannot ship this year, or if costs rise 10-fold while performance improves by only 10% or 20%, the followers of this model will thin out considerably.

The root of this situation is that OpenAI is caught in an innovator's dilemma. As the industry pioneer, it carries a huge cost burden, and on that footing a closed-source strategy is reasonable: as long as GPT keeps delivering large performance gains, the market will keep paying the bill.

DeepSeek, by contrast, has adopted an open-source strategy: its code and model weights are freely available online for modification and redistribution. If GPT-5's performance really improves by only 10%, many will choose open source instead, helping DeepSeek become the Android of the AI era. On the premise of comparable performance, DeepSeek's strategy is simply more universal.

Simply put, DeepSeek has not delivered disruptive innovation, but its strategy gives the industry a more universal direction: high-performance large models no longer require piling up computing power.

Tanishq Mathew Abraham, former director of research at Stability AI, highlighted three innovations of DeepSeek in a recent blog post.

The first is multi-head latent attention (MLA). Large language models are usually built on the Transformer architecture and use the multi-head attention (MHA) mechanism; the DeepSeek team developed a variant, MLA, that both uses memory more efficiently and achieves better performance.

The second is GRPO with verifiable rewards. DeepSeek demonstrated that a very simple reinforcement learning (RL) pipeline can achieve results similar to GPT-4's. More importantly, it developed GRPO, a variant of the PPO reinforcement-learning algorithm that is more efficient and performs better.

The third is DualPipe. Training AI models across many GPUs involves numerous efficiency considerations, and the DeepSeek team designed a new pipeline-parallel method, DualPipe, that is significantly more efficient and faster.
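To make the GRPO idea concrete, here is a minimal sketch of its core trick, group-relative advantage estimation: instead of training a separate value network as PPO does, GRPO samples a group of responses per prompt and normalizes each response's reward against the group's mean and standard deviation. The reward values below are toy placeholders, not DeepSeek's actual numbers.

```python
# Minimal sketch of GRPO's group-relative advantage (toy values, illustrative only).
# PPO needs a learned value network as a baseline; GRPO replaces it with
# statistics computed over a group of sampled responses to the same prompt.
import statistics

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize each response's reward against its group's mean/stddev."""
    mean = statistics.mean(rewards)
    std = statistics.stdev(rewards) or 1.0  # guard against a zero-variance group
    return [(r - mean) / std for r in rewards]

# Suppose we sample 6 candidate answers to one math prompt and score each
# with a verifiable reward (1.0 = correct final answer, 0.0 = wrong):
rewards = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0]

advantages = group_relative_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f} -> advantage={a:+.2f}")

# Correct answers get positive advantages (reinforced), wrong ones negative
# (suppressed): no value network and no human preference labels required.
```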

Zhu Xiaohu, managing director of GSR Ventures (Jinshajiang Venture Capital), put it this way: the core of DeepSeek's approach is that it no longer requires human intervention. Where the pipeline used to rely on RLHF (reinforcement learning from human feedback), it now does RL (reinforcement learning) directly, so the cost can be driven very low.
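The reason human feedback can be skipped is that in domains like math and code, correctness can be checked mechanically. Below is an illustrative rule-based "verifiable reward" function of the kind such pipelines use; the format markers and scoring weights are assumptions for this sketch, not DeepSeek's published implementation.

```python
# Illustrative rule-based reward for math answers: no human labeler needed.
# Format markers and scoring weights are assumptions for this sketch.
import re

def verifiable_reward(response: str, ground_truth: str) -> float:
    """Score a model response; correctness is checked mechanically."""
    reward = 0.0

    # Small bonus for following the expected output format.
    if "<answer>" in response and "</answer>" in response:
        reward += 0.1

    # Main reward: extract the final answer and compare to ground truth.
    match = re.search(r"<answer>(.*?)</answer>", response, re.DOTALL)
    if match and match.group(1).strip() == ground_truth.strip():
        reward += 1.0

    return reward

# A correct, well-formatted response scores highest:
print(verifiable_reward("Reasoning... <answer>42</answer>", "42"))  # 1.1
print(verifiable_reward("Reasoning... <answer>41</answer>", "42"))  # 0.1
print(verifiable_reward("The answer is 42", "42"))                  # 0.0
```

Replacing a panel of human raters with a function like this is what collapses the cost of the RL stage.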

Overall, DeepSeek's innovation lies in the reasoning process: through engineering innovation, it attacks the pain points large models face at the reasoning stage and greatly improves product performance for the cost. In a sense this was a foreordained outcome. From daily necessities to mobile phones and automobiles, Chinese companies have always excelled at competing on value for money, and DeepSeek has carried that tradition into large models.

3. Who will benefit? Who will be impacted?

There is no doubt that DeepSeek, the second large model after OpenAI's to have a major impact on the industry, will benefit some players and hurt the interests of others.

On the surface, the hardest hit so far is Nvidia, the GPU supplier, whose market value at one point fell by more than US$600 billion on DeepSeek's rise. But that is only the surface. The ones most affected are actually the closed-source large-model makers led by OpenAI.

For Nvidia, DeepSeek's alternative path does undercut the large-model faith in raw computing power to a degree. But whether it is DeepSeek or OpenAI, training still requires Nvidia's GPUs, and even if other model makers switch to DeepSeek's models, they still rely on Nvidia. After Watt improved the steam engine in the late 18th century, more efficient engines came into wide use; this did not reduce the demand for coal. Instead, Britain's total coal consumption rose, a pattern known as the Jevons paradox. The same logic applies to the computing-power market.

In contrast, DeepSeek hits the closed-source camp led by OpenAI harder. As noted earlier, if OpenAI cannot prove that its ten-thousand-GPU-cluster approach keeps delivering big performance gains, it will not only face questions from investors but also lose users, and its business model will become hard to sustain.

DeepSeek will also affect traditional search vendors. This already happened once after ChatGPT exploded: the logic then was that the efficiency and low cost of large models would erode Google's share of the search market. In the PC internet era, search was the first killer application, and the industry broadly believes the first killer application of the AI era will also be search.

At the same time, as DeepSeek accelerates AI's shift from the training phase to the inference phase, demand for inference chips will grow.

Specifically, inference means using a trained model to make predictions or decisions on new inputs, and it is where DeepSeek's strengths and innovations lie. Many industry insiders believe that as customers adopt and build on DeepSeek's open-source models, demand for inference chips and inference computing will increase.
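One concrete reason open-source weights translate into inference demand: anyone can host the model themselves and serve it through an OpenAI-compatible API, as inference servers such as vLLM do. The sketch below assumes such a server is already running locally; the base URL, API key, and model name are placeholders, not real endpoints.

```python
# Sketch: querying a self-hosted open-source model through an
# OpenAI-compatible endpoint (e.g., one served by vLLM).
# The base_url, api_key, and model name below are placeholder assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # assumed local inference server
    api_key="not-needed-for-local",       # many local servers ignore the key
)

response = client.chat.completions.create(
    model="deepseek-r1",  # placeholder model identifier
    messages=[
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ],
)

print(response.choices[0].message.content)
```

Every such self-hosted deployment consumes GPU time at inference rather than training, which is exactly the demand shift the chipmakers quoted below are anticipating.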

Sid Sheth, CEO of AI chip startup d-Matrix, said DeepSeek has shown that smaller open-source models can be trained to be as powerful as, or more powerful than, large proprietary models, and at low cost. As small, capable models spread, they catalyze the era of inference. As costs fall, the adoption of AI applications may grow exponentially, and demand for computing power at the inference stage may explode.

It is worth noting that although DeepSeek's models are distinctive, its open-source strategy means competitors can use its techniques to build similar products, which challenges its path to commercialization. Already, researchers at Stanford University and the University of Washington, including Fei-Fei Li, have trained s1, a model similar to R1, for less than US$50 in cloud-computing fees; in tests of mathematical and coding ability, s1 performs on par with OpenAI's o1 and DeepSeek's R1.

DeepSeek's achievements deserve the attention they are getting, but in the long run it still needs to find a suitable business model to go further.
