
The first batch of DeepSeek developers have begun to flee

By Alphabet List | Author: Ma Shuye | Editor: Zhao Jinjie

Behind DeepSeek's constant "server busy" replies lies more than the anxious waiting of ordinary users. As API response failures broke through a critical threshold, a turbulent butterfly effect also swept through the world of DeepSeek's developers.

On January 30, Lin Sen, a Beijing-based AI developer whose product was connected to DeepSeek, suddenly received an alert from his program's backend. Before he could enjoy DeepSeek's breakout release for even a few days, his program was paralyzed for three days because it could not call DeepSeek.

At first, Lin Sen assumed the balance in his DeepSeek account was simply insufficient. It was not until he returned to work after the Spring Festival holiday, on February 3, that he received a notice from DeepSeek suspending API recharges. By then, even though his account balance was sufficient, he could no longer call DeepSeek.

Three days after Lin Sen received the backend notification, DeepSeek issued an official announcement on February 6 suspending API recharges. Nearly half a month later, as of February 19, recharges on DeepSeek's open platform had still not returned to normal.


Illustration: The DeepSeek developer platform has not resumed recharges. Source: screenshot by Alphabet List

After realizing that the backend paralysis was caused by overloaded DeepSeek servers, and after going several days without any advance notice or after-sales support, Lin Sen felt he had been abandoned.

"It's like a small shop near your home. You're a regular, you've bought a membership card, and you get along well with the owner. Then one day the restaurant is rated a Michelin restaurant, and the owner pushes the old customers aside and refuses to honor the cards he once issued," is how Lin Sen described it.

Having first deployed DeepSeek back in July 2023, Lin Sen was excited to see it break into the mainstream. But now, to keep his service running, he can only switch to ChatGPT. "After all, although ChatGPT is a bit more expensive, it is at least stable."

As DeepSeek transformed from a word-of-mouth corner shop into a Michelin restaurant where internet celebrities check in, more developers like Lin Sen, locked out of API calls, began to flee DeepSeek.

In June 2024, the Xiaowindow AI answering machine connected to DeepSeek V2 in the early stages of the product. What surprised Xiaowindow partner Lou Chi was that, at the time, DeepSeek was the only large model able to recite "Yueyang Tower" in full without errors. The team therefore gave DeepSeek one of the product's most core functional roles.

But for developers, however good DeepSeek is, its stability has always been lacking.

Lou Chi told Alphabet List (ID: wujicaijing) that during the Spring Festival, it was not only C-side users who ran into busy servers; developers were also frequently unable to call DeepSeek. The team decided to call DeepSeek simultaneously through several large model platforms that had already connected to it.

"After all, dozens of platforms already host the full-strength version of DeepSeek R1." Using R1 on those platforms, combined with agents and prompts, can also meet user needs.

To compete for the developer community spilling out of DeepSeek, some leading cloud vendors have begun holding frequent developer events. "Join an event and they hand out free computing power. As long as you are not making massive call volumes, small developers can use it almost for free," said Yang Huichao, technical director of Yibiao AI.

Still, DeepSeek remains wildly popular. Even as the first batch of developers flee, more developers are flocking in, hoping to ride its traffic dividend.

Xi Jian's project is an AI companion app that role-plays by calling DeepSeek's API. It gained about 3,000 active users in the first week after launching on February 2.

Although some users reported errors in DeepSeek's API calls, 60% of users are already hoping Xi Jian will launch an Android version as soon as possible. In Xi Jian's social media inbox, at least dozens of users a day send private messages asking for download links. Being an AI companion platform built on DeepSeek has undoubtedly become a new label that helps the app break out.

According to Alphabet List's statistics, the list of apps connected to DeepSeek maintained on the DeepSeek official website contained only 182 entries before 2025; it has since expanded to 488.

On the one hand, DeepSeek has become a domestic sensation, amassing 100 million users in seven days. On the other, the first developers to build on DeepSeek have switched to other large models because of the "server busy" errors caused by overloaded traffic.

For developers, prolonged service failures are no longer simple outages; they have become cracks between the code world and business logic, forcing developers to do survival math against the cost of migration. Whether flooding in or fleeing, developers all have to face the aftershocks of the DeepSeek explosion.

01

His mini program was paralyzed for three days during the Spring Festival. On the sixth day of the Lunar New Year, to keep the program running, Lin Sen left DeepSeek, which he had relied on for more than a year, and switched back to ChatGPT.

Even though API calls cost nearly 10 times more, ensuring service stability had become the higher priority.

It is worth noting that for a developer, leaving DeepSeek for another large model is not as easy as a user switching models inside an app. "Different large language models, and even different versions of the same model, respond to prompts with subtle differences." Even though Lin Sen had continued to call ChatGPT all along, migrating every key node from DeepSeek to ChatGPT while keeping the output stable and high-quality still took him more than half a day.

"The switch itself may take only two seconds, but for most developers, moving to a new model means a week of repeatedly adjusting prompts and re-running tests," Lin Sen told Alphabet List.

In the view of young developers like Lin Sen, DeepSeek's server shortfalls are understandable, but advance notice would have avoided many losses, whether in time or in app maintenance costs.

After all, "you have to register a mobile phone number to log in to the DeepSeek developer backend, so a single text message would have been enough to notify developers in advance." Those losses are now borne by the developers who supported DeepSeek when it was still unknown.

When developers are deeply coupled to a large model platform, stability becomes an unspoken contract. A frequently fluctuating service interface is enough to make developers re-examine their loyalty to the platform.

Just last year, when Lin Sen was calling Mistral (the leading French model company), a bug in Mistral's billing system charged him twice. After he sent an email, Mistral fixed the problem in less than an hour and attached a 100-euro voucher as compensation. That kind of response earned Lin Sen's trust, and he has now moved some of his services back to Mistral.

Yang Huichao, technical director of Yibiao AI, began to plan an escape after the release of DeepSeek V3.

Never mind using DeepSeek to write poems or vent; what happens when DeepSeek is used to write tenders? Yang Huichao, who is in charge of the company's AI bidding project, began looking for alternatives after DeepSeek launched V3. For him, "in a professional field such as bidding, DeepSeek's stability is increasingly insufficient."

The much-hyped reasoning ability of DeepSeek R1 does not appeal to Yang Huichao. After all, "as a developer, the software's main reasoning capability rests on programs and algorithms, not heavily on the base model. Even with the oldest GPT-3.5 underneath, algorithmic corrections can produce a good result. The model only needs to return stable answers."

In actual calls, DeepSeek struck Yang Huichao as a smart but lazy student.

After the V3 upgrade, Yang Huichao found that DeepSeek had a higher success rate on some complex questions, but its instability had also climbed to an unacceptable level. "Now out of 10 questions, at least one produces unstable output. Beyond the content it is asked to generate, DeepSeek often likes to improvise and produce extra content irrelevant to the question."

For example, garbled characters are not allowed in a tender document. At the same time, for results returned by a large model, developers usually instruct it to output data in a JSON structure (so that every call reliably returns the same fixed fields) for subsequent function calls; errors or deviations in that structure cause the downstream calls to fail.
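To make that failure mode concrete, the sketch below shows one common way developers enforce a fixed structure: request JSON-only output from an OpenAI-compatible endpoint and validate the expected fields before handing the result to downstream calls. This is not the code of the developers quoted here; the endpoint, model name, and field names are illustrative assumptions.

```python
# Minimal sketch: force JSON output and validate fields before downstream use.
# The endpoint, model name, and field names ("title", "summary") are assumptions.
import json
from openai import OpenAI

client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

REQUIRED_FIELDS = {"title", "summary"}  # hypothetical fields the pipeline expects


def extract_fields(text: str, retries: int = 3) -> dict:
    """Ask the model for a fixed JSON structure; retry if the structure drifts."""
    for _ in range(retries):
        resp = client.chat.completions.create(
            model="deepseek-chat",
            messages=[
                {"role": "system",
                 "content": "Reply ONLY with JSON containing keys: title, summary."},
                {"role": "user", "content": text},
            ],
            response_format={"type": "json_object"},  # OpenAI-style JSON mode
        )
        try:
            data = json.loads(resp.choices[0].message.content)
            if REQUIRED_FIELDS <= data.keys():
                return data  # structure intact; safe for subsequent function calls
        except json.JSONDecodeError:
            pass  # garbled or free-form output; retry
    raise RuntimeError("Model did not return the required JSON structure")
```

If the model improvises or returns malformed JSON, the validation step catches it instead of letting a broken payload propagate into the tender-generation pipeline.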

"DeepSeek R1 may have improved its reasoning a great deal compared with the earlier V3, but its stability does not reach commercial levels," Yang Huichao noted on his @Productivity Mark account.


Illustration: Garbled characters produced by DeepSeek V3 during generation. Source: @Productivity Mark account

As one of the earliest users, on board since the DeepSeek-Coder period in early 2024, Yang Huichao does not deny that DeepSeek is a good student. But now, to guarantee the quality and stability of the generated tenders, he can only turn to other domestic large model companies that lean more toward B-end users.

After all, DeepSeek, once a byword for cost-effectiveness in the AI industry, quickly gathered a crowd of small and medium-sized AI developers with that label. But now, calling DeepSeek directly and stably means deploying it locally. "Deploying a DeepSeek R1 costs 300,000 to 400,000 yuan. If I stuck with the online API, I would never spend 300,000 yuan in my lifetime."

Neither cheap enough nor stable enough: developers like Yang Huichao, with no way to call the API, are leaving DeepSeek in batches.

02

Once upon a time, developers like Lin Sen were among the first to firmly choose DeepSeek.

In June 2024, when Lin Sen was developing his AI mini program "Youth Listening to the World", he compared dozens of large model platforms at home and abroad. He needed a large model to process thousands of news items every day, filtering and ranking them to find science and nature news suitable for young listeners, and then processing the news text.

That required the big model to be not only smart but also cheap.

Processing thousands of news items a day consumes a lot of tokens. For an independent developer like Lin Sen, ChatGPT's models are expensive and only suitable for the core steps; rapid screening and analysis of large volumes of text needs support from cheaper large models.

At the same time, whether it is Mistral, Gemini, or ChatGPT, calling overseas models is cumbersome: you need an overseas server to act as a relay, and a foreign credit card to purchase tokens.

Lin Sen could only recharge his ChatGPT account through a British friend's credit card, and with the server overseas, API responses were also delayed. This made Lin Sen turn his attention to China in search of a ChatGPT alternative.

DeepSeek surprised Lin Sen. "DeepSeek was not the most famous at the time, but it gave the most stable feedback." Taking one API call every 10 seconds as an example, out of 100 calls the other domestic large models might return nothing roughly 30% of the time, while DeepSeek returned every time and matched the response quality of ChatGPT and BAT's large model platforms.

Compared with the API prices of ChatGPT and BAT's large models, DeepSeek was simply too cheap.

After Lin Sen handed most of the news reading and preliminary analysis over to DeepSeek, he found its call cost was 10 times lower than ChatGPT's. After prompt optimization, calling DeepSeek cost as little as 2-3 yuan per day. "It may not be the best compared with ChatGPT, but DeepSeek's price is extremely low. For my project, the value for money is very high."


Illustration: Lin Sen used a large model to collect and analyze news (left), with the results presented in the "Youth Listening to the World" mini program (right). Source: provided by Lin Sen

Cost performance became the primary reason developers chose DeepSeek. In 2023, Yang Huichao first switched the company's AI project from ChatGPT to Mistral, mainly to control costs. Then, in May 2024, DeepSeek released V2 and priced its API at 2 yuan per million tokens, an asymmetric blow to other large model vendors. That became the reason Yang Huichao moved the company's AI bidding tool to DeepSeek.

At the same time, after testing, Yang Huichao found domestic B-side platforms too heavyweight.

For a startup like Yibiao AI, choosing BAT means facing bundled cloud-service consumption. For Yang Huichao, who simply wants to call a large model service, DeepSeek's API is undoubtedly more convenient.

DeepSeek also won on migration cost.

Whether for Lin Sen or Yang Huichao, their apps were initially built on the OpenAI interface format. Switching to BAT's large model platforms would mean redeveloping the underlying layer. DeepSeek, however, is compatible with the OpenAI-style interface: switching models only requires changing the platform address, "a painless switch in one minute."
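As an illustration of that "only change the platform address" point, here is a minimal sketch using the openai Python SDK, where moving between an OpenAI-hosted model and an OpenAI-compatible provider such as DeepSeek comes down to the base URL, API key, and model name. The URLs and model names are assumptions for illustration, not details taken from the developers quoted above.

```python
# Minimal sketch of switching providers behind an OpenAI-compatible interface.
# Only the configuration changes; the calling code stays the same.
from openai import OpenAI

# Original setup pointed at OpenAI (default base_url):
# client = OpenAI(api_key="OPENAI_KEY")
# model = "gpt-4o-mini"

# Switching to an OpenAI-compatible provider is a configuration change:
client = OpenAI(base_url="https://api.deepseek.com", api_key="DEEPSEEK_KEY")
model = "deepseek-chat"

resp = client.chat.completions.create(
    model=model,
    messages=[{"role": "user", "content": "Summarize today's science news."}],
)
print(resp.choices[0].message.content)
```

Because the request and response shapes are shared, prompts and parsing logic carry over; as the article notes, what still takes time is re-tuning prompts for the new model's behavior.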

DeepSeek was on board from the first day the Xiaowindow AI answering machine officially went on sale, with two of the five core character roles, language and composition guidance, handed over to DeepSeek.

As a partner, Lou Chi was also amazed by DeepSeek in June last year. "DeepSeek has a great ability to understand Chinese, and it was the only big model at the time that could recite the full text of 'Yueyang Tower' without mistakes," Lou Chi told Alphabet List. Compared with the formulaic, lecture-style output of other large models, using DeepSeek to teach children to write compositions often wins on imagination.

Before writing poetry and science fiction with DeepSeek became a social media trend, DeepSeek's flamboyant writing style had already caught the eye of the Xiaowindow AI team.

Developers are still looking forward to DeepSeek restoring API access. For now, whether they migrate to BAT platforms that host the full-strength DeepSeek R1 or move to other large model vendors, keeping their products running is what matters most.

03

Meanwhile, competitors are trying to match the deep reasoning strength that carried DeepSeek into the mainstream.

Domestically, Baidu and Tencent have recently added deep-thinking capabilities to their self-developed large models. Abroad, OpenAI rushed out Deep Research in February, applying a reasoning model's thinking ability to online search and making it available to Pro, Plus, and Team users. Google DeepMind also released the Gemini 2.0 series in February, including the 2.0 Flash Thinking experimental version, a model with enhanced reasoning.

It is worth noting that DeepSeek still focuses on text, while both ChatGPT and Gemini 2.0, beyond supporting deep thinking, have brought reasoning into multimodality, supporting inputs such as video, voice, documents, and pictures.

For DeepSeek, beyond catching up on multimodality, the bigger challenge comes from competitors' prices closing in.

On the cloud deployment side, a number of leading cloud vendors have connected to DeepSeek to share its traffic while using cloud services to lock in customers. Calls to the DeepSeek model have, to some extent, even become a free gift bundled with enterprise cloud services.

Robin Li, founder of Baidu, recently argued that in large language models, "inference costs can be cut by more than 90% every 12 months."

As inference costs trend downward, it is inevitable that BAT's API prices will keep falling, and DeepSeek's cost-performance advantage faces the pressure of a new round of price wars among the major vendors.

However, the price war over large model APIs is only the beginning. For developers, the large model vendors are also competing on service.

Lin Sen has dealt with many large model platforms, big and small. What impressed him was that one major technology company assigned a dedicated account manager to him; whether the issue was instability or a technical problem, they would proactively contact the developer.

Although DeepSeek, as an open source large model platform, aims to give developers more inclusive AI support, its official website does not even have an entry for issuing invoices to developers.

"Every time you recharge the API, unlike other large model platforms where you can issue an invoice directly from the backend, with DeepSeek you have to go around the official website and add customer service on WeChat Work to get an invoice," Yang Huichao told Alphabet List. Whether on price or service, DeepSeek's cost-effective label is starting to look shaky.

The AI product manager of a leading vendor told Alphabet List that some internet company bosses insist on replacing their existing large model with DeepSeek, regardless of the time needed to swap the model and re-tune the prompts. Meanwhile, even the full-strength DeepSeek R1 lacks many common capabilities, such as function calling.

Compared with BAT, which have long used cloud services to serve B-end scenarios, DeepSeek still lags far behind the major AI vendors in convenience.

However, DeepSeek's traffic effect has not faded, and plenty of bandwagon riders remain.

Some companies claim to have connected to DeepSeek when they have only just started calling the API and topped up a few hundred yuan. Others announce they have deployed the DeepSeek model when in fact they simply had employees follow a Bilibili tutorial and download a one-click installation package. In this DeepSeek craze, the good is mixed with the bad.

The tide will eventually fade, but DeepSeek obviously has more homework to do.
