Hands-on test of Doubao's “deep thinking” mode: can it surpass DeepSeek?

Image source: Generated by AI

Doubao, ByteDance's AI assistant, is running a small-scale test of a deep thinking model. According to the person in charge at Doubao, what is currently being tested are different experimental versions of its in-house deep thinking model.

There are also reports that the deep thinking model being tested by Doubao is based on the Doubao 1.5 base model.

In fact, when the Doubao large model team released Doubao 1.5 Pro in mid-January, it announced the deep reasoning model Doubao-1.5-pro-AS1-Preview and said: “Without using any data from other models, through breakthroughs in RL algorithms and engineering optimization, we fully exploited the compute advantages of test-time scaling, completed RL scaling, and developed Doubao's deep thinking model.”

Geek Park's hands-on testing found that, in conversations with Doubao, some answers did begin to show a chain of thought that laid out the reasoning process, but this did not happen consistently. At present, there is no entry point for the “Deep Thinking” function on Doubao's chat page.

Since February 22, Doubao has been overtaken by Tencent's AI application “Tencent Yuanbao” and now ranks third on the free app download chart of Apple's App Store in China (DeepSeek is first). After multiple Tencent and Baidu applications integrated DeepSeek, how ByteDance's Doubao would respond became the focus of attention, and now the answer is emerging.

01、Is Doubao doing “Deep Thinking” too?

The earliest model with deep thinking capabilities was OpenAI's o1 series, which debuted as a preview in September 2024 and launched in full in December 2024; it is closed source and limited to paying users (up to $200 per month). DeepSeek became the first AI company to popularize deep thinking capabilities at scale, through open source, cost reduction, and interaction design. DeepSeek released R1-Lite-Preview on November 20, 2024, the first Chinese reasoning model benchmarked against o1, and open-sourced the R1 model on January 20, 2025.

The R1 model's innovations fall into two groups. The first is a transparent chain of thought: it displays the complete reasoning process, including human-like thinking paths such as self-questioning and hypothesis verification. The second is low cost and open source: the inference cost of R1 is only 1/27 of that of OpenAI o1, and the code is fully open.

DeepSeek's deep thinking mode is a feature that improves user understanding by exposing the AI model's reasoning process. Chain of Thought (CoT) is the core technique behind this mode.

Simply put, the deep thinking mode lets users see the model's thinking process directly by displaying its chain of thought (CoT). This chain of thought is simulated: through training, the model learns to output intermediate steps such as self-questioning and reflection. Although these steps are just a sequence of words, they look like a human thinking process.

In the deep thinking mode, users see not only the AI's final answer but also the complete logical chain the model follows to solve the problem, including self-questioning, hypothesis verification, and error correction. For example, when solving a math problem, the model shows the entire process, from breaking down the problem, through verification by multiple methods, to the final conclusion.
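To make the split between "thinking process" and "final answer" concrete, here is a minimal sketch of how a client might separate the two before rendering them. It assumes the raw model output wraps its reasoning in `<think>...</think>` delimiters; that convention, the function name `split_reasoning`, and the sample text are all illustrative assumptions, not something this article documents for Doubao or DeepSeek.

```python
import re

# Assumed delimiter convention for the reasoning trace (hypothetical for
# the products discussed in this article).
THINK_PATTERN = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(raw: str) -> tuple[str, str]:
    """Return (reasoning, answer) extracted from a raw model output string.

    If no reasoning block is present, the reasoning part is empty and the
    whole text is treated as the answer.
    """
    match = THINK_PATTERN.search(raw)
    if match is None:
        return "", raw.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first reasoning block is the visible answer.
    answer = (raw[:match.start()] + raw[match.end():]).strip()
    return reasoning, answer


# Hypothetical sample output, written for this sketch.
sample = (
    "<think>Compare the tenths digits: 9 > 1, so 9.9 is larger.</think>"
    "9.9 is bigger than 9.11."
)
reasoning, answer = split_reasoning(sample)
print("Thinking:", reasoning)
print("Answer:", answer)
```

A chat interface could then show the first part in a collapsible "thinking" panel and the second part as the reply, which is roughly what the screenshots in this article depict.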

Combined with real-time web search, the model can retrieve the latest information and integrate it logically. On the 25th, Anthropic released its hybrid reasoning model Claude 3.7 Sonnet, and Alibaba Cloud's Qwen reasoning model “QwQ-Max Preview” also made its debut. I asked Doubao to evaluate these two reasoning models:

Doubao found 9 sources and carried out “deep thinking” | Photo source: Geek Park

Doubao displays its thinking process | Photo source: Geek Park

After thinking, Doubao outputs its evaluation of the two models | Photo source: Geek Park

Displaying the thinking process lets users clearly see the model's reasoning steps, not just the final result. Users can see that the model's conclusions are well founded, and so place more trust in its output.

02、Doubao vs. DeepSeek: each has its own merits

Because the feature is still in testing, there is currently no entry point for “Deep Thinking” on Doubao's chat page. When typing a message, there is no checkbox for turning “Deep Thinking” on or off, as there is in other products that have integrated DeepSeek. Instead, users included in the gray (staged) rollout trigger the feature when they ask certain questions.

I asked Doubao and DeepSeek several of the same questions to see how the two differ in “deep thinking”.

Classic math question: “Which is bigger, 9.11 or 9.9?”

Let's first look at Doubao's thinking process:

First, a note: during testing I found that Doubao's “deep thinking” mode did not trigger consistently. The first time I typed “Which is bigger, 9.11 or 9.9?”, it simply gave me a direct answer:

Photo source: Geek Park

But when I typed “Which is bigger, 9.11 or 9.9?” again to see whether it would trigger the “Deep Thinking” mode, this time it did:

Doubao considered in detail why I had asked the same question a second time… | Photo source: Geek Park

Although Doubao recognized that it had just answered this question, it still carefully considered the various reasons I might not have understood the previous answer, and then presented the comparison method along with its final result.

Now look at DeepSeek's thinking process:

[Images 9–12: screenshots of DeepSeek's thinking process]

Even though this is a seemingly simple question, DeepSeek's thinking process is very detailed, and more comprehensive than Doubao's.

On this simple math problem, both Doubao and DeepSeek follow the basic rules of decimal comparison and verify the result with several methods. The difference is that Doubao leans toward teaching and guidance and takes the user's possible misunderstandings into account, while DeepSeek does more self-questioning and repeated verification, which makes its thinking process more complicated.
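The idea of checking one decimal comparison with several methods can be sketched in a few lines of Python. This is only an illustration of the general technique, not a reconstruction of what either model does internally; the three function names are hypothetical, and the sketch assumes non-negative decimal strings.

```python
from decimal import Decimal


def compare_direct(a: str, b: str) -> str:
    """Method 1: compare the exact decimal values directly."""
    return a if Decimal(a) > Decimal(b) else b


def compare_by_padding(a: str, b: str) -> str:
    """Method 2: pad the fractional parts to equal length, then compare
    the integer part first and the fractional part second.
    Assumes non-negative inputs."""
    int_a, _, frac_a = a.partition(".")
    int_b, _, frac_b = b.partition(".")
    width = max(len(frac_a), len(frac_b))
    # "9.9" becomes (9, 90) and "9.11" becomes (9, 11) when width is 2.
    key_a = (int(int_a), int(frac_a.ljust(width, "0") or "0"))
    key_b = (int(int_b), int(frac_b.ljust(width, "0") or "0"))
    return a if key_a > key_b else b


def compare_by_difference(a: str, b: str) -> str:
    """Method 3: subtract and check the sign of the difference."""
    return a if Decimal(a) - Decimal(b) > 0 else b


methods = (compare_direct, compare_by_padding, compare_by_difference)
results = [method("9.11", "9.9") for method in methods]
print(results)
```

All three methods agree that 9.9 is the larger number. The padding method makes the reason visible: once both fractions have two digits, the comparison is between .90 and .11.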

Philosophical question: What is the nature of consciousness? Will AI gain self-awareness?

Let’s first look at Doubao’s answer:

[Images 13–17: screenshots of Doubao's answer]

Now let's look at DeepSeek's answer:

[Images 18–22: screenshots of DeepSeek's answer]

DeepSeek's answer is divided into four parts: scientific theory, paths to AI consciousness, ethical framework, and possible solutions. It cites neuroscience, quantum theory, and other fields, and also mentions legal cases and specific data. Doubao's answer leans more toward classifying philosophical theories: it lists physicalism, dualism, and others, and discusses the arguments for and against AI rights, but without going into technical detail.

Both acknowledge that there is no consensus on the nature of consciousness, and both mention philosophical and scientific theories as well as ethical issues. The difference lies in depth and technical detail. DeepSeek is more technology-oriented and touches on neuromorphic computing, quantum sealing technology, and similar topics, while Doubao focuses more on philosophical schools and existing ethical guidelines.

This hands-on test gave us a first look at how Doubao performs in deep thinking mode. Although the feature is still in testing, does not yet trigger consistently, and has no public entry point, its early display of the reasoning process already gives users a more intuitive way to understand the model's answers.
