
Three Soul-Searching Questions for DeepSeek

By | Arterial Network (vbdata)

Few technologies have reached so many industries so quickly after breaking into the mainstream as large language models. Yet before a viable business path emerged, the industry became entangled in debates over parameters, cost, and performance, and found itself mired in a computing-power arms race.

In January this year, DeepSeek-R1 arrived and rewrote the rules of a game that GPT-series models had dominated for the past year. Through an innovative model architecture and training optimization strategies, DeepSeek demonstrated to the industry that a model of limited parameter scale can still deliver high-performance general-purpose capability.

Beyond breaking the computing-power monopoly, innovations such as parameter-efficient fine-tuning (PEFT) and the Mixture-of-Experts (MoE) architecture have also lowered the entry threshold for large models.

With low cost and a home-grown pedigree, a large number of top domestic hospitals and cutting-edge medical technology companies are deploying it rapidly, and even medical insurance bureaus have announced high-profile integrations with DeepSeek, pushing large models back into the spotlight.

Is this bandwagon-hopping or a genuinely new path? Arterial Network recently spoke with medical technology companies that have integrated DeepSeek to answer three questions in turn: what real value DeepSeek brings to medicine, how hospitals should apply it, and where DeepSeek-based medical applications stand today.

With low-cost computing power in demand, does primary care become a new opportunity?

Long before DeepSeek-R1 appeared, hospitals in China were already deploying general-purpose models and proactively exploring generative AI.

Because clinically relevant data cannot leave the hospital premises, large models at the time could only be brought in as packaged, on-premises deployments. The problem is that most hospitals' computing environments consist almost entirely of CPUs for general-purpose workloads; few have the GPU resources needed for parallel computation, so sufficient computing power is hard to come by.

The computing-power dilemma is ultimately a cost dilemma. Only the best-resourced hospitals can spend heavily on a full GPU cluster to bring a general-purpose model fully in-house and serve the entire hospital; a few can afford to run slimmed-down models for specific departments.

With most medical institutions unable to freely deploy large models and build clinical applications on them, companies working on medical large models have struggled as well. Without enough buyers, it is hard for them to sustain heavy R&D investment in this direction.

The emergence of DeepSeek-R1 broke this stalemate. Through its innovative architecture and open-source release, it addresses the cost of deploying and operating general-purpose models at its root.

Wu Di, CEO of Fuxin Science and Technology Innovation, explained: because DeepSeek-R1 uses a Mixture-of-Experts (MoE) architecture, only about 37 billion of its 671 billion total parameters are activated per inference, avoiding the high computing cost of traditional dense models, which must activate every parameter. In theory this preserves reasoning accuracy while cutting computing-power consumption by more than 40%. And when a company needs to scale the model up, it does not have to increase its computing investment linearly to gain capability.
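To illustrate the mechanism Wu Di describes, here is a toy sketch of MoE routing in Python. The expert count, dimensions, and routing scheme are invented for illustration and bear no relation to DeepSeek's actual implementation beyond the core idea: only a few experts run per token, so most parameters stay inactive on any given forward pass.

```python
# Toy Mixture-of-Experts routing: only the top-k experts run for each token.
# Didactic sketch only, not DeepSeek's architecture.
import numpy as np

rng = np.random.default_rng(0)
n_experts, top_k, d_model = 8, 2, 16                      # hypothetical sizes
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]
router = rng.standard_normal((d_model, n_experts))

def moe_forward(x):
    """Route one token vector to its top-k experts and mix their outputs."""
    logits = x @ router                                   # router score per expert
    top = np.argsort(logits)[-top_k:]                     # indices of the k best experts
    weights = np.exp(logits[top]) / np.exp(logits[top]).sum()  # softmax over chosen experts
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d_model)
out = moe_forward(token)
# Only top_k / n_experts = 25% of the expert parameters were touched for this token,
# analogous to R1 activating ~37B of ~671B total parameters per inference.
print(out.shape)
```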


Comparison of the capabilities of DeepSeek, GPT o1, and GPT o3 mini (input prices are standard-period, cache-hit prices only; data sources: Arterial Network, Shenzhen Intelligent Medicine)

More importantly, DeepSeek ships under the permissive MIT license, which lets users deploy it locally and freely use, copy, modify, and distribute the software. That encourages enterprises to adopt and integrate it into their products, fosters collaboration and innovation, and pushes the whole ecosystem forward.

This open ecosystem lets ordinary medical institutions build medical large models that fit their actual application scenarios and business needs. If an institution only deploys distilled models of under 100B parameters, even the modest graphics hardware already in the hands of primary-care institutions can run them smoothly.
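As a rough illustration of what such a lightweight deployment can look like, the sketch below loads a distilled R1-style checkpoint with the Hugging Face transformers library. The exact model name, precision setting, and prompt are assumptions; which distilled variant fits depends on the hardware available.

```python
# Minimal sketch: run a distilled R1-style checkpoint locally with transformers.
# The model name is illustrative; pick whichever distilled variant your GPU can hold.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"   # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,   # half precision to reduce GPU memory
    device_map="auto",           # spread layers across available devices
)

prompt = "A 58-year-old patient reports chest tightness on exertion. List the key differentials."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```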

“In our conversations with regional medical institutions, we found their needs are actually quite clear: they hope to bring DeepSeek's reasoning ability down to the grassroots level, because that is where doctors capable of handling complex cases are in shortest supply.”

In short, DeepSeek-R1's value lies in lowering the threshold for large-model applications, opening up new markets for deployment, and accelerating the emergence of vertical applications. In doing so, it gives medical large models a real shot at commercialization.

How can medical institutions make good use of DeepSeek?

As more hospitals plan large-model deployments and more individual doctors dabble in large-model development, upstream medical IT companies have also sprung into action.

According to Zhao Daping, CTO of Weining Health, since the emergence of DeepSeek-R1 the mainstream domestic deployment routes fall roughly into three types. The first is to download the open-source model and complete a local deployment quickly, which mainly suits large hospitals that already have GPU equipment. If a hospital lacks the graphics cards needed for inference, it can instead rent that equipment in the cloud. Finally, some private hospitals choose subscription-based deployment, mainly to serve specific departments.

The boom has also spawned many vendors of large-model all-in-one appliances. In Zhao Daping's view, however, for a large model to run effectively a hospital must first integrate it with its hospital information system, and that information system should itself adopt, as far as possible, an intelligent architecture built to support AI.

So how should a hospital deploy large models under ideal conditions? Zhao Daping believes that as large-model use deepens, future hospital configurations are bound to be diverse and mixed. “A hospital may configure one large model alongside small models for particular service segments. The large model handles interactive scenarios that demand reasoning, deliberation, and diagnosis, while the small models cover scenarios that stress rules, judgment, correction, and simple generation, meeting the need in the most economical and efficient way.”

“Going further, hospitals have many mobile scenarios. If we can put a small model on the phone, a great deal of work in the existing clinical workflow can move to the mobile end, which would greatly improve efficiency.”
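To make Zhao Daping's "one large model plus several small models" configuration concrete, here is a hypothetical routing sketch; every endpoint name, URL, and task label below is invented for illustration and not drawn from any vendor's product.

```python
# Hypothetical router: reasoning-heavy requests go to the large model,
# narrow rule-like tasks go to cheaper small models. Names are illustrative.
from dataclasses import dataclass

@dataclass
class ModelEndpoint:
    name: str
    url: str   # wherever the model is served inside the hospital network

LARGE_REASONER = ModelEndpoint("r1-full", "http://llm-cluster.local/reasoner")
SMALL_MODELS = {
    "quality_check": ModelEndpoint("qc-7b", "http://dept-gpu.local/qc"),
    "form_filling": ModelEndpoint("forms-3b", "http://dept-gpu.local/forms"),
}

def route(task_type: str) -> ModelEndpoint:
    """Diagnosis and complex consults use the large model; the rest use small ones."""
    if task_type in ("diagnosis_support", "complex_consult"):
        return LARGE_REASONER
    return SMALL_MODELS.get(task_type, LARGE_REASONER)   # fall back to the large model

print(route("quality_check").name)      # -> qc-7b
print(route("diagnosis_support").name)  # -> r1-full
```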

Then there are the doctors and other individuals trying to develop clinical applications on their own.

While DeepSeek was going viral, tutorials of every kind appeared, encouraging users to configure and train models themselves. In medicine, however, even though DeepSeek has lowered many of the barriers to model training, localized training of a private model still involves five steps: data preparation and processing, model selection and configuration, model training, model evaluation and tuning, and model deployment and integration (compressed into code in the sketch below). Researchers still need a certain level of technical skill.
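The sketch below compresses those five steps into a minimal parameter-efficient fine-tuning (LoRA) workflow using the Hugging Face peft and transformers libraries; the checkpoint name, dataset path, and hyperparameters are placeholders, not a validated clinical recipe.

```python
# Compressed sketch of the five localization steps via LoRA fine-tuning.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base_model = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"   # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# 1) Data preparation: de-identified clinical text kept inside the hospital network.
dataset = load_dataset("json", data_files="deidentified_notes.jsonl")["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024))

# 2) Model configuration: attach low-rank adapters so only a small fraction of weights train.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# 3) Training.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="r1-clinical-lora", num_train_epochs=1,
                           per_device_train_batch_size=1, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()

# 4) Evaluation/tuning and 5) deployment/integration are omitted here;
# the trained adapters are saved for serving.
model.save_pretrained("r1-clinical-lora")
```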

“Right now the application-development layer around many large models is still immature. Plenty of hospital research teams expect to build an application for a specific scenario as soon as they have bought the cards and configured the model, only to find in practice that they lack the development capability. Before individual doctors can use these tools widely and turn them into research results, we still need service providers to upgrade their interfaces and further simplify the path from large model to application.”

In other words, the joint development of vertical models between enterprises and medical institutions is still the main theme of medical AI.

With DeepSeek, are medical applications beginning to innovate?

Although DeepSeek-R1 is already being deployed at scale in healthcare, it has only been available for a short time. In terms of application scenarios it has not yet broken beyond what large models already do, concentrating instead on cutting deployment and training costs and improving text-processing efficiency. In this early stage, a group of large-model companies focused on Internet healthcare benefited first.

Tencent Health, for example, connects to the DeepSeek series through Tencent Cloud and combines it with its self-developed Hunyuan model to rapidly iterate medical services such as intelligent triage, pre-consultation, health Q&A, imaging report interpretation, and quality control, helping more than 1,000 hospitals across the country upgrade their smart applications quickly.

Tencent's Shenzhen medical insurance application has already equipped its intelligent customer service with the latest AI models. Users can freely choose between DeepSeek, which excels at reasoning, and Tencent Hunyuan, which understands questions along multiple dimensions. Whether a user is asking about complex policies such as “how maternity benefits are calculated” or professional questions such as “how specific outpatient diseases are determined,” the integrated models can combine the user's specific insurance situation to give accurate, reasoned answers, helping users understand the issue as they reply.

As DeepSeek is exposed to more and more medical data, its advantages in hospital scenarios are gradually emerging. Because it demands far less of prompt engineering and is built around chain-of-thought reasoning, DeepSeek markedly improves the transparency and interpretability of AI in clinical diagnosis and helps doctors communicate with the model more efficiently.

In the past, for example, a doctor using a large model to draft a surgical plan had to state the patient's medical history, surgical conditions, and other information completely and explicitly. With DeepSeek, the doctor only needs to enter the key facts, and the model fills in the relevant context on its own during its thinking process.

Medical reasoning, moreover, is an evidence-based process. DeepSeek can not only offer useful diagnosis and treatment suggestions but also lay out the reasoning behind them in detail, including the diagnostic basis, medication choices, and examination items. This transparency has gone a long way toward dispelling doctors' distrust of AI systems, gives doctor-patient communication a clear footing, and promotes the wider use of AI in clinical practice.

“Many doctors pay close attention to the model's thinking process. They skim DeepSeek's line of reasoning, and that interaction is an important way to build doctors' trust.”
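As an illustration of how an application might show that reasoning to a doctor, the sketch below calls an R1-style endpoint and prints the reasoning trace next to the final answer. It assumes DeepSeek's documented OpenAI-compatible API (the base URL, the deepseek-reasoner model name, and the reasoning_content field); a hospital's local deployment may expose the trace differently.

```python
# Sketch: surface the model's reasoning trace alongside its answer.
from openai import OpenAI

client = OpenAI(api_key="YOUR_KEY", base_url="https://api.deepseek.com")

resp = client.chat.completions.create(
    model="deepseek-reasoner",   # assumed model name per DeepSeek's API docs
    messages=[{"role": "user",
               "content": "Suggest first-line antihypertensives for a diabetic patient with CKD."}],
)

msg = resp.choices[0].message
# reasoning_content is the chain-of-thought field described in DeepSeek's docs;
# getattr keeps the sketch safe if a given deployment does not return it.
print("Reasoning shown to the doctor:\n", getattr(msg, "reasoning_content", None))
print("\nFinal recommendation:\n", msg.content)
```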

Many hospitals have already launched large-model applications. Medical documentation is a typical example: companies such as Fuxin Science and Technology Innovation and Weining Health have both built such tools. Fuxin Science and Technology Innovation, for instance, has deployed AI-generated electronic medical record systems in multiple outpatient and inpatient scenarios with hospitals including Wuhan Union Hospital and Zhongnan Hospital of Wuhan University, aiming to make doctors' record-writing more efficient.

In a traditional outpatient visit, a single patient is allotted roughly 10 minutes: about 5 minutes go to writing the electronic medical record, 3 minutes to prescribing drugs and ordering tests, and only about 2 minutes, on average, to the actual consultation. With AI, the system records the doctor-patient conversation in real time, converts it into medical terminology, and automatically drafts the record against the outpatient EMR template, cutting the time spent writing.

“Assuming a doctor sees 50 patients a day, that saves at least an hour of record-writing daily. If the hospital uses the time saved to see more patients, the model generates real economic value for the hospital.” In Wu Di's view, this is currently the most valuable and most readily implementable scenario.
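A quick back-of-envelope check of those figures, using only the assumptions stated in the paragraphs above rather than any measured data:

```python
# Back-of-envelope check: 10-minute visits, 5 minutes of record-writing each,
# 50 patients per day (figures quoted above, not measurements).
patients_per_day = 50
writing_min_per_patient = 5

total_writing_min = patients_per_day * writing_min_per_patient   # 250 minutes/day on records
saved_per_patient_for_1h = 60 / patients_per_day                 # 1.2 minutes/patient

print(total_writing_min)        # 250 -> record-writing already dominates the visit
print(saved_per_patient_for_1h) # 1.2 -> trimming ~24% of writing time frees an hour a day
```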

Because the DeepSeek model itself was not trained on CT or MR imaging data, companies developing imaging applications must assemble their own imaging datasets and build their own models. Compared with the many text-oriented tools, research built on DeepSeek in medical imaging is therefore relatively scarce.

Deep Vision has so far explored DeepSeek mainly at the internal-tool level. For example, it uses DeepSeek for multimodal standardization and enrichment of imaging data, pairing the images with non-image metadata (EMR, HIS/RIS, DICOM headers and the like, which carry a great deal of language information) to make imaging content and naming more consistent and to optimize downstream applications such as hanging protocols, where greater accuracy and consistency translate into doctor efficiency.
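A minimal sketch of the metadata side of that workflow, assuming the pydicom library and an unspecified LLM endpoint for the normalization step; the file path, chosen tags, and prompt are illustrative only.

```python
# Sketch: pull text-bearing fields from a DICOM header and hand them to a
# language model to normalize study naming. The LLM call is a placeholder.
import pydicom

def header_text(path: str) -> dict:
    """Extract a few text-bearing DICOM tags that describe what the image is."""
    ds = pydicom.dcmread(path, stop_before_pixels=True)   # metadata only, no pixel data
    return {
        "StudyDescription": getattr(ds, "StudyDescription", ""),
        "SeriesDescription": getattr(ds, "SeriesDescription", ""),
        "BodyPartExamined": getattr(ds, "BodyPartExamined", ""),
        "Modality": getattr(ds, "Modality", ""),
    }

meta = header_text("example_series/IM0001.dcm")   # illustrative path
prompt = ("Map this study to a standard naming scheme so hanging protocols "
          f"pick it up consistently: {meta}")
# response = some_llm_client.complete(prompt)      # placeholder, endpoint-specific
print(prompt)
```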

On the data-analysis side of quality control, Deep Intelligence is also trying to use large models to improve medical image quality control, anomaly recognition, and the ability to interactively surface workflow problems.

It is worth noting that although DeepSeek-based imaging research is still limited, the industry has already produced a substantial body of work on large imaging models. Some companies have built imaging foundation models on top of GPT and other models and have confirmed in clinical trials that LLMs improve the accuracy and efficiency of medical imaging diagnosis. As DeepSeek's capabilities grow, these companies may gradually switch to domestic general-purpose models as well.

Outside the hospital, drug research and development is another important arena for large models.

Shenzhi Comprehensive Medical, for instance, is trying to use DeepSeek to standardize medical images and thereby better handle issues such as imaging data quality control in drug R&D trials. According to Gong Enhao, the company's CEO, it has signed agreements with a number of international pharmaceutical companies to optimize the imaging trial data of drugs they have in development.

There are also models that don’t use DeepSeek but use similar innovative technologies.

Baitu Shengke's xTrimo family of large models, for example, also adopts an MoE framework. Its V3 version can process seven data modalities, including DNA, RNA, proteins, cells, compound-protein interactions, protein-protein interactions, and living systems, enabling full-scale modeling from base pairs up to cell clusters and supporting research on antibody and cell-and-gene-therapy drugs, target discovery, microbiology, and more.

Still, whether in hospital-facing applications or in cutting-edge drug R&D, developers are so far using DeepSeek and similar models to upgrade existing scenarios; no application has yet emerged that overturns them, so it is too early to speak of true innovation. Then again, DeepSeek-R1 has been out for less than two months. Given time, medical AI may well surprise us.

A long road ahead

Although DeepSeek-R1 has greatly deepened the application of large models in medicine, realistically it will still take a long time before large models become part of hospitals' daily routine.

First, solving complex problems requires a large model to combine patient data across modalities and reason holistically, as a doctor would. Yet during its thinking process DeepSeek can fall into near-endless loops, producing long stretches of output unrelated to the question. In a field as serious and high-volume as medicine, such hallucinations must be eliminated before large-scale deployment is possible.

Second, DeepSeek's home-grown credentials make it more welcome in domestic medical institutions, but large-scale use still has to satisfy medical data privacy and security requirements. DeepSeek deployments therefore need more complete data desensitization and encryption to keep patient data safe (a minimal sketch of desensitization appears after this list of challenges).

Third, DeepSeek fixes the quality and performance shortcomings of earlier large models, but it has not yet found the killer application that would get medical institutions to pay of their own accord. For now, willingness to pay for AI still hinges on user perception and on whether the product genuinely cuts costs, raises efficiency, and drives revenue. For DeepSeek to reach scale, then, acceptance among hospitals and doctors must improve, and it must demonstrate clear gains over traditional AI. As for who pays, a decade of medical AI development suggests that primary care needs large-model support more than the large tiered hospitals do.

Fourth, DeepSeek's technical breakthroughs are not irreproducible. Some GPT-series models have already cut training costs significantly, approaching DeepSeek's level, while steadily improving their logical reasoning. DeepSeek will have to keep consolidating its edge and deliver results on practical clinical problems.
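Returning to the second challenge above, the following is a minimal, rule-based sketch of what desensitization before any text reaches a model might look like. The patterns and placeholder labels are illustrative assumptions; real compliance work requires far more than this (structured-field scrubbing, audit logging, and encryption in transit and at rest).

```python
# Minimal rule-based desensitization: mask obvious identifiers before prompting a model.
# Illustrative only; not a substitute for a full de-identification pipeline.
import re

PATTERNS = {
    "ID_NUMBER": re.compile(r"\b\d{15}(\d{2}[0-9Xx])?\b"),   # mainland ID-card style numbers
    "PHONE": re.compile(r"\b1\d{10}\b"),                      # 11-digit mobile numbers
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

def desensitize(text: str) -> str:
    """Replace matched identifiers with tagged placeholders."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

note = "Patient 110101199003071234, phone 13800138000, admitted 2025-02-10 with chest pain."
print(desensitize(note))
# -> Patient [ID_NUMBER], phone [PHONE], admitted [DATE] with chest pain.
```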

Despite these challenges, there is plenty to be positive about. The influx of medical companies and institutions is bound to generate more vertical applications and widen the path to commercialization for large models.

Nor should the potential of models like DeepSeek be underestimated. At the current pace, general-purpose models go through a full round of iteration roughly every three months. Perhaps in 2025 one of them will stand out, overcome the problems above one by one, and, together with the many medical technology companies, open a new chapter for medical large models.
