You can be willful when you have money, but there is still a lot to do to become the “strongest”.
Grok 3, which Musk calls the smartest AI on Earth, is here.
In a live broadcast watched by millions of people, Musk released Grok 3. Two Chinese researchers, xAI co-founders Tony Wu and Jimmy Ba, also took part in the launch. Judging from the benchmarks, Grok 3 is indeed surprisingly strong, and judging from capital investment, the 200,000-GPU computing cluster behind it is just as staggering.
The release of Grok 3 includes a series of models: Grok 3, Grok 3 mini, as well as updates such as Think, DeepSearch, and Big Brain.
#01. The “smartest AI” title comes from the leaderboards. How does it hold up in practice?
In benchmark tests covering mathematical reasoning, STEM, and science, Grok 3 outperforms models such as GPT-4o, Gemini-2 Pro, Claude 3.5 Sonnet, and DeepSeek-V3. Even the smaller Grok 3 mini ranks near the top.
An early version of Grok 3 also scored high in the Chatbot Arena, a crowdsourced testing platform where different AI models compete head to head and users vote for the better answer. Grok 3 is the first model to break 1400 points, ranking first in all categories.
Since Grok was first released in 2023, its MMLU scores have improved rapidly, especially in 2024, with Grok 2 marking a significant breakthrough and showing how quickly it has caught up with the GPT series.
“Grok 3 has very powerful reasoning capabilities, so in the tests we have conducted so far, Grok 3 has outperformed any released product we know of, which is a good sign,” Musk said via video call at the World Government Summit in Dubai last week.
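To give a rough sense of what a 1400-point score means, here is a minimal sketch that converts a rating gap into an expected head-to-head win rate. It assumes the standard Elo-style logistic formula with a scale of 400; the article does not describe the Arena's exact scoring method, and the opponent ratings below are hypothetical values chosen only for illustration.

```python
def expected_win_rate(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that model A's answer is preferred over model B's.

    Assumption: the standard Elo logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


if __name__ == "__main__":
    # Hypothetical opponent ratings, not taken from the leaderboard.
    for opponent in (1400, 1350, 1300):
        rate = expected_win_rate(1400, opponent)
        print(f"1400 vs {opponent}: expected win rate {rate:.3f}")
```

Under this assumption, even a 100-point lead corresponds to winning roughly 64% of votes rather than a sweep, so a narrow lead at the top of the list still leaves many head-to-head losses.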
Grok 3 also introduces a reasoning mode (Think). With Grok 3 Reasoning and Grok 3 mini Reasoning, it can think through problems the way reasoning models such as DeepSeek-R1 do. Grok 3 can solve complex problems by considering all possible solutions, criticizing itself, verifying solutions, backtracking, and thinking from first principles. However, to prevent distillation, part of Grok 3’s reasoning process is obscured.
Grok 3 Reasoning has surpassed o3-mini’s best version, o3-mini-high, in multiple popular benchmarks, including the new mathematics benchmark AIME 2025.
The team demonstrated Grok 3’s Think mode by having it generate an animated 3D plot of a launch from Earth to Mars and back to Earth, showing the trajectory for the next launch window.
In the demonstration, Grok 3 produced a Python script that uses Matplotlib and explained the code, which appears to apply Kepler’s laws numerically. When the code was run, it animated Earth and Mars, with a green ball representing the spacecraft’s journey between them.
The demonstration was generated live, so there was no verification that the solution was completely correct, but Musk, wearing a pendant showing the Earth-Mars transfer orbit, said it was close to the actual solution.
Andrej Karpathy, who tried Grok 3 in advance, said that Grok 3’s Think mode completed tasks that DeepSeek-R1, Gemini 2.0 Flash Thinking, and Claude failed at, but he added that top OpenAI models such as o1-pro can do the same.
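For readers curious about the physics behind such a demo, the sketch below estimates an Earth-Mars transfer using Kepler’s third law. It is not the script Grok 3 generated. It assumes circular, coplanar orbits, a simple Hohmann transfer, and a Mars orbital radius of 1.524 AU; all function and constant names are illustrative.

```python
EARTH_ORBIT_AU = 1.0    # assumption: circular orbit at 1 AU
MARS_ORBIT_AU = 1.524   # assumption: circular orbit at Mars's semi-major axis
DAYS_PER_YEAR = 365.25


def orbital_period_years(semi_major_axis_au: float) -> float:
    """Kepler's third law for orbits around the Sun: T^2 = a^3 (years, AU)."""
    return semi_major_axis_au ** 1.5


def hohmann_transfer_days(r1_au: float, r2_au: float) -> float:
    """One-way travel time: half the period of the transfer ellipse."""
    transfer_axis = (r1_au + r2_au) / 2.0
    return 0.5 * orbital_period_years(transfer_axis) * DAYS_PER_YEAR


def synodic_period_days(r1_au: float, r2_au: float) -> float:
    """Time between successive launch windows for the two planets."""
    rate_1 = 1.0 / orbital_period_years(r1_au)
    rate_2 = 1.0 / orbital_period_years(r2_au)
    return DAYS_PER_YEAR / abs(rate_1 - rate_2)


if __name__ == "__main__":
    trip = hohmann_transfer_days(EARTH_ORBIT_AU, MARS_ORBIT_AU)
    window = synodic_period_days(EARTH_ORBIT_AU, MARS_ORBIT_AU)
    print(f"One-way transfer: about {trip:.0f} days")
    print(f"Launch windows repeat about every {window:.0f} days")
```

Under these simplifying assumptions the one-way trip takes roughly 259 days and launch windows recur about every 780 days. A real trajectory must account for orbital eccentricity and inclination, which is why a live-generated solution can only be called "close."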
Following OpenAI, Gemini, and Perplexity, Grok has also launched its own deep search product, DeepSearch. The xAI team positions DeepSearch as a next-generation search engine and the first generation of the Grok Agent. It is more than a simple information retrieval tool: it is designed to help with programming, research, and everyday questions.
Judging from the demonstration, Grok 3’s DeepSearch offers little that is unique. xAI emphasizes that it differs from the keyword matching of traditional search engines: it can deeply understand the semantics and intent of user queries, gather content from multiple sources, and cross-validate it to ensure accuracy. It is also more controllable than traditional search engines, allowing users to specify sources.
The xAI team specifically mentioned that the DeepSearch process is transparent, allowing users to follow the AI’s thinking process.
#02. Full-Power Big Brain Mode
For more complex queries, users can switch to Big Brain mode, which reasons with more compute. xAI describes these reasoning models as best suited to mathematical, scientific, and programming problems, which makes Big Brain look like the full-power version of the model.
The xAI team demonstrated Grok 3 creating a new game in Big Brain mode that combines Tetris and Bejeweled. The team explained that because the game was improvised during the live broadcast, Grok might make minor coding errors that keep the game from running exactly as expected. In the live test, the generated game ran, but there were some problems with its color display, and it was unclear whether Tetris’s mechanism of clearing a full line was implemented.
The xAI team also confirmed its plan to launch an AI game studio during the live broadcast. Musk also posted a related tweet on X the day before.
#03. You can be willful when you have money, but there are still many things to do to become the strongest
xAI’s Colossus cluster, on which Grok 3 was trained, took only 122 days to build its first phase of 100,000 GPUs and another 92 days to expand to 200,000 GPUs. About 200,000 GPUs were used to train Grok 3, and pre-training was completed in early January. Musk previously posted on X that Grok 3 was developed using 10 times the computing resources of its predecessor, Grok 2, and that the training dataset had been expanded, allegedly to include documents from court cases. During the live broadcast, he said that Grok 3 used about 15 times the computing resources of Grok 2.
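As a quick back-of-the-envelope check on the build-out pace, the sketch below works out the average installation rate from the figures quoted above. It assumes each phase added exactly 100,000 GPUs and that the two phases ran back to back; the variable and function names are illustrative.

```python
# Figures quoted in the article; each phase is assumed to add 100,000 GPUs.
PHASE_ONE_GPUS, PHASE_ONE_DAYS = 100_000, 122
PHASE_TWO_GPUS, PHASE_TWO_DAYS = 100_000, 92


def gpus_per_day(gpus: int, days: int) -> float:
    """Average number of GPUs brought online per day."""
    return gpus / days


# Assumption: the second phase started as soon as the first one finished.
total_days = PHASE_ONE_DAYS + PHASE_TWO_DAYS
total_gpus = PHASE_ONE_GPUS + PHASE_TWO_GPUS

if __name__ == "__main__":
    print(f"Phase 1: {gpus_per_day(PHASE_ONE_GPUS, PHASE_ONE_DAYS):.0f} GPUs/day")
    print(f"Phase 2: {gpus_per_day(PHASE_TWO_GPUS, PHASE_TWO_DAYS):.0f} GPUs/day")
    print(f"Overall: {total_gpus} GPUs in {total_days} days")
```

By this estimate the first phase averaged about 820 GPUs per day and the second about 1,087, so the expansion ran roughly a third faster than the initial build.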
Musk also revealed that xAI is building a new AI cluster that will have five times the power of the current cluster.
In addition, regarding the voice model, the team did not give a specific release date, but Musk said it would be released in about a week.
Specifically, speech will be generated directly by a model similar to Grok, which can understand what is said and produce audio itself. This approach allows the AI to remember details and continue the conversation more naturally. Voice mode will be available in both the app and the API.
xAI plans to launch the Grok 3 API in the next few weeks. The API will include Grok 3’s reasoning model and DeepSearch capabilities. The xAI team is looking forward to enterprise use cases and believes that Grok 3’s capabilities, together with DeepSearch, will bring great value to enterprise users.
It is worth noting that xAI also recently launched a promotion that gives away US$150 in API credits to users who agree to share data, with a minimum top-up of US$5. Evidently, xAI does not mind users collecting the freebie; it cares more about acquiring users and data this way.
Regarding the open source plan, Musk said that he would continue his previous strategy and open source Grok 2 when Grok 3 matures and is stable (which will probably be implemented within a few months).
Currently, users can try it through X and through Grok’s website and apps, but not all Grok 3 models and related features are online yet (some are still in testing). Grok 3 will launch first to Premium+ subscribers on X, and a standalone subscription called SuperGrok will also launch, giving Grok users the most advanced features and earliest access for $30 per month or $300 per year. SuperGrok unlocks features such as more DeepSearch queries and unlimited image generation.
The release of Grok 3 comes as xAI faces fierce competition in the AI field, not only from OpenAI and Google but also from emerging Chinese companies. DeepSeek, for example, has forced AI companies around the world to adjust their strategies and make deep-thinking models standard. It has also prompted OpenAI to recently make its reasoning model available for free and to begin signaling a move toward open source.
For Musk, OpenAI may be xAI’s biggest enemy. Musk founded xAI in 2023 with the goal of becoming an alternative to OpenAI and has publicly criticized OpenAI’s plans to restructure itself into a for-profit company.
Musk has also filed two lawsuits against OpenAI, accusing it of deviating from its founding principles, and offered to buy OpenAI’s nonprofit arm for $97.4 billion, but the offer was rejected by OpenAI’s board of directors last week. Altman said the offer was a tactic to slow OpenAI down. Although Musk was involved in founding OpenAI, he has been critical of the company since leaving its board in 2018.
Both companies are raising staggering sums, and their valuations are soaring. Bloomberg reported last week that Musk’s xAI is in talks to raise approximately US$10 billion. Once the financing is completed, the company’s valuation would reach US$75 billion, up from xAI’s previous valuation of US$51 billion. At the same time, OpenAI is in talks to raise up to $40 billion, which is expected to lift its valuation to $300 billion.
The firepower that capital brings is also obvious. SoftBank, OpenAI, Oracle, and Abu Dhabi-backed MGX jointly announced plans in January to invest US$100 billion in the United States, eventually rising to US$500 billion, to build data centers and other artificial intelligence infrastructure. At the same time, Dell Technologies is close to completing a deal worth more than $5 billion to supply xAI with servers optimized for artificial intelligence.
Judging from the current situation, OpenAI is indeed xAI’s main competitor. The two compete directly in technology, market positioning, and financing strategy. OpenAI remains in the lead with its mature product line and strong market share. Although Grok 3 has an edge on certain metrics, the overall demonstration showed little innovation and was more about filling gaps and catching up with the industry’s leaders. What really underpins Grok 3 seems to be the 200,000 GPUs and continuous capital support rather than a genuine technological breakthrough. This release does not live up to Musk’s claim that “maybe this is the last chance for AI to surpass Grok.”
At the start of the Grok 3 launch, Musk once again described the mission of xAI and Grok: to understand the nature of the universe, understand what is happening, look for traces of aliens, explore the meaning of life, understand the origin of the universe, and determine how it ends. Driven by the pursuit of truth, xAI aims to build the ultimate truth-seeking artificial intelligence.
However, whether the goal is to realize these grand visions or to face more down-to-earth competition, deep pockets and the top spot on the leaderboards are obviously not enough. To truly become the smartest AI on Earth, Musk and his xAI still have a long way to go.