China’s March from Imitation to Innovation: The Case of DeepSeek

James A. Dorn

When Chinese AI firm DeepSeek released its innovative R1 model in late January—an open-source model that performs on par with models developed at much higher cost by leading tech firms—the AI world was taken by surprise.

US venture capitalist Marc Andreessen called R1 “one of the most amazing and impressive breakthroughs I’ve ever seen.” It is built on DeepSeek’s V3 model, released in 2024, which was trained on lower-cost chips and optimized to run predictive large language models (LLMs) with reasoning capabilities. R1 roiled tech stocks on January 27: superstar chip maker Nvidia’s market value fell by nearly $600 billion on fears that DeepSeek’s ability to achieve high performance while economizing on Nvidia chips would reduce demand for the company’s top-of-the-line processors.

The Rise of DeepSeek

The emergence of DeepSeek as an important player in the tech sector is due to the efforts of 40-year-old Chinese billionaire Liang Wenfeng, who used capital from his quantitative hedge fund, High-Flyer, to launch DeepSeek in May 2023. As a nonstate/​private enterprise, DeepSeek made progress on its own by hiring talent from some of China’s leading universities and paying highly competitive salaries. The firm was set up as a research operation to advance AI models and eventually to build models to match human learning processes (known as Artificial General Intelligence or AGI). 

Early work led to LLMs V1 and V2, but the real breakthrough that brought worldwide attention to DeepSeek was the release of V3 in December 2024 and R1 a month later. The training costs for V3 were less than $6 million, far below the costs of training LLMs at major AI firms. AI models consist of algorithms and the data used to train them. Training consists of “the process of feeding an AI model curated data sets to evolve the accuracy of its output” (Chen 2023).

The key features that make V3 attractive are its open-source release, its exceptional processing speed and efficiency, and its ability to handle complex coding, math, and text-generation tasks. It is a versatile model that many expect to reshape the AI field. DeepSeek’s chatbot app, built on V3, is now a leader in the field, and V3 has distilled reasoning capabilities from R1, making it an even stronger model.

When the United States imposed export controls in October 2022 to restrict China’s access to advanced semiconductors that might compromise national security, Chinese AI firms were determined to find ways to make progress without access to Nvidia’s most advanced chips. Liang Wenfeng surprised everyone by using less powerful chips along with innovative engineering to produce efficient AI models that could compete with OpenAI and other leaders in the field at much lower cost. 

One of the key innovations was to use an 8‑bit floating point (FP8) “mixed precision training framework” for DeepSeek’s V3 model. As Dirox reported, “This was the first time this framework has been used on such a large-scale model.” By doing so, DeepSeek economized on memory and achieved a dramatic increase in computation speed. 
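The memory savings from lower-precision training can be illustrated with a toy sketch. This is not DeepSeek’s actual FP8 framework (NumPy has no 8-bit float type, so 16-bit floats stand in here); it simply shows how halving the bits per parameter halves the memory a weight matrix occupies, which is the basic economy that mixed-precision training exploits at scale.

```python
import numpy as np

# A stand-in "layer" of model weights in full 32-bit precision
weights_fp32 = np.random.default_rng(0).standard_normal((1024, 1024)).astype(np.float32)

# Cast to 16-bit floats as a proxy for lower-precision training formats
# (real FP8 training keeps master copies in higher precision and casts
# activations/gradients, but the storage arithmetic is the same idea)
weights_fp16 = weights_fp32.astype(np.float16)

print(weights_fp32.nbytes)  # 4 bytes per parameter
print(weights_fp16.nbytes)  # 2 bytes per parameter: half the memory
```

Going from 32-bit to 8-bit storage would cut per-parameter memory by a factor of four, which is why an FP8 framework matters so much on a model with hundreds of billions of parameters.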

Other features mentioned in the Dirox report that contribute to the success of V3 as a foundational AI model are:

“Mixture-of-Experts (MoE) Architecture,” which uses only the neural network (“expert”) needed to address a specific topic rather than tying up all the networks and parameters in the model for each task.
“Multi-Token Prediction (MTP),” which allows LLMs like V3 to speed up the time it takes to generate text.
“Multi-head Latent Attention (MLA),” which allows LLMs to capture key information from a body of text multiple times rather than just from a single sentence.
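The Mixture-of-Experts idea in the first bullet can be sketched in a few lines. This is a minimal toy illustration, not DeepSeek’s architecture: a router scores a handful of small “expert” networks and only the top-scoring ones actually run for a given input, so most of the model’s parameters sit idle on each task.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, d = 4, 8

# Each "expert" is a small weight matrix; the router scores them per input
experts = [rng.standard_normal((d, d)) for _ in range(n_experts)]
router_w = rng.standard_normal((d, n_experts))

def moe_forward(x, top_k=2):
    logits = x @ router_w                      # router's score for each expert
    top = np.argsort(logits)[-top_k:]          # indices of the top-k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                       # softmax gating weights
    # Only the selected experts compute; the rest are skipped entirely
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

out = moe_forward(rng.standard_normal(d))
print(out.shape)
```

With `top_k=2` of 4 experts, only half the expert parameters are touched per input; production MoE models push that ratio much further, which is where the compute savings come from.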

In addition, making V3 open-source means that its code is available to anyone for free and can be further refined to improve AI models, thus increasing the scope of knowledge available to individuals—even though DeepSeek’s transmission of information on sensitive political topics is restricted by the Chinese Communist Party (CCP).

From Follower to Innovator

Until DeepSeek’s innovations, it had long been held that only AI giants (like OpenAI, Google DeepMind, and Meta) could develop and run high-performance AI models. The prevailing view was that bigger models trained on more graphics processing units (GPUs) perform better than lower-cost models, and that only wealthy AI firms could afford to train the top models.

That thinking turned out to be a myth with the development of DeepSeek’s V3/R1 models. As data scientist Sahin Ahmed notes, “By proving that smarter engineering can outperform brute-force computing, Liang Wenfeng has forced Big Tech to rethink their approach.” (On the technical innovations ushered in by DeepSeek’s models, see Dirox 2024 and Ahmed 2025.)

In an interview with media site 36 Kr in July 2024 (“We’re Done Following. It’s Time to Lead.”), Liang revealed his thinking on China’s march from imitation to innovation in the AI field. 

The following quotes from Liang’s interview are pertinent:

“We won’t go closed-source. We believe that establishing a robust technology ecosystem matters more.”
“More investment doesn’t necessarily result in more innovation. If that were the case, big tech companies would have monopolized all innovation.”
“DeepSeek remains entirely bottom-up. We also do not pre-assign roles; natural division of labor emerges. Everyone brings unique experiences and ideas, and they don’t need to be pushed. When they encounter challenges, they naturally pull others in for discussions. However, once an idea shows potential, we do allocate resources from the top down.”
“If someone has an idea, they can tap into our training clusters anytime without approval. Additionally, since we don’t have rigid hierarchical structures or departmental barriers, people can collaborate freely as long as there’s mutual interest.”
“The restructuring of China’s industrial landscape will increasingly rely on deep-tech innovation.”
“Hardcore innovation will only increase in the future. It’s not widely understood now because society as a whole needs to learn from reality. When this society starts celebrating the success of deep-tech innovators, collective perceptions will change. We just need more real-world examples and time to allow that process to unfold.”

At times, Liang sounds like F.A. Hayek when he speaks of the importance of “bottom-up” experimentation in designing AI models and the “natural division of labor that emerges.” Likewise, when Hayek talks about the competitive market process as a “discovery procedure,” it fits in well with Liang’s view of innovation. 

Nevertheless, Liang is restricted in what he can say about the use of knowledge in society and about a true spontaneous order, like the free market, based on private property and the rule of law. DeepSeek’s LLMs avoid answering questions that touch on sensitive political topics, such as criticism of Xi Jinping or the CCP or what really happened at the Tiananmen protests. China’s lack of a free market for ideas is bound to hinder innovation, both in the quality of data used to train models and in the interpretation of results.

China’s Future: Open or Closed Society? 

It is exciting to see DeepSeek develop open-source models that others can use to improve their own models, including “reasoning” models like R1. The fact that DeepSeek is a private/​nonstate research firm run by a self-made billionaire who favors the spread of knowledge bodes well for China’s future. 

The process of scientific discovery and innovation, however, cannot itself turn China into an open society where individuals are free to choose and express their ideas without the threat of political reprisal. As long as China’s leaders maintain the CCP’s monopoly on power and suppress freedom of thought, improvements in AI will be hampered by the state, as will the emergence of a harmonious society. 
