BEIJING: Chinese artificial intelligence company DeepSeek said it spent $294,000 to train its R1 model, far less than the sums reported by its American competitors, in a study that is likely to reignite debate over Beijing's role in the race to develop artificial intelligence.

The unusual disclosure from the Hangzhou firm – its first estimate of training expenses for R1 – came in a peer-reviewed paper in the scholarly journal Nature on Wednesday.
DeepSeek’s unveiling of lower-priced AI systems in January led global investors to dump stocks in the tech sector as they feared the new models might disrupt the dominance of AI leaders such as Nvidia (NVDA.O).
Since then, the firm and founder Liang Wenfeng have been out of sight, at least publicly, aside from releasing a few new product updates.
The article in Nature, which listed Liang as a co-author, said DeepSeek’s reasoning-focused R1 model cost $294,000 to train and used 512 Nvidia H800 chips. An earlier version of the article published in January did not include that detail.
Sam Altman, the CEO of US AI behemoth OpenAI, in 2023 indicated that what he referred to as “foundational model training” had run to “much more” than $100 million – though the company has not provided detailed costs for any of its releases.
Training costs for the large language models that power AI chatbots refer to the expense of running a cluster of high-performance chips for weeks or months to process vast quantities of text and code.
Some of DeepSeek’s statements about its development costs and the technology it used have been challenged by American companies and officials.

Its H800 chips were made specifically by Nvidia for the Chinese market following the U.S. move in October 2022 to prohibit the company from shipping its more advanced H100 and A100 AI chips to China.
US officials said in June that DeepSeek has access to “large volumes” of H100 chips purchased after US export controls took effect. Nvidia said at the time that DeepSeek had used lawfully purchased H800 chips, not H100s.
In a supplementary information document accompanying the Nature paper, the company acknowledged for the first time that it owns A100 chips and said it had used them in preparatory stages of development.
“In the case of our research on DeepSeek-R1, we utilized the A100 GPUs to prepare for the experiments with a smaller model,” the researchers wrote. After this initial phase, R1 was trained for a total of 80 hours on the cluster of 512 H800 chips, they added.

Reuters has previously reported that part of the reason DeepSeek was able to attract China’s brightest minds was that it was one of the few domestic companies to operate an A100 supercomputing cluster.
