DeepSeek Discloses A Theoretical Cost-Profit Ratio of 545%

On March 1st, DeepSeek's verified official account published an article on Zhihu titled "DeepSeek-V3/R1 Inference System Overview," revealing for the first time the core optimizations of its model inference system and disclosing a theoretical profit margin as high as 545%. The figure set a new profitability record in the global AI large-model field and caused a stir in the industry.

The article states that the DeepSeek-V3/R1 inference system is optimized for two goals: higher throughput and lower latency.

To achieve both goals, DeepSeek uses large-scale cross-node expert parallelism (EP). First, EP significantly increases the batch size, improving the efficiency of matrix multiplications on the graphics processing units (GPUs) and thus raising throughput. Second, EP spreads the experts across different GPUs, so each GPU computes only a few experts; this reduces memory-access requirements and lowers latency.
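As a rough illustration only (this is not DeepSeek's actual implementation), cross-node expert parallelism can be sketched as a router that sends each token to a small top-k subset of experts, with each expert hosted on its own device. Grouping tokens by expert shows why a larger batch helps: each device's work list gets denser, so its matrix multiplies run more efficiently. All sizes below (`NUM_EXPERTS`, `TOP_K`, `BATCH`) are toy values chosen for the sketch.

```python
import random

# Hypothetical sketch of expert-parallel (EP) routing; illustrative only,
# not DeepSeek's code. Each "GPU" hosts one expert, and the router sends
# every token to its top-k experts, so each device computes only a slice
# of the batch.

random.seed(0)
NUM_EXPERTS, TOP_K, BATCH = 8, 2, 32  # toy sizes (assumed, not disclosed)

# Router scores: one score per (token, expert) pair.
scores = [[random.random() for _ in range(NUM_EXPERTS)] for _ in range(BATCH)]

# Each token is routed to its TOP_K highest-scoring experts.
assignments = [
    sorted(range(NUM_EXPERTS), key=lambda e: row[e], reverse=True)[:TOP_K]
    for row in scores
]

# Group tokens by expert: this is each GPU's work list. A larger batch
# puts more tokens on each expert, making its matrix multiplies denser.
per_expert = {e: [] for e in range(NUM_EXPERTS)}
for tok, experts in enumerate(assignments):
    for e in experts:
        per_expert[e].append(tok)

total_slots = sum(len(v) for v in per_expert.values())
print(total_slots)  # BATCH * TOP_K = 64
```

In a real system the per-expert token groups would be exchanged between nodes (the "transmission delay" the article discusses hiding), and a load balancer would keep the groups roughly equal in size.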

However, EP also adds complexity to the system. The article therefore explains how to increase batch size with EP, how to hide communication latency, and how to balance load effectively.

DeepSeek also disclosed key figures such as the system's theoretical cost and profit margin.

The article states that from 12:00 on February 27th to 12:00 on February 28th Beijing time, the total number of nodes occupied by the DeepSeek V3 and R1 inference services peaked at 278, with an average occupancy of 226.75 nodes (each node has 8 H800 GPUs). Assuming a GPU rental cost of $2 per hour, the total cost is approximately $87,100 per day.

If all tokens were priced according to DeepSeek R1's pricing model, the theoretical total revenue for one day would be around $562,000, a cost-profit ratio of 545%.
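The disclosed figures can be checked with a quick back-of-envelope calculation, using only the numbers stated in the article (226.75 average nodes, 8 H800 GPUs per node, $2 per GPU-hour, $562,000 theoretical daily revenue):

```python
# Back-of-envelope check of the disclosed figures (all inputs from the article).
nodes = 226.75          # average nodes occupied over the 24-hour window
gpus_per_node = 8       # H800 GPUs per node
rate = 2.0              # assumed rental cost, USD per GPU-hour
hours = 24

daily_cost = nodes * gpus_per_node * rate * hours
print(round(daily_cost))            # 87072 -> approximately $87,100/day

revenue = 562_000       # theoretical daily revenue at R1 pricing
margin = (revenue - daily_cost) / daily_cost
print(f"{margin:.0%}")              # 545%
```

The cost comes out to $87,072 per day (rounded to $87,100 in the article), and the resulting cost-profit ratio of about 545% matches the disclosed figure.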

The data disclosed by DeepSeek not only demonstrates the commercial feasibility of its technical route, but also suggests that the profit loop for large AI models is moving from ideal to reality. The training cost of the previously released DeepSeek-V3 model was only $5.576 million, roughly 1%-5% of that of comparable products.

DeepSeek's article attracted nearly 600 comments and over 5,000 likes on Zhihu. One netizen called the technical overview a "source-code Easter egg" that directly reveals the company's trump card. Another praised it: "Too powerful. AI computing power needs to become as cheap as hydroelectric power, and DeepSeek has taken a big step."

The release of this article also marks the official conclusion of the globally watched "DeepSeek Open Source Week." From February 24th to February 28th, DeepSeek progressively open-sourced its latest technical work, including the four open-source projects FlashMLA, DeepEP, DeepGEMM, and 3FS, as well as code libraries such as DualPipe and EPLB.

SEE ALSO: DeepSeek Announces Nighttime API Call Price Reduction, with Discounts Up to 75%