
Nvidia is preparing a version of its B200 Blackwell AI GPU for China to comply with US export regulations

Nvidia is preparing to release another China-focused GPU SKU aimed at complying with U.S. export regulations. According to sources cited by Reuters, Nvidia’s latest GPU will be an offshoot of the Blackwell B200, the company’s fastest AI GPU to date. The GPU is expected to launch next year, but its specifications remain an open question.

The new chip, provisionally named “B20,” will be distributed throughout China by Inspur, one of Nvidia’s main partners in the region. The B20 is said to make its official debut in the second quarter of 2025.

The specs for Blackwell’s neutered GPU are completely unknown at this point, though it seems inevitable that the B20 will be the entry-level part — a stark contrast to the B200 with its industry-leading AI performance. The U.S. has strict performance regulations for GPU exports to China, based on a metric called “Total Processing Power” (TPP), which takes into account both a GPU’s TFLOPS and the precision of its computations. More specifically, multiply the dense TFLOPS (excluding sparsity) by the precision in bits to get the TPP.
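For readers who want to run the numbers themselves, here is a minimal sketch of that formula (the function name and example figures are our own shorthand for illustration, not terms or specs from the regulations):

```python
def total_processing_power(dense_tflops: float, precision_bits: int) -> float:
    """TPP as described above: dense (non-sparse) TFLOPS multiplied by precision in bits."""
    return dense_tflops * precision_bits

# Example: a hypothetical accelerator rated at 500 dense FP16 TFLOPS
print(total_processing_power(500, 16))  # 8000
```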

The current limit is set at 4,800 TPP. For comparison, the Hopper H100 and H200 significantly exceed that limit, with both GPUs hitting roughly 16,000 TPP — a metric that doesn’t directly factor in memory bandwidth or capacity, which are the main improvements the H200 brings. Even the RTX 4090 exceeds the limit with 660.6 TFLOPS of FP8 compute. The most powerful Nvidia desktop GPU to fit into the 4,800 TPP limit is the RTX 4090D, which was built specifically to meet export restrictions.
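Running the figures quoted above through the formula shows how far past the threshold these parts land; this is simple back-of-the-envelope arithmetic, not official compliance math:

```python
TPP_LIMIT = 4_800

h100_tpp = 1979 * 8      # dense FP8 TFLOPS x 8 bits = 15,832, the ~16,000 figure quoted above
rtx4090_tpp = 660.6 * 8  # ~5,285 TPP, already over the limit

print(h100_tpp > TPP_LIMIT, rtx4090_tpp > TPP_LIMIT)  # True True
```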

Blackwell raises the bar for compute performance, with the dual-die solution potentially spitting out around 4,500 TFLOPS of dense FP8 compute. That works out to roughly 36,000 TPP, or 7.5 times the legal limit. Even the smaller B100 will deliver 3.5 PFLOPS of dense FP8 compute, or 28,000 TPP.
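The same quick math, using the estimated Blackwell figures above, shows just how far out of reach compliance is for the full-fat parts:

```python
TPP_LIMIT = 4_800

b200_tpp = 4500 * 8  # ~4,500 dense FP8 TFLOPS -> 36,000 TPP
b100_tpp = 3500 * 8  # 3.5 dense FP8 PFLOPS    -> 28,000 TPP

print(b200_tpp / TPP_LIMIT)  # 7.5, i.e. 7.5x the legal limit
```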

The B20 is also subject to additional restrictions, as the United States enforces a “performance density” (PD) rule specifically targeting data center GPUs (consumer GPUs are exempt). Take the TPP result and divide it by the die size in square millimeters to get the PD metric; anything above 6.0 is restricted. By this metric, any RTX 40-series GPU would be restricted for data center use, and Blackwell should improve on both density and performance over Ada Lovelace. So Nvidia will have to severely restrict B20 performance and/or use a proportionally larger die to meet the regulations. (We still don’t know the exact die size of the already-announced B200.)
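Performance density is just as simple to sketch. The snippet below assumes die area measured in square millimeters, which is how the A30 and H20 figures in the next paragraph work out; the function names are ours, not regulatory terms:

```python
PD_LIMIT = 6.0

def performance_density(tpp: float, die_area_mm2: float) -> float:
    """Performance density (PD): TPP divided by die area in mm^2."""
    return tpp / die_area_mm2

# Rearranged: the most TPP a die of a given size can carry while staying under 6.0.
def max_compliant_tpp(die_area_mm2: float) -> float:
    return PD_LIMIT * die_area_mm2

print(max_compliant_tpp(800))  # 4800.0 -- an 800mm^2 die hits the PD ceiling right at the TPP cap
```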

We expect the B20 to be the successor to Nvidia’s entry-level A30 and H20 AI GPUs. For example, the H20 offers just 296 TFLOPS of dense FP8, compared to 1979 TFLOPS in the H100/H200. That works out to a TPP of 2368, and thanks to Hopper’s large die, a PD score of just 2.90 – well below the 6.0 cap. Meanwhile, the A30 has a TPP score of 2640 and a PD score of 3.20. So there’s room for Nvidia to make a faster AI GPU for China… it just can’t be too much faster.
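Checking those numbers with the same arithmetic also shows how much headroom a hypothetical B20 actually has under the 4,800 TPP cap:

```python
TPP_LIMIT = 4_800

h20_tpp = 296 * 8  # 2,368 TPP from 296 dense FP8 TFLOPS
a30_tpp = 2_640    # TPP figure cited above

print(TPP_LIMIT / h20_tpp)  # ~2.03 -- a compliant B20 can only be about twice as fast as the H20
```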

We can’t help but feel that the B20 will be a tough chip to sell. Both Ampere and Hopper already exceeded the performance limit, forcing Nvidia to create China-specific SKUs to meet regulations. All the advances in the Blackwell architecture only push it further toward non-compliance, and since the maximum allowed TPP hasn’t changed, performance has to be cut even more aggressively to stay compliant. The best-case scenario? Nvidia could create a GPU with perhaps 4000-4500 TPP and an 800mm^2 die, which would keep its performance density safely under the 6.0 cap.
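A quick sanity check of that best-case configuration against both caps (the die size and TPP here are the speculative figures from the paragraph above, not confirmed specs):

```python
TPP_LIMIT, PD_LIMIT = 4_800, 6.0

b20_tpp = 4_500    # speculative upper end of the 4000-4500 TPP range
b20_die_mm2 = 800  # speculative die size from above

print(b20_tpp <= TPP_LIMIT)   # True -- under the 4,800 TPP cap
print(b20_tpp / b20_die_mm2)  # 5.625 -- under the 6.0 PD cap
```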