What is the NVIDIA Hopper GPU Architecture?
The NVIDIA Hopper architecture is the successor to Ampere, designed to accelerate AI, high-performance computing (HPC), and data analytics workloads. Key features include enhanced Tensor Cores for AI acceleration, improved memory bandwidth, and advancements in interconnect technology. The Hopper architecture is implemented in various GPUs, including the H100, H800, and H20, each tailored to specific market needs and regulatory constraints.
NVIDIA's Hopper architecture is essential for large-scale AI model training and deployment. Its advancements in computational power and memory bandwidth enable faster and more efficient processing of complex AI algorithms. This architecture facilitates progress in fields like natural language processing, computer vision, and scientific computing. However, its availability is affected by export controls, leading to modified versions for specific markets.
H100 vs. H800: A Key Distinction
The H100 and H800 GPUs are both based on the Hopper architecture, but they differ in specifications due to U.S. export controls. When the U.S. government first implemented export restrictions, they were determined by two factors: interconnect bandwidth and FLOPS (floating point operations per second). Chips with interconnect bandwidth and FLOPS above certain levels were restricted.
To comply with these initial restrictions, NVIDIA created the H800. The primary difference between the H100 and H800 was the reduction in interconnect bandwidth in the H800. Although the H800 had roughly the same computational performance (FLOPS) as the H100, its reduced interconnect bandwidth meant it could be exported to China without violating U.S. regulations.
The H100 was sold in the U.S. market, while the H800 was specifically created as a compliant alternative for the Chinese market. This allowed NVIDIA to continue serving its Chinese customers while adhering to U.S. export control laws.
The Export Control Landscape: H800 and H20
Later revisions to U.S. export controls focused solely on floating-point operations (FLOPS) as the key metric for restriction. This meant chips with FLOPS above a certain level could not be exported to China, regardless of interconnect bandwidth.
This change led to the ban of the H800 in China, as it still exceeded the revised FLOPS threshold. In response, NVIDIA developed the H20, which reduced FLOPS to comply with the new regulations. While the H20 has lower computational performance than the H100, it maintains similar interconnect bandwidth. Furthermore, the H20 has better memory bandwidth and memory capacity than the H100, making it superior to the H100 in some respects.
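The two regimes described above can be expressed as simple predicates. The sketch below is purely illustrative: the threshold values and per-chip numbers are hypothetical placeholders, not the actual figures used in U.S. regulations.

```python
# Illustrative model of the two export-control regimes described above.
# All threshold and chip values are hypothetical placeholders.
from dataclasses import dataclass

FLOPS_THRESHOLD = 1000      # hypothetical compute limit (arbitrary units)
BANDWIDTH_THRESHOLD = 600   # hypothetical interconnect limit (arbitrary units)

@dataclass
class Chip:
    name: str
    flops: float
    interconnect_bw: float

def restricted_v1(chip: Chip) -> bool:
    """Initial rule: a chip is restricted only if BOTH its FLOPS and its
    interconnect bandwidth exceed the thresholds."""
    return (chip.flops >= FLOPS_THRESHOLD
            and chip.interconnect_bw >= BANDWIDTH_THRESHOLD)

def restricted_v2(chip: Chip) -> bool:
    """Revised rule: FLOPS alone decides, regardless of interconnect bandwidth."""
    return chip.flops >= FLOPS_THRESHOLD

# Hypothetical stand-ins for the three products discussed:
h100_like = Chip("H100-like", flops=1000, interconnect_bw=900)
h800_like = Chip("H800-like", flops=1000, interconnect_bw=400)  # bandwidth cut
h20_like = Chip("H20-like", flops=300, interconnect_bw=900)     # FLOPS cut

print(restricted_v1(h800_like))  # False: compliant under the initial rule
print(restricted_v2(h800_like))  # True: banned after the revision
print(restricted_v2(h20_like))   # False: compliant under the revised rule
```

Under the initial rule, cutting interconnect bandwidth (the H800 approach) was enough for compliance; under the revised rule, only cutting FLOPS (the H20 approach) works.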
The development of the H20 illustrates NVIDIA's adaptive strategy: the company works within governmental constraints to build the best GPUs it can under the rules that have been set. Its aim was to provide a viable product for the Chinese market, even under stricter regulations.