HTSC: As NVIDIA GTC 2024 kicks off, watch for new developments in optical modules, switches, and liquid cooling

Zhitong
2024.03.21 23:54

HTSC has released a research report recommending attention to new developments in optical modules, switches, and liquid cooling following NVIDIA's GTC 2024. In his keynote, Huang Renxun announced several new computing and networking products and architectures, which are expected to drive demand growth for optical modules and switches. Adoption of liquid cooling in data centers is also expected to increase. HTSC believes that, amid the rapid iteration of the AI industry, development opportunities in optical modules, optical devices, and switches are worth watching.

According to the Zhitong Finance and Economics APP, HTSC's research report stated that on March 19th, Beijing time, NVIDIA GTC 2024 kicked off with Huang Renxun delivering a keynote titled "Witnessing the Moment of AI Transformation" and unveiling several new computing and networking products and architectures. The report highlights the following points: 1) Optical modules: In the GB200 cluster, the external interconnect bandwidth of a single GPU has been raised further, which is expected to accelerate demand for 1.6T optical modules; 2) Switches: NVIDIA is pursuing both Ethernet and InfiniBand, delivering what it describes as the world's first end-to-end 800G throughput. The report is optimistic about the application potential of Ethernet switches in the coming inference era; 3) Liquid cooling: The GB200 rack will be equipped with a liquid cooling system, saving 20 kW of power consumption. Backing from the chip side, capacity expansion on the server side, and the operators' deployment goals are together expected to advance the adoption of liquid cooling, benefiting related equipment manufacturers and IDCs.

The report believes that, with the rapid iteration of the AI industry, demand for computing power across the industry chain is expected to keep rising quickly. Against this backdrop, it recommends focusing on development opportunities in optical modules, upstream optical devices and optical engines, MPO (high-density connectors), and switches. In addition, the penetration rate of liquid cooling solutions in data centers is expected to continue to increase.

HTSC's main points are as follows:

Optical modules: GB200 debuts, further raising the external interconnect bandwidth of a single GPU

The latest-generation GB200 chip debuted at this conference, and the market is watching how the GB200 architecture will affect demand for optical modules. In May 2023, NVIDIA released the GH200; in a 256-GPU cluster, the ratio of GPUs to 800G optical modules was as high as 1:9. According to Huang Renxun's presentation at this conference, the newly released GB200 chip doubles the external interconnect bandwidth of a single chip from the previous 900GB/s to 1800GB/s (bidirectional). At the cluster level, a single cabinet can accommodate up to 72 Blackwell GPUs, and up to 576 GPUs can be interconnected through the new-generation NVLink 5 switch; further expansion to more than ten thousand GPUs can be achieved through IB/Ethernet switches. Considering only the 576-GPU cluster under NVLink, the report calculates that the ratio of Blackwell GPUs to 1.6T optical modules reaches 1:9.
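The report does not spell out how it arrives at the 1:9 ratio. One plausible back-of-the-envelope reconstruction is sketched below. It rests on assumptions that are not in the source: both bandwidth figures are treated as bidirectional, optical links are used on only one switch-to-switch tier of a non-blocking NVLink fabric, and each optical link needs a module at both ends. The function and parameter names are hypothetical.

```python
def modules_per_gpu(bidir_gbytes_per_s, module_gbits_per_s,
                    optical_tiers=1, ends_per_link=2):
    """Estimate optical modules per GPU in a non-blocking fabric.

    Assumptions (not from the report): the GPU bandwidth is bidirectional,
    only `optical_tiers` network tiers use optics, and every optical link
    needs one module at each end.
    """
    # Bidirectional GB/s -> unidirectional Gb/s (halve, then bytes to bits).
    unidir_gbits_per_s = bidir_gbytes_per_s / 2 * 8
    # Modules needed to carry that bandwidth on one side of one tier.
    per_side = unidir_gbits_per_s / module_gbits_per_s
    return per_side * optical_tiers * ends_per_link


# GH200: 900 GB/s per GPU, 800G modules.
gh200_ratio = modules_per_gpu(900, 800)
# GB200: 1800 GB/s per GPU, 1.6T (1600G) modules.
gb200_ratio = modules_per_gpu(1800, 1600)

print(f"GH200: 1:{gh200_ratio:g}")
print(f"GB200: 1:{gb200_ratio:g}")
```

Under these assumptions both generations come out at nine modules per GPU: the per-GPU bandwidth doubles, but so does the module speed, so the count is unchanged while the module generation moves from 800G to 1.6T.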

Switches: The trend toward higher-speed AI networks continues, and 800G products are expected to ramp up gradually

NVIDIA released the Quantum-X800 InfiniBand and Spectrum-X800 Ethernet switches, creating what it describes as the world's first networking platform with end-to-end 800Gb/s throughput. Compared with the previous generation, the Quantum-X800 offers 5x the bandwidth capacity and, with SHARPv4, a 9x increase in in-network computing power to 14.4 TFlops. In addition, NVIDIA's GPU interconnect has been upgraded to NVLink 5, which can connect 576 GPUs in a single NVLink domain, with each GPU communicating at a bidirectional throughput of 1.8TB/s. HTSC noted that the trend toward faster AI networks continues and that NVIDIA is still investing in its Spectrum Ethernet products in preparation for the coming inference market. As domestic Ethernet switch manufacturers continue to iterate their products, their accumulated Ethernet expertise and cost-effectiveness are expected to support significant growth in the future inference market.

Liquid cooling: GB200 debuts with a liquid cooling system, and industry participants are jointly advancing adoption of the technology

NVIDIA's GB200 rack features 2 miles of NVLink wiring with a total of 5,000 cables, consuming 20 kW of power. Huang Renxun stated, "In order to make these calculations run quickly, a liquid cooling design will be adopted, with cooling liquid input/output water temperatures of 25℃/45℃." According to Dell'Oro's forecast, the global liquid cooling market is expected to reach nearly $2 billion by 2027. HTSC observes that, with the endorsement of the global AI chip leader, server manufacturers have recently begun deploying liquid-cooled racks or expanding their production (for example, Foxconn participated in the design of the GB200 liquid-cooled rack, and Supermicro announced plans to expand production of liquid-cooled racks in Q2). Combined with the three major domestic operators' stated goal of using liquid cooling in more than 50% of projects in 2025 and beyond, the penetration rate of liquid cooling is expected to continue to increase.

Risk warning: macroeconomic impact; intensified industry competition; slower-than-expected progress in new technologies.