Wallstreetcn
2024.06.03 11:47

The entire supply chain is accelerating its upgrade cycle! NVIDIA announces "one iteration per year" | AI Dehydration

HBM4 memory and 3.2T optical modules may become mainstream by 2026

Author: Zhang Yifan

Editor: Shen Siqi

Source: Hard AI

At Computex 2024, Jensen Huang, holding the Blackwell chip, once again demonstrated NVIDIA's full-stack capabilities.

Unlike vendors that only sell GPUs, a full-stack supplier has to think not just about the GPU itself but also about software platforms, networking, cooling, companion CPUs, and other supporting products.

How does NVIDIA plan to cover all of this? CEO Jensen Huang gave detailed answers at the conference:

Annual Chip Iteration: Blackwell Ultra GPU to be launched in 2025, Rubin GPU in 2026, and Rubin Ultra GPU in 2027;

Next-Generation Architecture: the Rubin architecture arrives in 2026;

Spectrum-X "Yearly Update": In 2026, Spectrum-X1600 can connect millions of GPUs;

Cooling Beyond "Liquid Cooling": The Blackwell architecture introduces both air-cooled DGX and liquid-cooled MGX servers;

Software Platform: software is not only NVIDIA's moat, it is also set to become a huge business;

I. Processors

First, let's take a look at NVIDIA's processors, including GPUs and CPUs.

At the conference, Jensen Huang said, "The update rhythm will be on an annual cycle going forward, pushing all products to their technical limits."

He also previewed the next three generations of the technology stack (see the image below):

• Blackwell Ultra GPU, launching in 2025 (8 stacks of 12-high HBM3e, "8S HBM3e 12H");

• Rubin GPU, launching in 2026 (8 stacks of HBM4), along with the new Arm-based Vera CPU and the NVLink 6 Switch (3600GB/s);

• Rubin Ultra GPU, launching in 2027 (12 stacks of HBM4);
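For readers who prefer the roadmap in one place, the short sketch below simply restates the dates and memory configurations from the list above as data; nothing beyond what the slide showed is implied.

```python
# The announced GPU roadmap, restated as data. All entries come from the
# Computex slide quoted above; no additional specs are implied.
ROADMAP = {
    2025: {"gpu": "Blackwell Ultra", "memory": "8 stacks of 12-high HBM3e"},
    2026: {"gpu": "Rubin", "memory": "8 stacks of HBM4",
           "cpu": "Vera (Arm-based)", "nvlink": "NVLink 6 Switch, 3600 GB/s"},
    2027: {"gpu": "Rubin Ultra", "memory": "12 stacks of HBM4"},
}

for year, parts in ROADMAP.items():
    print(year, parts)
```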

Specific performance figures for the Rubin GPU and Vera CPU have not been disclosed yet. NVIDIA did, however, quantify how much training efficiency has improved and costs have fallen:

• Over the past eight years, the energy needed to train a 1.8-trillion-parameter model such as GPT-4 has fallen to 1/350 of what it was, and the energy per inference token has fallen to 1/45,000;

• Over the same eight years, compute performance has increased 1,000-fold;
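These headline ratios can be made a little more concrete. The sketch below is a back-of-the-envelope calculation using only the figures quoted above; the baseline energy value is a made-up placeholder, and the implied per-year compute growth is simple arithmetic, not an NVIDIA disclosure.

```python
# Rough illustration of the efficiency claims above. The 350x training-energy
# and 45,000x inference-energy reductions and the 1,000x compute gain are the
# figures quoted at the keynote; the baseline energy value is a placeholder
# used only to show the ratios.
baseline_training_energy = 1.0      # hypothetical energy units, eight years ago
baseline_inference_energy = 1.0

print("Training energy today:", baseline_training_energy / 350)        # ~0.29% of the old cost
print("Inference energy today:", baseline_inference_energy / 45_000)   # ~0.0022% of the old cost

# A 1,000x compute gain over 8 years corresponds to a sustained annual growth rate of:
annual_growth = 1000 ** (1 / 8)
print(f"Implied compute growth: ~{annual_growth:.2f}x per year")        # ~2.37x per year
```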

II. Processor Architecture

Jensen Huang revealed that the architecture after Blackwell will be Rubin, debuting in 2026, with next-generation HBM4 memory as its headline upgrade.

According to wccftech, NVIDIA's Rubin GPU will adopt TSMC's CoWoS-L advanced packaging and be fabricated on the N3 process.

Furthermore, the Rubin GPU arriving in 2026 will carry next-generation HBM4 memory, whereas today's B100 GPU uses the fastest memory currently available, HBM3E.

This implies that HBM4 may need to reach mass production by the end of 2025.

In addition, NVIDIA will introduce a new Arm-based CPU, Vera, which pairs with the Rubin GPU to form the Vera Rubin superchip platform. The platform will support the new CX9 SuperNIC and NVLink 6, delivering connection speeds of up to 1600GB/s and 3600GB/s respectively.
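To get a feel for what those link speeds mean in practice, here is a rough, illustrative calculation. The 1600GB/s and 3600GB/s figures come from the announcement; the payload size (a GPT-4-scale model in FP8) is an assumption chosen purely for illustration.

```python
# Back-of-the-envelope transfer times implied by the bandwidth figures quoted
# above. The link speeds (1600 GB/s for CX9, 3600 GB/s for NVLink 6) come from
# the article; the payload size is a hypothetical example, not an NVIDIA figure.

PARAMS = 1.8e12          # assume a 1.8-trillion-parameter model (GPT-4-scale, per the article)
BYTES_PER_PARAM = 1      # assume FP8 weights: 1 byte per parameter
payload_gb = PARAMS * BYTES_PER_PARAM / 1e9   # ~1800 GB of weights

for name, gbps in [("CX9 SuperNIC", 1600), ("NVLink 6", 3600)]:
    seconds = payload_gb / gbps
    print(f"{name}: {payload_gb:.0f} GB at {gbps} GB/s ~= {seconds:.2f} s")
```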

III. Communication Network — Ethernet

At this conference, NVIDIA for the first time described an Ethernet networking solution for interconnecting on the order of a million GPUs, expected to launch in 2026. By then, 3.2T optical modules may become mainstream.

"The era of millions of GPU data centers is coming!" Huang Renxun announced the future three-year Ethernet network Spectrum product roadmap and declared that new Spectrum-X products will be launched annually.

• In 2024, Spectrum-X800 is designed for tens of thousands of GPUs;

• In 2025, X800 Ultra is designed for hundreds of thousands of GPUs;

• In 2026, X1600 can be expanded to millions of GPUs;
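To put "millions of GPUs" in perspective, the sketch below estimates how many endpoints a standard folded-Clos (fat-tree) Ethernet fabric can reach as switch radix and tier count grow. The radix values are illustrative assumptions, not disclosed Spectrum-X specifications.

```python
# Back-of-the-envelope fat-tree scaling. In a folded-Clos fabric built from
# identical R-port switches with L tiers, the maximum number of endpoints is
# 2 * (R/2)^L. The radices below are illustrative assumptions, not
# Spectrum-X specifications.
def max_endpoints(radix: int, tiers: int) -> int:
    return 2 * (radix // 2) ** tiers

for radix in (64, 128):
    for tiers in (2, 3, 4):
        print(f"radix {radix:3d}, {tiers} tiers -> up to {max_endpoints(radix, tiers):,} GPUs")

# e.g. a radix-64 fabric needs 4 tiers (~2.1M endpoints) to pass the
# one-million-GPU mark, while radix 128 reaches ~524K at 3 tiers.
```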

Previously, neither Arista nor NVIDIA had disclosed interconnect products beyond roughly the hundred-thousand-GPU scale:

• NVIDIA: Spectrum-X has entered mass production with multiple customers, including a large cluster of 100,000 GPUs;

• Arista: The company predicts that it will be able to connect 100,000 GPUs in 2025;

According to the roadmap shown at the conference (see the figure below), switch port rates in 2026 will double compared with 2024, suggesting that optical modules may enter the 3.2T era in 2026 (up from 1.6T today).
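The link between switch rates and module rates can be illustrated with a quick calculation. The 1.6T and 3.2T module rates come from the article; the per-switch aggregate bandwidths below are assumptions used only to show why doubling the switch rate pushes modules from 1.6T to 3.2T.

```python
# Rough illustration of the "rate doubling" point above. The 1.6T -> 3.2T
# module rates come from the article; the per-switch aggregate bandwidths are
# assumed values for illustration, not disclosed NVIDIA specs.
for year, switch_tbps, module_tbps in [(2024, 51.2, 1.6), (2026, 102.4, 3.2)]:
    count = round(switch_tbps / module_tbps)
    print(f"{year}: {switch_tbps} Tb/s switch / {module_tbps} Tb/s modules -> ~{count} modules per switch")

# If switch bandwidth doubles, module counts stay manageable only when the
# per-module rate doubles as well, which is why 3.2T modules would follow.
```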

IV. Air-cooled DGX and Liquid-cooled MGX

After Blackwell's release, there were market rumors that its servers would be cooled exclusively with liquid cooling.

At this conference, NVIDIA clarified that it will build server products in both air-cooled DGX and liquid-cooled MGX configurations.

In addition, Jensen Huang disclosed more detailed Blackwell architecture figures than at the earlier GTC conference:

• The new DGX delivers 45 times the AI compute of the previous generation, reaching 1440 PFLOPS, while drawing only 10 times the power;

• The new-generation DGX houses 72 GPUs, linked by an NVLink spine carrying 5,000 cables, which saves about 20kW of power per rack (a quick sanity check of these figures follows below);
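A quick sanity check on those figures, using nothing beyond the numbers quoted in the list above:

```python
# Sanity check on the DGX figures quoted above (1440 PFLOPS, 72 GPUs,
# 45x compute at 10x power). Only the article's numbers are used; the derived
# ratios are simple arithmetic, not additional NVIDIA disclosures.
total_pflops = 1440
num_gpus = 72
print("Per-GPU compute:", total_pflops / num_gpus, "PFLOPS")          # 20.0 PFLOPS per GPU

compute_gain = 45   # 45x the previous generation's AI compute
power_gain = 10     # at only 10x the power draw
print("Implied perf-per-watt gain:", compute_gain / power_gain, "x")  # 4.5x
```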

V. Software Development Platform

Software is not only NVIDIA's moat; it is also becoming a huge business in its own right.

These software businesses include: CUDA, NIM, Omniverse, etc. (see the image below).

At the conference, NVIDIA again emphasized the importance of NIM and Omniverse:

  1. NVIDIA NIM inference microservices can cut the time it takes enterprises to deploy generative AI applications from days to minutes (see the API sketch after this list);

  2. Omniverse: a virtual-world simulation development platform that minimizes the gap between simulation and reality. Developers can test, train, and integrate everything inside Omniverse; as the demo video put it, robots can learn how to be robots in the virtual world;
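For context on the NIM point, the sketch below shows what calling a deployed NIM inference microservice typically looks like, assuming a locally running container that exposes an OpenAI-compatible endpoint on port 8000; the URL and model name are illustrative assumptions, not configuration from the article.

```python
# Minimal sketch: querying a locally deployed NIM inference microservice.
# Assumes the container exposes the usual OpenAI-compatible API on port 8000;
# the endpoint URL and model name below are illustrative assumptions.
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta/llama3-8b-instruct",      # hypothetical model name
        "messages": [{"role": "user", "content": "Summarize NVIDIA's Rubin roadmap."}],
        "max_tokens": 128,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```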

Looking ahead, NVIDIA is also building out robotics and AI application platforms such as Earth-2, its digital twin of the planet. Through continued innovation and exploration, NVIDIA is positioned to play an even greater role in advancing technology worldwide and improving everyday life.