NVIDIA Rubin is expected to be released six months early. A new era of AI computing power is about to arrive?

Zhitong
2024.12.06 08:42

NVIDIA's next-generation AI GPU architecture "Rubin" may be released six months ahead of schedule, in the second half of 2025. Although the current Blackwell architecture has not yet shipped at scale, NVIDIA is accelerating its AI GPU development to consolidate its dominance in the data center market. The Rubin architecture is expected to adopt advanced CPO technology and HBM4, offering unprecedented performance and potentially ushering in a new era of AI computing power.

According to media reports citing informed sources, the highly anticipated next-generation AI GPU architecture "Rubin" from NVIDIA (NVDA.US) may be officially released six months earlier than expected, in the second half of 2025. Although the Blackwell architecture AI GPU has not yet shipped at scale and has reportedly faced thermal issues, NVIDIA seems determined to accelerate its AI GPU roadmap. Facing fierce competition from AI chip rivals such as AMD, Amazon, and Broadcom, the "green giant" is attempting to strengthen its absolute dominance in the data center AI chip market, where it currently holds a near-monopoly share of 80%-90%.

Although the Blackwell architecture AI GPU may not reach large-scale production until the first quarter of next year, even with the concerted efforts of core suppliers such as TSMC, Foxconn, and Wistron, the wave of self-developed AI chips from cloud giants like Google and Amazon has made NVIDIA more committed than ever to maintaining its dominant position in the data center AI chip market. NVIDIA shareholders, for their part, also need new catalysts to push the stock price toward $200.

Many leaders in the AI industry, including OpenAI and Microsoft, as well as Wall Street investment banks like Morgan Stanley, have begun discussing how powerful the performance of NVIDIA's next-generation architecture Rubin will be. Some industry chain analysts believe that the Rubin architecture AI GPU, relying on Co-Packaged Optics (CPO) technology and HBM4, combined with TSMC's 3nm and next-generation CoWoS advanced packaging, represents "unprecedented performance" and could usher in a new era of AI computing power, with competitors potentially needing years to catch up.

According to sources within the industry chain, NVIDIA's Rubin architecture product line was originally scheduled for release in the first half of 2026, but the company has now asked the supply chain to begin early testing work, aiming for an official launch in the second half of 2025. Driven by the almost limitless "explosive demand" for AI training and inference computing power from AI, cloud computing, and internet giants like OpenAI, Anthropic, xAI, and Meta, NVIDIA is accelerating development of its next-generation AI GPU, which will offer higher performance, larger memory capacity, stronger inference efficiency, and better energy efficiency, and is attempting to shorten the cadence between successive AI GPU architectures.

Although NVIDIA has not officially responded, the Rubin news is lent credibility by information revealed earlier this month by storage chip manufacturing giant SK Hynix regarding the potential early production and delivery of HBM4. HBM connects multiple stacked DRAM dies through 3D stacking technology, enabling high-speed, high-bandwidth data transmission through fine Through-Silicon Vias (TSVs), allowing large AI models to run more efficiently and continuously around the clock.

According to reports, SK Group Chairman Choi Tae-won stated in an interview in early November that NVIDIA CEO Jensen Huang had asked SK Hynix to launch its next-generation high-bandwidth memory product, HBM4, six months ahead of schedule. As the core HBM memory supplier for NVIDIA's H100/H200 and the recently produced Blackwell AI GPU, SK Hynix has been leading the global storage chip capacity race to meet explosive demand for HBM from major clients such as NVIDIA, AMD, and Google, as well as demand for enterprise-grade storage products like data center SSDs. These storage-grade chips are considered core hardware for processing the massive amounts of data needed to train increasingly powerful AI large models and to serve surging cloud AI inference demand.
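The bandwidth advantage of stacked HBM can be seen with simple arithmetic. The sketch below uses publicly quoted figures for HBM3E (a 1024-bit interface at roughly 9.2 Gb/s per pin) and the widely reported expectation of a doubled 2048-bit interface for HBM4; the HBM4 per-pin rate is an assumption for illustration, not a confirmed specification.

```python
# Illustrative per-stack bandwidth arithmetic for HBM generations.
# Interface widths and per-pin data rates are assumed/publicly quoted
# figures, not confirmed HBM4 specifications.

def hbm_stack_bandwidth_gbps(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak per-stack bandwidth in GB/s: width (bits) * rate (Gb/s per pin) / 8."""
    return bus_width_bits * pin_rate_gbps / 8

# HBM3E: 1024-bit interface, ~9.2 Gb/s per pin -> ~1.18 TB/s per stack
hbm3e = hbm_stack_bandwidth_gbps(1024, 9.2)

# HBM4 (assumed): 2048-bit interface at a conservative ~8 Gb/s per pin
hbm4 = hbm_stack_bandwidth_gbps(2048, 8.0)

print(f"HBM3E per stack: {hbm3e:.0f} GB/s")  # ~1178 GB/s
print(f"HBM4  per stack: {hbm4:.0f} GB/s")   # ~2048 GB/s
```

Even at a flat per-pin rate, doubling the interface width roughly doubles per-stack bandwidth, which is why the wider HBM4 interface matters so much for feeding larger AI models.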

Before the latest Rubin news, NVIDIA had settled into a "one generation per year" update rhythm for its AI GPU architectures, releasing a new generation of data center AI GPU products every year, which explains the cadence across the Ampere, Hopper, and Blackwell architectures; with Rubin, however, this rhythm may change drastically.

Insiders did not specify exactly why NVIDIA plans to launch Rubin ahead of schedule, describing it only as a business decision. From a supply chain perspective, however, Rubin is expected to adopt TSMC's 3nm process and the epoch-making HBM4, and potentially to become the world's first data-center-grade AI chip to use CPO with advanced wafer-level packaging. These critical elements are either already in preparation (TSMC's 3nm is ready and HBM4 may be in the testing phase) or confirmed for mass production, such as CPO packaging. With all the "tools" for Rubin potentially in hand, Jensen Huang may see little reason to wait until 2026.

According to NVIDIA's product roadmap disclosed at GTC, the upgraded version of Blackwell—the "Blackwell Ultra" product line, specifically the "B300" series—is set to debut in mid-2025. Thus, we may see the release of Blackwell Ultra very close to that of Rubin. The current release strategy is still unclear, but some professionals from Wccftech and The Verge suggest that NVIDIA may focus on the Rubin architecture and view the B300 series as a transitional product. Following NVIDIA's usual practice, the company is expected to announce more updates soon, possibly around the 2025 Consumer Electronics Show (CES).

Blackwell is already very powerful! But Rubin may usher in a new era of AI computing power.

The Blackwell architecture AI GPU series is undoubtedly the "performance ceiling" of current AI computing infrastructure, just as Hopper was before Blackwell's release. With CPO and 3nm, the significantly enhanced performance of HBM4 over HBM3E, and next-generation CoWoS support, Rubin chips may be unimaginably strong even before considering the architecture's own upgrades. For NVIDIA's earnings outlook, Rubin may accordingly drive Wall Street to significantly raise its fundamental expectations for 2026.

As a benchmark, Blackwell is already far stronger than Hopper. In MLPerf Training benchmarks, Blackwell delivers twice the per-GPU performance of Hopper on the GPT-3 pre-training task, meaning that with the same number of GPUs, Blackwell completes model training faster. On the LoRA fine-tuning task for the Llama 2 70B model, Blackwell's per-GPU performance is 2.2 times Hopper's, indicating higher efficiency on specific high-load AI tasks. In MLPerf Training v4.1, Blackwell's per-GPU performance in the graph neural network and text-to-image benchmarks is 2 times and 1.7 times Hopper's, respectively.
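The practical effect of those per-GPU multipliers can be shown with simple arithmetic: how many newer GPUs are needed to match a given cluster's training throughput. The sketch below uses the MLPerf multipliers quoted above; the baseline cluster size is a hypothetical example, and real deployments would also see scaling losses that this ignores.

```python
import math

# Per-GPU speedup factors of Blackwell over Hopper, as quoted from
# the MLPerf Training results above.
SPEEDUP = {
    "GPT-3 pre-training": 2.0,
    "Llama 2 70B LoRA fine-tuning": 2.2,
    "Graph neural network": 2.0,
    "Text-to-image": 1.7,
}

HOPPER_GPUS = 1024  # hypothetical baseline cluster size

for task, factor in SPEEDUP.items():
    # GPUs needed to match the baseline cluster's throughput
    # (ignoring multi-GPU scaling losses, which grow with cluster size).
    needed = math.ceil(HOPPER_GPUS / factor)
    print(f"{task}: {needed} Blackwell GPUs to match {HOPPER_GPUS} Hopper GPUs")
```

On the quoted figures, roughly half as many Blackwell GPUs match a Hopper cluster on most tasks, which is why per-GPU multipliers translate directly into data center cost and power savings.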

According to informed sources and a supply-chain report following Morgan Stanley's research, the Rubin architecture AI GPU is planned to adopt TSMC's latest 3nm technology, CPO packaging, and HBM4. Rubin's chip size may be nearly twice that of Blackwell, and it may contain four core compute dies, double the count of the Blackwell architecture. Informed sources revealed that the 3nm Rubin architecture is expected to enter the tape-out phase in the second half of 2025, about six months earlier than NVIDIA previously expected.

From the currently disclosed information, the biggest highlight of the Rubin architecture is undoubtedly Co-Packaged Optics (CPO). The interconnects of Hopper and Blackwell still rely primarily on improved NVLink and electrical chip-to-chip links, rather than transmitting data directly over optics.

Rubin is likely to be the world's first data-center-grade AI chip to adopt CPO with advanced wafer-level packaging, and the data transmission and energy efficiency gains CPO brings may represent a dramatic leap over NVLink. In a CPO package, optical components (lasers, optical modulators, optical fibers, and photodetectors) are packaged directly alongside the core compute chips (GPUs or CPUs) rather than sitting outside the chip as separate devices. These optical components carry data as optical signals, replacing traditional electrical signaling: they enable high-speed chip-to-chip transmission, sharply reduce signal loss between the chip and the optical interface, multiply data throughput, and greatly cut power consumption. Because optical signaling provides higher data bandwidth than electrical transmission, CPO is crucial for artificial intelligence, big data, and high-performance computing (HPC) applications, especially those requiring large-scale parallel computing. CPO packaging is therefore considered a core highlight of NVIDIA's Rubin architecture AI GPU, promising extremely high bandwidth, low latency, and significantly improved energy efficiency for the next generation of AI and HPC. Industry insiders believe that because CPO better addresses data transmission rate and power consumption issues, its adoption will further strengthen NVIDIA's leading position in the data center AI chip market.
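A rough way to see why co-packaged optics matters is to compare link energy per bit. The figures below (electrical SerDes at roughly 5 pJ/bit versus co-packaged optical links at roughly 1 pJ/bit) are order-of-magnitude assumptions commonly cited for this class of interconnect, not NVIDIA specifications, and the off-chip bandwidth is a hypothetical example.

```python
# Back-of-envelope I/O power comparison for a GPU moving data off-chip.
# The pJ/bit figures are illustrative assumptions, not product specs.

def io_power_watts(bandwidth_tbps: float, energy_pj_per_bit: float) -> float:
    """I/O power in watts: bandwidth (Tb/s) * energy (pJ/bit)."""
    # 1 Tb/s * 1 pJ/bit = 1e12 bit/s * 1e-12 J/bit = 1 W
    return bandwidth_tbps * energy_pj_per_bit

bandwidth = 10.0  # Tb/s of off-chip traffic (hypothetical)

electrical = io_power_watts(bandwidth, 5.0)  # ~5 pJ/bit electrical SerDes
optical = io_power_watts(bandwidth, 1.0)     # ~1 pJ/bit co-packaged optics

print(f"Electrical I/O: {electrical:.0f} W")  # 50 W
print(f"Optical I/O:    {optical:.0f} W")     # 10 W
```

At data center scale, multiplying this kind of per-chip saving across tens of thousands of GPUs is what makes the interconnect's energy per bit, not just its raw bandwidth, the figure of merit.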