AI GPU demand is soaring; NVIDIA urges SK Hynix to deliver HBM4 six months ahead of schedule
NVIDIA CEO Jensen Huang has asked SK Hynix to launch its next-generation high-bandwidth memory product, HBM4, six months ahead of schedule to meet strong demand for AI GPUs. SK Hynix plans to supply HBM4 to major customers in the second half of 2025. NVIDIA's share of the AI GPU market is as high as 80%-90%, underscoring the urgent need for higher-performance, more energy-efficient memory.
According to Zhitong Finance APP, Choi Tae-won, chairman of South Korean conglomerate SK Group, said in an interview on Monday that NVIDIA CEO Jensen Huang had asked SK Hynix, the memory-chip giant under SK Group, to launch its next-generation high-bandwidth memory product, HBM4, six months ahead of schedule. In its October financial report, SK Hynix said it plans to supply HBM4 to major clients (speculated to be NVIDIA and AMD) in the second half of 2025. A company spokesperson said on Monday that this timeline is indeed faster than the initial target, but did not elaborate further.
Huang's personal request that SK Hynix accelerate delivery of next-generation HBM4 highlights NVIDIA's enormous appetite for higher-capacity, more energy-efficient HBM to pair with its increasingly advanced AI GPU systems. It also underscores the almost limitless "explosive demand" for AI training and inference computing power from major AI, cloud computing, and internet players such as OpenAI, Anthropic, Microsoft, Amazon, and Meta. That demand is forcing TSMC (TSM.US), the core chip foundry for NVIDIA (NVDA.US), to work overtime to expand Blackwell AI GPU production capacity, and is pushing NVIDIA to accelerate development of its next-generation AI GPU with higher performance, larger memory capacity, stronger inference efficiency, and better energy efficiency.
NVIDIA plans to launch its next-generation AI GPU architecture, Rubin, in 2026, and Rubin AI GPUs are expected to be equipped with HBM4. By most industry estimates, NVIDIA currently holds a dominant 80%-90% share of the global data center AI chip market, with AMD accounting for nearly 10% and the remainder going to the tech giants' in-house AI chips, such as Google's TPU and Amazon's custom ASICs.
As the core HBM supplier for NVIDIA's H100/H200 and the recently mass-produced Blackwell AI GPUs, SK Hynix has been leading the global memory-chip race to meet explosive demand for HBM from major clients such as NVIDIA, AMD, and Google, as well as demand from other companies for enterprise-grade storage products like data center SSDs. These memory and storage products are essential hardware for processing the massive volumes of data used to train increasingly powerful large AI models and for meeting surging demand for cloud AI inference computing power.
However, SK Hynix also faces growing competitive pressure from Samsung Electronics and American memory giant Micron (MU.US). Last week, Samsung said in its financial report that it has made positive progress toward a supply agreement with a major client (possibly NVIDIA), after earlier product-testing delays and a failure to pass NVIDIA's qualification testing in the previous round. Samsung added that it is negotiating with major clients and may begin mass production of "improved" HBM3E products in the first half of next year, and that it plans to produce next-generation HBM4 in the second half of next year to keep pace with SK Hynix.

Micron is the other HBM supplier, alongside SK Hynix, to have qualified with NVIDIA. In February of this year, Micron began mass production of HBM3E designed specifically for artificial intelligence and high-performance computing, saying that some of NVIDIA's H200 and Blackwell AI GPUs would be equipped with Micron's HBM3E. Since then, Micron's CEO has repeatedly stated that the company's HBM capacity for this year and next has already sold out. Micron has also said it is accelerating development of next-generation HBM4 and HBM4E.
Around the same time as Choi Tae-won's interview, SK Hynix CEO Kwak Noh-jung said at the 2024 SK AI Summit in Seoul that the company plans to deliver the latest 12-layer HBM3E to a major customer (speculated to be NVIDIA) by the end of this year and to ship samples of more advanced 16-layer HBM3E early next year. He also revealed that NVIDIA's AI GPU supply still cannot keep up with demand, and that NVIDIA has repeatedly asked SK Hynix to accelerate its HBM3E supply.
The demand for HBM is explosive, akin to a "money printer."
NVIDIA founder and CEO Jensen Huang recently said in an interview that the Blackwell-architecture AI GPU is in full mass production and that demand is "insane." As AI GPU demand skyrockets, demand for HBM has surged in tandem and may remain in short supply for years to come.
According to well-known supply-chain analyst Ming-Chi Kuo of TF International Securities, the latest order data for NVIDIA's Blackwell GB200 chips indicates that Microsoft is currently the largest GB200 customer globally, with its Q4 orders surging 3-4x and surpassing the combined orders of all other cloud service providers.
In a recent report, Kuo said that Blackwell AI GPU capacity expansion is expected to begin in early Q4 of this year, with Q4 shipments projected at 150,000 to 200,000 units and Q1 2025 shipments expected to jump 200%-250% to 500,000-550,000 units. At that pace, NVIDIA could reach its sales target of one million AI server systems within just a few quarters, as the rough tally below illustrates.
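As a minimal back-of-envelope check, the snippet below combines only the shipment ranges cited in Kuo's report; the inputs are his estimates, not confirmed NVIDIA figures.

```python
# Back-of-envelope tally of the GB200 shipment estimates cited above (thousands of units).
# All inputs are Ming-Chi Kuo's reported estimates, not official NVIDIA data.
q4_2024_low, q4_2024_high = 150, 200      # projected Q4 2024 shipments
growth_low, growth_high = 2.00, 2.50      # projected Q1 2025 increase of 200%-250%

q1_2025_low = q4_2024_low * (1 + growth_low)      # 450k, near the cited 500k floor
q1_2025_high = q4_2024_high * (1 + growth_high)   # 700k, above the cited 550k ceiling

cumulative_low = q4_2024_low + q1_2025_low
cumulative_high = q4_2024_high + q1_2025_high
print(f"Two-quarter total: {cumulative_low:.0f}k-{cumulative_high:.0f}k units")
# ~600k-900k units after two quarters, so one million within a few quarters is plausible.
```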
According to the latest forecast from Wall Street financial giant Citigroup, data-center-related capital expenditures by the four largest U.S. technology giants are expected to rise at least 40% year-on-year in 2025. These massive outlays are largely tied to generative artificial intelligence, indicating that computing power demand from AI applications like ChatGPT remains substantial. Citigroup said this means the giants' data center spending will continue to expand significantly on top of already strong 2024 spending, and the firm expects the trend to provide substantial positive catalysts for the stock price of NVIDIA, the undisputed leader in AI GPUs, as well as for data center interconnect (DCI) technology providers.

HBM works in concert with the core compute hardware supplied by AI chip leader NVIDIA (H100/H200/GB200 AI GPUs) and a wide range of AI ASICs (such as Google's TPU) to form the AI servers that drive heavyweight applications like ChatGPT and Sora. The stronger the demand for AI GPUs, the more intense the demand for HBM becomes.
HBM is a high-bandwidth, low-power memory technology designed for high-performance computing and graphics processing. HBM uses 3D stacking: multiple DRAM dies are stacked vertically and interconnected through fine through-silicon vias (TSVs), enabling high-speed, high-bandwidth data transfer. Stacking the memory dies dramatically reduces the memory system's physical footprint and greatly lowers the energy consumed in moving data, while the high bandwidth significantly improves data transfer efficiency, allowing large AI models to run more efficiently around the clock. A rough sense of the bandwidth involved is sketched below.
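For a sense of scale, the sketch below estimates per-stack and per-GPU bandwidth from an assumed 1024-bit stack interface, an HBM3E-class pin speed, and an assumed stack count; these are illustrative figures, not vendor specifications for any particular product.

```python
# Illustrative HBM bandwidth estimate; parameter values are assumptions for
# illustration, not official specifications of any specific GPU or memory stack.
interface_width_bits = 1024   # bits per HBM stack interface
pin_speed_gbps = 9.6          # assumed per-pin data rate, Gb/s (HBM3E-class)
stacks_per_gpu = 8            # assumed number of HBM stacks on a high-end AI GPU

per_stack_gbytes_s = interface_width_bits * pin_speed_gbps / 8  # bits -> bytes
per_gpu_tbytes_s = per_stack_gbytes_s * stacks_per_gpu / 1000

print(f"Per-stack bandwidth: ~{per_stack_gbytes_s:.0f} GB/s")   # ~1229 GB/s
print(f"Per-GPU bandwidth:   ~{per_gpu_tbytes_s:.1f} TB/s")     # ~9.8 TB/s
```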
In particular, HBM offers exceptionally low latency, enabling rapid responses to data access requests. Generative AI models like GPT-4 typically need frequent access to large datasets and carry extremely heavy inference workloads, so low latency greatly improves the overall efficiency and responsiveness of AI systems. In the AI infrastructure sector, HBM is paired with NVIDIA's H100/H200 AI GPUs, AMD's MI300X AI GPUs, the now mass-produced NVIDIA B200 and GB200 AI GPUs, and AMD's MI325X.
Goldman Sachs, a major Wall Street investment bank, released a research report stating that, thanks to exceptionally strong enterprise demand for generative AI, AI server shipments have risen and the HBM density per AI GPU has increased. The firm has significantly raised its estimate of the total HBM market, now projecting that the market will grow at a compound annual growth rate of roughly 100% from 2023 to 2026, reaching $30 billion by 2026 (the back-of-envelope implication is sketched below). Goldman Sachs expects the supply-demand imbalance in the HBM market to persist for the next few years, benefiting major players SK Hynix, Samsung, and Micron.
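As a simple worked implication of the cited forecast, using only the numbers above, a 100% compound annual growth rate ending at $30 billion in 2026 points to a 2023 base of roughly $3.75 billion:

```python
# Implied 2023 HBM market size from the cited Goldman Sachs forecast.
cagr = 1.00           # 100% compound annual growth rate (cited)
market_2026_bn = 30   # projected 2026 market size, $ billions (cited)
years = 3             # 2023 -> 2026

market_2023_bn = market_2026_bn / (1 + cagr) ** years
print(f"Implied 2023 HBM market size: ~${market_2023_bn:.2f}B")  # ~$3.75B
```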