The Second Wave of the "Computing Power Tax": CPU Prices Rise

Wallstreetcn
2026.01.22 12:28

As AI evolves into autonomous task-executing agents, the computing power bottleneck is shifting from GPUs to CPUs: 80%-90% of an agent's task latency comes from CPU-side sandbox environments and tool calls. IDC predicts that demand for agents will grow roughly 70-fold in five years, translating into demand for tens of millions of CPUs. On the supply side, Intel's and TSMC's production capacity has reached its limit, with delivery cycles extended to 24 weeks. Under this supply-demand imbalance, CPU prices are rising by 10-15%.

The global semiconductor market is undergoing structural changes, with the CPU sector, traditionally regarded as a mature category, becoming the focus of capital markets.

As of January 21, Intel's stock price reached a nearly four-year high, up over 44% for the year; AMD continued its upward trend; in the A-share market, Loongson Technology hit its 20% daily limit-up and Haiguang Information gained over 13% in a single day. This rally reflects the market's repricing of the "computing power tax" transmission logic: following the surge in GPU demand driven by AI training, CPUs are becoming the bearers of the second wave of rising computing costs.

A consensus is rapidly forming among institutions. Guolian Minsheng Securities and Western Securities recently pointed out in their reports that the current supply and demand changes in the CPU market are not cyclical fluctuations but are driven by structural changes brought about by the large-scale application of AI agents.

Unlike AI training centered on GPUs, in agent workloads CPUs undertake a large amount of non-AI-native computing, including tool invocation, task orchestration, and real-time decision-making. This processing accounts for as much as 80%-90% of total task latency, meaning CPUs may become system-level performance bottlenecks even earlier than GPUs.

The demand outlook has already been supported by data. According to IDC's forecast, the number of active intelligent agents worldwide will rapidly grow from approximately 28.6 million in 2025 to 2.216 billion by 2030, with a compound annual growth rate of 139%. In a neutral scenario, the long-term corresponding CPU demand may exceed 11.73 million units, forming a significant incremental market.
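The growth rate implied by IDC's figures can be checked with basic compound-growth arithmetic:

```python
# Check the CAGR implied by IDC's forecast cited above:
# ~28.6 million active agents in 2025 growing to ~2.216 billion by 2030.
start = 28.6e6   # active agents, 2025
end = 2.216e9    # active agents, 2030
years = 5

cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.0%}")  # ~139%, matching the cited figure
```

Note that 2.216 billion / 28.6 million is roughly a 77-fold increase, consistent with the "roughly 70-fold in five years" framing in the summary.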

The supply side of CPUs is also facing extreme pressure. JP Morgan data shows that Intel's advanced process capacity utilization has reached an overloaded state of 120%-130%, while TSMC's advanced packaging capacity bottleneck has extended CPU delivery cycles from the normal 8-10 weeks to over 24 weeks.

In this trend, domestic CPU manufacturers are ushering in dual opportunities from industry and policy. CPUs, which have long been regarded as "traditional" computing components, are re-establishing their system-level value in the wave of AI agents.

AI Agents Catalyze the Restructuring of "External CPU" Demand

Traditional AI computing has placed the focus of computing power entirely on GPUs, primarily used for model training and inference acceleration. However, as AI evolves into agents with autonomous planning and execution capabilities, the structure of computing loads is undergoing fundamental reconstruction.

To complete a practical task, such as "analyzing a batch of resume data," an agent's workflow far exceeds a simple API call. It must autonomously execute a chain of steps: creating an isolated sandbox environment, accessing a designated cloud drive to download files, decompressing zip archives, running data analysis scripts, generating visual reports, and finally cleaning up and releasing environment resources. In this complete task chain, only the task decomposition and result generation stages rely on GPU inference; the intermediate steps—file operations, code execution, data processing, and system scheduling, which account for 80%-90% of the total process duration—are all handled by CPUs.
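The CPU-bound middle of such a task chain can be sketched in a few lines. This is a hypothetical, simplified illustration (the function name and in-memory "download" are assumptions, not any platform's real API); only the planning and summarization steps around it would touch a GPU:

```python
import io
import os
import shutil
import tempfile
import zipfile

def run_agent_task(archive_bytes: bytes) -> list[str]:
    """Sketch of the CPU-bound steps of an agent task:
    sandbox setup, file I/O, decompression, and cleanup."""
    sandbox = tempfile.mkdtemp(prefix="agent-sandbox-")  # isolated workspace
    try:
        # "Downloaded" archive is decompressed into the sandbox (CPU + I/O bound)
        with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
            zf.extractall(sandbox)
        # Analysis step would run here over the extracted files (CPU bound)
        return sorted(os.listdir(sandbox))
    finally:
        shutil.rmtree(sandbox)  # release sandbox resources

# Build a tiny in-memory zip so the sketch is self-contained
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("resume_001.txt", "Alice, 5 yrs Python")
    zf.writestr("resume_002.txt", "Bob, 3 yrs Go")
print(run_agent_task(buf.getvalue()))
```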

Intel's white paper "AI Perspective with CPU as the Core of Intelligent Agents" clearly points out that the latency of intelligent agent workloads mainly comes from tool processing tasks on the CPU side.

Unified Intelligent Agent Architecture Paradigm: Mainstream Platforms Fully Transition to "Sandbox Execution" Mode

As AI intelligent agents move from concept to large-scale application, the industrial technology architecture is undergoing fundamental reconstruction. According to industry research by Guotai Junan Electronics, since the second half of 2025, mainstream AI platforms, including Doubao and Zhipu, have fully transitioned to the "sandbox execution" architecture model. The core of this model is to create independent, isolated virtual execution environments for each intelligent agent task to safely complete external calls such as file operations, code execution, and network access. This architectural shift has directly given rise to new characteristics of computing power demand: CPU resource consumption is strongly correlated with user scale and task concurrency, while showing weak correlation with the number of GPUs.
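A minimal sketch of the "sandbox execution" idea described above: each task gets its own interpreter process and scratch directory, so CPU consumption scales with task concurrency rather than GPU count. This is an illustrative toy, not any platform's actual architecture; real systems add container or namespace isolation on top:

```python
import subprocess
import sys
import tempfile
from concurrent.futures import ThreadPoolExecutor

def execute_in_sandbox(code: str) -> str:
    """Run tool code in a separate interpreter process with its own
    scratch directory (per-task isolation, simplified)."""
    with tempfile.TemporaryDirectory(prefix="task-") as workdir:
        result = subprocess.run(
            [sys.executable, "-c", code],
            cwd=workdir, capture_output=True, text=True, timeout=10,
        )
        return result.stdout.strip()

# One sandbox per task: CPU-side cost tracks task concurrency
tasks = [f"print({i} * {i})" for i in range(4)]
with ThreadPoolExecutor(max_workers=4) as pool:
    outputs = list(pool.map(execute_in_sandbox, tasks))
print(outputs)
```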

Breakthroughs in engineering practice provide key technical support for this architectural evolution. The DeepSeek research team demonstrated a milestone "storage-computation separation" solution in their paper: successfully storing a 100 billion parameter embedding table entirely in the CPU host memory, rather than the traditional GPU memory. Through a sophisticated PCIe asynchronous data transmission mechanism, this solution incurs less than 3% additional inference latency, achieving a critical breakthrough in engineering feasibility.
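The idea can be sketched with a producer-consumer pattern: the full embedding table lives in host memory, and a background thread gathers the rows each batch needs while the previous batch is still being computed, overlapping "transfer" with compute the way an asynchronous PCIe copy would. This is a stdlib toy under assumed names, not DeepSeek's implementation:

```python
import threading
from queue import Queue

EMBED_DIM = 4
# Full embedding table held in (cheap) host memory, not GPU memory
host_table = {i: [float(i)] * EMBED_DIM for i in range(1000)}

def prefetch(batches, out: Queue):
    """CPU-side gather: fetch each batch's embedding rows ahead of time."""
    for ids in batches:
        out.put([host_table[i] for i in ids])
    out.put(None)  # end-of-stream sentinel

def run(batches):
    q: Queue = Queue(maxsize=2)  # small buffer: transfer overlaps compute
    threading.Thread(target=prefetch, args=(batches, q), daemon=True).start()
    results = []
    while (rows := q.get()) is not None:
        # Stand-in for the GPU compute step: reduce the gathered rows
        results.append(sum(sum(r) for r in rows))
    return results

print(run([[1, 2], [3, 4]]))
```

The key design point is that the consumer never waits on table storage directly; it only waits on the small staging queue, which is why the host-memory detour can add so little end-to-end latency.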

This technological breakthrough reveals two major industry trends: In terms of technical pathways, the dependence of model parameter scale on GPU memory capacity has been effectively broken, making more cost-effective host memory a viable option for large-scale parameter storage; In terms of system architecture, the role of the CPU has fundamentally changed, transforming from an auxiliary computing unit to the core hub of data scheduling and system management, undertaking key functions such as real-time retrieval, intelligent filtering, and efficient forwarding of massive parameters.

Supply-Demand Imbalance Accelerates Price Increase Expectations

The dramatic change in demand structure coincides with a dual squeeze from supply-side capacity bottlenecks.

According to TrendForce's January 2026 supply chain monitoring report, TSMC's advanced-process capacity (N2, N3) is already booked through 2027 by giants such as Apple, NVIDIA, and Broadcom. Because the "per-wafer output value" of high-end GPUs and custom ASICs is significantly higher than that of traditional CPUs, foundry capacity allocation is clearly biased against CPUs. Meanwhile, bottlenecks in advanced packaging technologies such as CoWoS have further tightened the supply chain—IDC analysis indicates that packaging capacity utilization had already exceeded 100% by the fourth quarter of 2025, stretching CPU shipment cycles from the normal 8-10 weeks to over 24 weeks.

Intel's internal ecosystem is also under extreme pressure. As its 18A process enters peak mass production, the company must not only ensure supply for its Core and Xeon series but also fulfill commitments to external foundry customers such as Microsoft and Amazon. JP Morgan's research report notes that capacity utilization at Intel's core nodes has climbed to an overloaded 120%-130%, forcing some non-core components to be shifted to second-tier foundries like UMC. The latest industry commentary from Western Securities points out that, to address the supply-demand imbalance and ensure stable supply, Intel and AMD plan to raise server CPU prices by 10%-15%, and that both manufacturers' 2026 server CPU capacity has "basically been pre-sold."

In summary, as AI transitions from "content generation" to "task execution," the core demand for computing power is undergoing a structural shift—from GPU-centric parallel computing to CPU-centric system scheduling and resource coordination. With supply-side capacity at its physical limits and demand driven by exponential growth in agent applications, CPUs not only face sustained upward price pressure; their strategic value within the entire computing system is also being systematically reassessed.