H100 rental prices fall: has the "GPU bubble" burst?

Huxiu
2024.10.15 01:41

The rental price of the H100 GPU has dropped from $8 per hour to around $2, intensifying market concerns about a "GPU bubble bursting." The reasons include companies with long-term bookings reselling idle computing power, enterprises cutting back on training new models, a shrinking number of startups focused on large-scale base models, and the emergence of alternatives such as AMD and Intel. The information comes from the Latent Space website; the original author is Eugene Cheah, whose company, Featherless.Ai, provides open-source AI model services.

Recently, a report titled "Renting H100s for $2 an Hour: The Eve of the GPU Bubble Burst" has drawn considerable attention in China. The article makes the following points:

After NVIDIA's H100 GPU was released in March 2023, surging demand and tight supply pushed its rental price from an initial $4.70 per hour to more than $8 per hour. Since the beginning of this year, however, the H100 has become oversupplied, and the hourly rental price has fallen to around $2.
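To put the gap in rough, purely illustrative terms (the hourly rates are the article's figures; the cluster size and duration below are hypothetical assumptions): keeping a 512-GPU cluster busy for 30 days amounts to about 368,640 GPU-hours, which would cost roughly $2.9 million at $8 per hour but only about $740,000 at $2 per hour, a reduction of around 75%.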

The price drop has multiple causes: 1) some companies with long-term H100 bookings are reselling idle computing power after finishing model training; 2) many companies no longer train new models from scratch and instead fine-tune open models, greatly reducing their demand for compute; 3) the number of new startups building large-scale base models has fallen sharply; 4) alternatives to the H100 have appeared, such as GPUs from AMD and Intel.

Tracing the report to its source shows that mainstream overseas outlets and major tech media have not yet covered it. The original article, titled "$2 H100s: How the GPU Bubble Burst," appeared on a website called Latent Space, and its author is Eugene Cheah.

According to the site's own introduction, Latent Space focuses on AI and combines a newsletter, blog, and community. It is co-hosted by swyx and Alessio Fanelli; the former's social media profile gives no specific identification, while the latter is a partner and CTO at the early-stage venture capital firm Decibel VC.

The original author, Eugene Cheah, is the CEO of the startup company Featherless.Ai.

According to Cheah's self-introduction at the end of the "GPU bubble" article, Featherless.Ai currently hosts the world's largest collection of open-source AI models, "starting at $10 per month, with immediate access, unlimited requests, and a fixed price; models can be run instantly through serverless inference, with no need for expensive dedicated GPUs."

I. Does the Decline in H100 Rental Prices Mean the GPU Bubble Has Burst?

The original "GPU bubble" article includes a painting by the French artist Jean-Léon Gérôme, "The Tulip Folly," created in 1882.

The painting depicts the first recorded speculative bubble in history: the "Tulip Mania" of the 17th-century Netherlands. Tulip prices climbed steadily from 1634 before crashing in February 1637, leaving speculators with only 5% of their initial investment.

Will a speculative bubble from more than three hundred years ago repeat itself? The question weighs on every AI investor, and it is perhaps why the report on falling H100 rental prices has drawn so much attention.

Quotes on the compute-rental marketplace Vast.ai show that the hourly price for a single H100 has indeed fallen into the range of $2 to $3.

Vast.ai quotes

But can the decline in H100 rental prices really be equated with the "GPU bubble bursting"?

On the one hand, going by Eugene Cheah's article, "differentiation" may be a more accurate description of the H100 price decline: rental prices for small clusters keep falling, while prices for large-scale computing clusters may still hold at high levels.

Behind these large-scale computing clusters are usually tech giants such as Tesla, Microsoft, and OpenAI. Omdia data show that in the third quarter of 2023, after the H100's release, shipments reached 650,000 units, with Meta and Microsoft alone securing 150,000 each, together accounting for nearly half.

On the other hand, electronic products have refresh cycles, and GPU chips are no exception. Earlier reports suggested that design flaws in NVIDIA's next-generation Blackwell GPUs could delay shipments. Last week, however, Morgan Stanley said in a report that Blackwell production is "proceeding as planned" and that supply for the next 12 months is already sold out, meaning customers who place orders now will not receive their chips until the end of 2025, "which will continue to drive strong near-term demand for existing Hopper-architecture products."

In fact, H100 leasing prices did not plunge overnight; they have been sliding for some time. From the A100 to the H100, from the H100 to the H200, and on to the coming Blackwell, each new generation inevitably drags down the price of the previous one, and Blackwell's computing costs are expected to be lower still than Hopper's.

Jensen Huang, NVIDIA's "helmsman," also spoke out recently. In an interview with Altimeter Capital, he stressed that the sustained enthusiasm for NVIDIA is completely different from the frenzy around Cisco at the height of the internet bubble: NVIDIA is "reshaping computing," and the future will be the era of "machine learning."

"The Moore's Law has basically come to an end," he said, noting that in order to provide the necessary computing power to keep up with the pace of future compute-intensive software, existing data centers will need around $1 trillion worth of GPUs for upgrades in the next 4-5 years.

It must be acknowledged that the alarm bells of an "AI bubble" keep ringing, and doubts that "returns on AI investment are falling short of expectations" are growing louder. On one side, OpenAI complains that computing power is insufficient and slow to arrive, and NVIDIA's new products sell out; on the other, compute rental prices keep falling and companies are "dumping" GPUs.

However, localized and short-term gluts or shortages of computing power no longer seem to represent the overall state of AI. In a field where supply and demand, bulls and bears, are constantly trading places, what may be needed most are new stories beyond the hardware.