Competition intensifies: report says Amazon is persuading cloud customers to move away from NVIDIA and switch to its own chips
Like other cloud service providers, Amazon primarily rents servers powered by NVIDIA AI chips to developers and enterprises. However, media reports indicate that Amazon is now trying to persuade these customers to switch to servers driven by Amazon's self-developed AI chips.
According to The Information, Gadi Hutt, head of business development for Amazon's chip division Annapurna, stated that some tech companies, including Apple, Databricks, Adobe, and Anthropic, which are looking for alternatives to NVIDIA chips, have been testing Amazon's latest AI chips and have achieved encouraging results.
Hutt mentioned at the annual AWS customer conference, "Last year, people began to realize, 'Hey, Amazon's investment in AI chips is serious.' This week, more people believe this is a real and ongoing investment."
Analysts believe that if Amazon can shift customer spending to its self-developed server chips, it will enhance Amazon's profit margins, as these chips are cheaper for cloud customers due to their significantly lower power consumption compared to NVIDIA chips. Additionally, this could prevent NVIDIA from capturing more cloud market share by directly renting its chip servers to enterprises.
NVIDIA's dominance in the AI chip market has been difficult to shake, partly because its chips are more powerful than those produced by competitors, including Amazon, Microsoft, and Google. At the same time, software developers are accustomed to writing software for its chips using NVIDIA's proprietary CUDA programming platform.
Nevertheless, Hutt and other Amazon executives stated this week that large customers are seeking cheaper alternatives. AWS CEO Matt Garman noted that using Amazon's AI chips costs 30% to 40% less than NVIDIA's flagship H100 chips while achieving comparable performance.
Amazon has already established a track record in traditional server chips and has successfully persuaded customers to rent them. In recent years, AWS customers have increasingly chosen Amazon's Graviton server chips over servers powered by Intel and AMD chips, as Graviton typically offers better price-performance.
For example, enterprise software company Databricks has become a significant customer of Graviton, with its executive Naveen Rao stating plans to use Amazon's new AI chips to reduce software operating costs.
Amazon's initiative to develop chips (including the AI chip Trainium) is part of its broader strategy to transform the "building blocks" of computing, from servers to cloud software, into inexpensive, generic commodities. Similarly, Amazon CEO Andy Jassy announced a new conversational AI model developed by Amazon this week, stating that its performance is comparable to the latest models from Anthropic and OpenAI, but at less than a third of the cost.
Hutt also discussed the company's new Trainium chip and the supercomputing server cluster that AWS is building for Anthropic. Anthropic is a competitor of OpenAI and has been a significant contributor to AWS's revenue growth in recent years; it is currently one of the largest users of NVIDIA servers on AWS. Here is an excerpt from the interview with Gadi Hutt:
1. With Trainium2, the latest version of Amazon's AI chip, why are you focusing on selling to companies that already spend heavily on NVIDIA chips?
Hutt: Customers who are concerned about machine learning costs are typically those who spend a lot, including Apple, Adobe, Databricks, and some well-funded startups like Poolside and Anthropic.
For them, the key metric is how much performance they can get for every dollar spent. There are many other customers, whom we call "long-tail customers," who have various projects that are very well suited to our chips. But perhaps their monthly spending is only $1,000, in which case it is not worth their engineers' time to explore this option.
In fact, at this stage of Trainium2's lifecycle, I am not looking to attract millions of customers. In terms of machine learning, we are still in a very early stage. People are still trying to tackle general artificial intelligence (AGI) and a variety of ideas, and this field is constantly evolving.
We cannot support all use cases from day one. If customers try to run something that doesn't work properly, it would be a very bad experience. So we focus on listening when our largest customers tell us, "Hey, this is what we need," which often serves as a good forecast of future demand across the entire market.
2. What are the goals for Trainium2 next year?
Hutt: When we deploy a large number of chips, our goal is to ensure they are fully utilized. So we first work with these large customers and then expand to what I call "long-tail customers." For us as a chip maker, the measure of success is ensuring that all chips are fully utilized. Whether it's 10 customers or 1,000 customers, the number is secondary.
This is a marathon, not a sprint. Over time, we hope to see more and more customers. I won't set internal targets dictating how many customers need to be enabled. We are more focused on ensuring that we provide the right tools and performance for customers, and the adoption rate will naturally increase.
3. Why didn't the first generation of Trainium chips gain traction? What is different about the second generation?
Hutt: First of all, this was our first training chip. You can compare Trainium1 (released in 2022) with Graviton1 (released in 2019); it's the same story. Graviton1 was actually designed to enable the entire ecosystem, including the software ecosystem, and to ensure that we built the right products for our customers.
The customers of Trainium1 (including teams within Amazon) helped us strengthen the software, but the work is still not complete. We still have a lot of work to do to support more workloads. However, we can now say that we are very satisfied with the workloads Trainium2 supports, including large language models (LLMs), mixture-of-experts models, multimodal models, and computer vision models. This takes time and is quite complex. If it were easy, more people would have done it by now.
4. Are AWS customers considering renting Trainium2 or NVIDIA's Blackwell chips next year?
Hutt: Customers like to have options. Our job is to ensure that our chips remain attractive even when compared to NVIDIA's latest chips, and they certainly are at the moment.
By the way, we haven't seen the 72-chip Blackwell systems go live yet, but assuming NVIDIA can deliver, Trainium2 will still be more cost-effective.
Trainium3 (expected to be released by the end of 2025) will have four times the computing power of Trainium2, so customers are aware of our roadmap. They are confident that this is a direction worth investing in; otherwise, they wouldn't choose it.
5. Do you think the demand for NVIDIA GPUs will change?
Hutt: There are many customers who want to use NVIDIA chips and are reluctant to learn about Trainium chips. If you are a small GPU consumer using 10, 20, 30, or even 100 GPUs consistently, there is no incentive to change the status quo. Even if you could save a few thousand dollars a month, you might prefer to have engineers focus on other things.
When customers care about cost, it usually happens when they start to scale up, but there aren't many large-scale customers. So for us, these chips are a long-term investment to ensure we provide options for our customers. It's great if customers choose to use them, but if they don't, we are still the best platform for running GPUs.
Our software maturity will improve over time, and we hope that more customers will choose to use Trainium then. But GPUs are also a good business for us, and we are selling a lot. So if customers want us to provide GPUs for them, we will always do so.
6. Will using Trainium chips improve AWS's profit margins?
Hutt: We do not disclose specific profit margins, but we are not losing money on these chips. The business must have a reason to exist; otherwise, we wouldn't be investing here.
7. When did customers start showing interest in Trainium2?
Hutt: I remember the first meeting with Poolside (an AI coding assistant startup). When we showed them the specifications of Trainium2, they said, "Well, this is exactly what we need."
8. What is the power consumption of Rainier, the supercomputing cluster project for Anthropic?
Hutt: We do not disclose specific data. But I can tell you that it is 50% more efficient than equivalent GPUs.
9. When will the supercomputer be operational for Anthropic?
Hutt: The Rainier project will be completed soon; construction is already under way. Anthropic can start using parts of the cluster without waiting for the last chip to come online. As the cluster expands, they can gradually increase usage.
10. Is Anthropic the only company that can use the Rainier project?
Hutt: Yes, it is exclusively for Anthropic.
We are building more capacity to meet demand from other customers for Trainium. Currently, short-term demand exceeds supply, so the first quarter will be very tight, but the situation will improve as capacity increases.