Targeting NVIDIA! Report: Amazon is deploying 100,000 second-generation self-developed chips to challenge the industry monopoly
Amazon's Trainium2 chips have begun deployment and are expected to be fully rolled out soon in several core data centers, including those in Ohio. Compared with the previous generation, performance has quadrupled, memory capacity has tripled, and the chip holds significant advantages in energy efficiency and cost.
The AI competition among tech giants is in full swing. According to Bloomberg, Amazon is quietly advancing an ambitious plan aimed at challenging NVIDIA's current monopoly in the AI chip sector.
The report states that Amazon is ramping up development of its new AI chip, Trainium2, at an engineering lab in Austin, Texas. Compared with the previous generation, its performance has quadrupled, its memory capacity has tripled, and it boasts significant advantages in energy efficiency and cost.
Through these optimizations, Amazon hopes to lower AI chip procurement costs and improve the overall efficiency of its data processing.
However, to truly challenge NVIDIA's leadership in the AI hardware market, Amazon still faces significant challenges.
Trainium2 Performance Significantly Improved, Testing and Delivery Planned by Year-End
Amazon's chip design lead, Rami Sinno, is currently heading a team to accelerate development of the company's second-generation self-developed AI chip, Trainium2.
Sinno stated in an interview with Bloomberg that their goal is to deploy these chips in data centers as soon as possible, with plans to complete testing and delivery by the end of this year.
The chip is the third generation of Amazon's AI hardware products, aimed at providing a more efficient and cost-competitive option for training machine-learning models.
Amazon's chip business is led by James Hamilton, who was one of the pioneers in cloud computing.
Hamilton's team floated the idea of developing chips in-house as early as 2013. Amazon's first AI chip, Inferentia, launched in 2019 and focused on inference tasks, while the Trainium series is aimed primarily at the demands of training machine-learning models.
Amazon's Trainium2 chips have now begun arriving in data centers and are expected to be fully rolled out soon across several core sites, including in Ohio. Amazon's goal is to cluster as many as 100,000 Trainium2 chips together.
Amazon says Trainium2 delivers four times the performance of the previous generation and three times the memory capacity, along with significant advantages in energy efficiency and cost.
Through these gains, Amazon hopes to cut AI chip procurement costs and improve the overall efficiency of data processing. Analysts believe that if Trainium2 can absorb more of Amazon's internal AI workloads, plus the occasional project from large AWS clients, it is likely to be judged a success.
Amazon's Path in AI Hardware is Long and Challenging
Analysts believe that to truly challenge NVIDIA's leadership in the AI hardware market, Amazon still faces significant challenges.
First, designing reliable AI chips is an extremely complex task, especially when balancing performance, energy efficiency, and cost.
Second, software support is just as crucial. Although Amazon's Trainium series has made progress on the hardware front, its software still falls short of NVIDIA's mature tooling (above all CUDA). Analysts suggest that Amazon's Neuron SDK remains at an early stage and cannot yet compete with NVIDIA's solutions.
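To make the software-stack comparison concrete, here is a minimal sketch (our illustration, not code from Amazon or Bloomberg) of a single training step through the Neuron SDK's PyTorch path, which builds on PyTorch/XLA via the torch-neuronx package. The toy model, batch, and hyperparameters below are hypothetical placeholders; the point is structural: where CUDA users write model.to("cuda") and run eagerly, Neuron routes work through an XLA device and defers execution until the graph is flushed.

```python
# Minimal sketch, assuming a Trn1 instance with the AWS Neuron SDK installed.
# The model, data, and hyperparameters are placeholders for illustration only.
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm  # XLA backend the Neuron SDK plugs into

device = xm.xla_device()  # a NeuronCore, playing the role "cuda" plays on NVIDIA

model = nn.Linear(512, 10).to(device)  # toy model; a real job would load an LLM
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(8, 512, device=device)         # dummy batch
labels = torch.randint(0, 10, (8,), device=device)  # dummy targets

optimizer.zero_grad()
loss = loss_fn(model(inputs), labels)
loss.backward()
optimizer.step()
xm.mark_step()  # XLA executes lazily: compile and run the accumulated graph here
```

Functionally this is only a few extra lines, but it hints at the gap the analysts describe: the lazy-compilation model, and the debuggers, profilers, and kernel libraries built around it, are precisely where CUDA's long head start shows.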
To overcome this technological gap, Amazon is actively collaborating with large customers and partners to promote the application of its AI chips. Well-known companies such as data analytics firm Databricks and AI startup Anthropic have begun trialing Amazon's Trainium chips and have achieved preliminary results in some projects.
Tom Brown, Chief Compute Officer at Anthropic, stated:
“We are impressed by the cost-effectiveness of Amazon's Trainium chips. We have been steadily expanding their application across an increasingly broad range of workloads.”