Wallstreetcn
2023.11.13 15:38

NVIDIA launches new AI chip H200 with a major performance leap; stock on track to rise for a ninth consecutive trading day.

NVIDIA has given its popular H100 GPU a major upgrade to solidify its market-leading position. The new H200 GPU is the company's first to use HBM3e high-bandwidth memory and carries 141GB of it; NVIDIA says it runs inference on Llama 2 roughly twice as fast as the H100. Major computer manufacturers and cloud service providers are expected to start using it in the second quarter of next year.

NVIDIA released its next-generation AI supercomputing chip on Monday evening, Beijing time.

NVIDIA has made a major upgrade to its popular H100 AI GPU, and the latest high-end chip is named H200. Based on NVIDIA's "Hopper" architecture, it is the company's first GPU to use HBM3e high-bandwidth memory. This type of memory is faster and has a larger capacity, making it more suitable for processing large datasets, which is essential for developing large language models.

According to NVIDIA, the HBM3e-based H200 provides 141GB of memory at 4.8 TB per second, almost twice the capacity and 2.4 times the bandwidth of the A100.
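As a quick sanity check of those ratios, the arithmetic works out if we take the A100's publicly listed SXM figures (80GB of HBM2e at roughly 2.0 TB/s) as the baseline; those baseline numbers are an assumption not stated in the article, while the H200 figures are the ones NVIDIA quotes:

```python
# Spec ratios quoted in the article, computed from an assumed A100
# baseline (80 GB, ~2.0 TB/s) and NVIDIA's stated H200 figures.
a100_mem_gb, a100_bw_tbs = 80, 2.0
h200_mem_gb, h200_bw_tbs = 141, 4.8

capacity_ratio = h200_mem_gb / a100_mem_gb    # 1.7625, i.e. "almost twice"
bandwidth_ratio = h200_bw_tbs / a100_bw_tbs   # exactly 2.4

print(f"capacity:  {capacity_ratio:.2f}x")    # capacity:  1.76x
print(f"bandwidth: {bandwidth_ratio:.2f}x")   # bandwidth: 2.40x
```

Under that assumed baseline, "almost twice the capacity" (1.76x) and "2.4 times the bandwidth" are both consistent with the stated 141GB / 4.8 TB/s figures.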

In the much-watched field of artificial intelligence, NVIDIA says the H200 delivers a further performance leap: it runs inference on Llama 2, a 70-billion-parameter LLM, roughly twice as fast as the H100. Future software updates are expected to bring additional performance gains to the H200.

The H200 will be available in NVIDIA HGX H200 server motherboards with four-way and eight-way configurations, and it is compatible with the hardware and software of the HGX H100 system.

Major computer manufacturers and cloud service providers are expected to start using the H200 in the second quarter of next year. Amazon's AWS, Alphabet's Google Cloud, and Oracle's cloud infrastructure have all committed to using the new chip starting next year.

Ian Buck, NVIDIA's vice president for hyperscale and high-performance computing, said, "To create intelligence through generative AI and high-performance computing (HPC) applications, large and fast GPUs that can efficiently process large amounts of data are necessary. With the H200, the industry-leading end-to-end AI supercomputing platform can solve some of the world's most important challenges faster."

NVIDIA said the new product is an effort to keep pace with the growing size of the datasets used to build AI models and services. The added memory capacity lets the H200 ingest data faster during training, the process that teaches an AI to perform tasks such as image recognition and speech processing. As the head of NVIDIA's data center products put it, "When you observe what is happening in the market, you will find that models are rapidly expanding. This is another example of us rapidly launching the latest and most advanced technology."

NVIDIA's stock initially tracked the broader US market's slight early decline but quickly recovered, rising about 1.4% and putting it on pace for a ninth consecutive daily gain.

Booming demand for artificial intelligence has created enormous demand for NVIDIA's high-end GPUs. It has also prompted other chipmakers to target this lucrative market and accelerate their own AI chip launches, making competition across the AI chip market fierce. NVIDIA's move is aimed at consolidating its dominant position in AI computing.

AMD will launch its MI300 chip this quarter and has revealed that several hyperscale cloud service providers have committed to deploying it. According to an earlier Wallstreetcn report, industry insiders said the MI300, with its larger memory, performs better when serving the GPT-4 model with a 32K context window: compared with the H100, it holds a 20%-25% performance advantage, depending on context length and on the prompt and output token counts per query.

In addition, Intel claims that its AI chip Gaudi 2 is faster than the H100.

The market has also recently been watching NVIDIA's latest chips modified for the Chinese market: the HGX H20, L20 PCIe, and L2 PCIe. According to The Paper, informed sources said the three chips are derived from the H100. NVIDIA is expected to announce them after the 16th of this month, and Chinese manufacturers are expected to receive the products within the next few days. Multiple industry insiders have confirmed the existence of these modified chips.

In response, a Global Times editorial commented that some describe the back-and-forth between the US government and NVIDIA as a game of restriction and counter-restriction. That may be how it looks on the surface, but the metaphor obscures the essence of the matter and confuses right and wrong. The several rounds between NVIDIA and the US government are the story of a high-tech company doing legitimate business while facing strong interference, constraint, and disruption from political forces opposed to free trade, and doing its best to survive and grow. For a commercial company there is nothing amusing in this; if anything, it is somewhat sad. The US export controls on chips to China are unreasonable, harming not only China's interests but America's as well. More and more people see this and hope Washington will adjust course.

NVIDIA will announce its earnings report next week. Its AI GPUs have been in strong demand this year, driving a surge in the company's results; the market expects revenue for the current fiscal quarter to grow about 170% year over year.