Jensen Huang: Demand for Blackwell is running so hot it's frustrating customers; NVIDIA's stock surged more than 6.5%
NVIDIA CEO Jensen Huang said that supply of NVIDIA's Blackwell AI chips is growing more slowly than demand, which has left some customers frustrated. He also hinted that, if necessary, NVIDIA could reduce its reliance on TSMC and turn to other chip manufacturers. In addition, reports suggest the U.S. government is considering allowing NVIDIA to export advanced chips to Saudi Arabia.
Speaking on Wednesday, Jensen Huang, CEO of AI-boom leader NVIDIA, said the company's products have become the most sought-after commodities in the tech industry, with customers competing for limited supply. Constrained growth in Blackwell supply in particular has frustrated some customers. He also hinted that, if necessary, NVIDIA could reduce its reliance on TSMC and turn to other chip manufacturers.
Speaking at a technology conference hosted by Goldman Sachs in San Francisco, he told the audience:
"Demand for our products is so high that everyone wants to be first to get them and to get the biggest share. Today we may have more emotional customers, and that's understandable. The relationship is tense, but we are doing our best."
Huang told the audience that the company's latest generation of AI chip, Blackwell, is seeing strong demand. NVIDIA outsources Blackwell's production, and he said its suppliers are doing their best to keep up with demand and are making progress.
However, most of NVIDIA's revenue relies on a handful of customers, such as data center operators Microsoft and Meta Platforms. When asked whether huge AI spending delivers investment returns to customers, Huang said companies have no choice but to embrace "accelerated computing." He explained that NVIDIA's technology not only accelerates traditional workloads like data processing but also handles AI tasks that older technologies cannot.
Huang also noted that NVIDIA relies heavily on TSMC for chip production because TSMC is far ahead of the rest of the chip manufacturing industry. But he added that NVIDIA developed most of the relevant technology in-house, which would allow the company to shift orders to other suppliers, though such a change could reduce the quality of its chips.
"TSMC's agility and their ability to respond to our demands are truly incredible. So we chose them because they are excellent, but if necessary, of course, we can also turn to other suppliers."
In addition, reports suggest that the US government is considering allowing NVIDIA to export advanced chips to Saudi Arabia, which may help the country train and operate the most powerful AI models. Some individuals working for the Saudi Data and AI Authority said that Saudi Arabia is working hard to comply with US security requirements to expedite the process of obtaining these chips.
After the interview, NVIDIA's stock reversed an intraday decline to rise more than 6.5% to $115.18, also helping the Nasdaq swing from a 1.6% intraday loss to a 1.46% gain. NVIDIA's stock has more than doubled this year, after rising 239% in 2023.
Here are excerpts from Jensen Huang's interview:
1. First, talk about some of the ideas you had when you founded the company 31 years ago. Since then, you have transformed it from a GPU company focused on gaming into one that provides a wide range of hardware and software for the data center industry.
Jensen Huang: I would say that one thing we got right was foreseeing that there would one day be another form of computing, one that could augment general-purpose computing and solve problems that general-purpose tools could never solve. That processor would start out doing things that are extremely hard for CPUs, such as computer graphics.
But we would gradually expand into other areas. The first we chose, of course, was image processing, which is complementary to computer graphics. We then expanded into physical simulation, because in video games, the field we had chosen, you want the world to be not only beautiful but also dynamic, capable of behaving like a virtual world. Step by step, we expanded and brought the technology into scientific computing. One of the first applications was molecular dynamics simulation; another was seismic processing, which is essentially inverse physics. Seismic processing is very similar to CT reconstruction, another form of inverse physics. So we solved problems step by step, expanding into adjacent industries as we went.
The core idea we have always adhered to is that accelerating computation can solve interesting problems. Our architecture remains consistent, meaning software developed today can run on the large installed base you leave behind, and software developed in the past can be accelerated by new technologies. This mindset about architectural compatibility, creating a large installed base, and co-evolving with the ecosystem started in 1993 and has continued to this day. This is why NVIDIA's CUDA has such a large installed base, because we have always protected it. Protecting the investment of software developers has been our company's top priority from the beginning to the end.
Looking ahead, some of the problems we have solved along the way, including learning how to become a founder, how to become a CEO, how to run a business, how to build a company, these are all new skills. It's a bit like inventing the modern computer gaming industry. People may not know, but NVIDIA has the largest installed base of video game architecture in the world. GeForce has about 300 million players and is still growing rapidly, very active. So I think every time we enter a new market, we need to learn new algorithms, market dynamics, and create new ecosystems.
The reason we need to do this is that unlike general-purpose computers, once a general-purpose computer is built, everything will eventually run on it. But we are accelerators, which means you need to ask yourself, what do you want to accelerate? There is no such thing as a universal accelerator.
2. Can you discuss the differences between general-purpose and accelerated computing in more depth?
Jensen Huang: If you look at software today, the software you write contains a lot of file input and output, parts that set up data structures, and some magical algorithm kernels. These algorithms differ depending on whether they are for computer graphics, image processing, or something else; they could be in the fluid, particle, inverse physics, or imaging domain. So these algorithms are all different. If you create a processor that excels at these algorithms and complements the CPU by handling the tasks it excels at, then in theory you can dramatically accelerate an application. The reason is that typically 5% to 10% of the code accounts for 99.99% of the running time.
Therefore, if you offload that 5% of the code onto our accelerator, technically you can speed up the application by 100 times. This is not uncommon; we often accelerate image processing by 500 times. Now we are doing data processing as well. Data processing is one of my favorite applications, because almost everything related to machine learning revolves around data. It can be SQL data processing, Spark-style data processing, or vector database processing, handling unstructured or structured data: all of these are data frames.
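The "5% of the code, 99.99% of the runtime" observation is an instance of Amdahl's law. A minimal sketch of the arithmetic (the numbers are the illustrative figures from the interview, not measured values):

```python
def amdahl_speedup(hot_fraction, acceleration):
    """Overall speedup when `hot_fraction` of the total runtime is
    sped up by `acceleration`x and the rest stays on the CPU
    (Amdahl's law)."""
    serial = 1.0 - hot_fraction
    return 1.0 / (serial + hot_fraction / acceleration)

# If 99.99% of runtime sits in the hot kernels and the accelerator
# runs them 100x faster, the whole application gets roughly 99x faster.
print(round(amdahl_speedup(0.9999, 100)))  # → 99
```

Note that the un-accelerated 0.01% caps the achievable gain: even with an infinitely fast accelerator, the same formula tops out at 10,000x.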
We accelerate all of these dramatically, but to do so you need to create a top-level library. In computer graphics we were fortunate to have Silicon Graphics' OpenGL and Microsoft's DirectX, but outside graphics no such libraries really existed. So, for example, one of our best-known libraries is analogous to SQL: just as SQL is a library for storage computing, we created the world's first library for neural network computing.
That is cuDNN (a library for neural network computation); we also have cuOpt (combinatorial optimization), cuQuantum (quantum simulation and emulation), and many others, such as cuDF for data frame processing, with SQL-like functionality. All of these libraries had to be invented; they rearrange the algorithms in an application so that it runs on our accelerators. If you use these libraries, you can get 100x acceleration and more, which is amazing.
So the concept is simple and makes a lot of sense, but the problem is: how do you invent these algorithms and get the video game industry to use them; write these algorithms and get the entire seismic processing and energy industry to use them; write new algorithms and get the entire AI industry to use them? Do you see what I mean? For every one of these libraries, we first had to do the computer science research, and then go through the process of developing an ecosystem.
We have to convince everyone to use the libraries, and then consider which kinds of computers they run on, each of which is different. So we entered one field after another, step by step. We created a very rich library for autonomous vehicles, an outstanding library for robotics, incredible libraries for virtual screening, whether physics-based or neural-network-based, and an amazing library for climate technology.
So we have to go make friends and create markets. It turns out that what NVIDIA is really good at is creating new markets. We have been doing it for so long that NVIDIA's accelerated computing now seems to be everywhere, but we really had to build it step by step, developing markets one industry at a time.
3. Many investors here are very focused on the data center market. Can you share your views on the medium- to long-term opportunity? Obviously, your industry is driving what you call the "next industrial revolution."
Jensen Huang: There are two things happening simultaneously in the data center market; they are often conflated, but it helps to discuss them separately. First, assume a world without AI. Even without AI, general-purpose computing has already stagnated. We all know that certain principles of semiconductor physics, such as Moore's Law and Dennard scaling, have come to an end. We no longer see CPU performance doubling every year; we are fortunate to see it double within a decade. Moore's Law used to mean a tenfold increase in performance every five years, and a hundredfold every ten.
However, that era has ended, so we must accelerate everything that can be accelerated. If you are doing SQL processing, accelerate it; if you are processing any data, accelerate it; if you are building an internet company with a recommendation system, it must be accelerated. The largest recommendation engines today are all accelerated. A few years ago these still ran on CPUs, but now they are all accelerated. So the first dynamic is that the world's trillion dollars' worth of general-purpose data centers will be modernized into accelerated computing data centers. This is inevitable.
Furthermore, because of the dramatic cost reduction brought by NVIDIA's accelerated computing, computing power has grown not a hundredfold over the past decade, but a millionfold. So the question is: if your plane could fly a hundred thousand times faster, what would you do differently?
So people suddenly realized: "Why don't we let the computer write the software, instead of imagining the features or designing the algorithms ourselves?" We just give the computer all the data, all the data we want predictions from, and let it find the algorithm: this is machine learning, generative AI. We have applied it at scale across many different data domains, where the computer not only knows how to process the data but also understands its meaning. And because it understands multiple data modalities simultaneously, it can translate between them.
Therefore, we can convert from English to images, from images to English, from English to proteins, from proteins to chemicals. Because it understands all the data, it can perform all these translation processes, which we call generative AI. It can convert a large amount of text into a small amount of text, or expand a small amount of text into a large amount of text, and so on. We are now in the era of this computer revolution.
What is astonishing now is that the first batch of data centers worth trillions of dollars will be accelerated, and we have also invented this new type of software called generative AI. Generative AI is not just a tool, it is a skill. It is for this reason that new industries are being created.
Why is that? If you look at the IT industry up to now, we have been making tools and instruments for people to use. For the first time, we are creating skills that augment human capabilities. That is why people believe AI will expand beyond the trillion-dollar data center and IT industry into the world of skills. So, what is a skill? Digital currency is a skill; autonomous vehicles are a skill; so are digital assembly-line workers, robots, digital customer service, chatbots, and the digital employees who plan NVIDIA's supply chain. It could be a digital agent for SAP. Our company uses ServiceNow extensively, and now we have digital employee services. So we now have these digital humans, and that is the AI wave we are in.
4. There is an ongoing debate in the financial markets about whether the investment return is sufficient as we continue to build AI infrastructure. How do you evaluate the return on investment that customers are getting in this cycle? When you look back at history, looking at PC and cloud computing, how did the returns compare in similar adoption cycles? What are the differences compared to now?
Jensen Huang: That's a very good question. Let's take a look. Before cloud computing, the biggest trend was virtualization, if everyone remembers. Virtualization basically meant we virtualized all the hardware in a data center into a virtual data center, and could then move workloads across data centers without being tied to specific machines. The result was that data center costs fell by a factor of two to two and a half, almost overnight.
Next, we put these virtual machines into the cloud, and as a result, not just one company, but many companies could share the same resources, costs decreased again, and utilization increased again.
All the progress over the years has masked the underlying fundamental change, which is the end of Moore's Law. We gained a doubling, or more, in cost reduction from increased utilization, but we hit the limits of transistors and CPU performance.
Then all that gain in utilization reached its limit, which is why we now see data center and compute inflation. So the first thing happening now is accelerated computing. When you are processing data, for example with Spark, one of the most widely used data processing engines in the world today, accelerating it with NVIDIA accelerators can give you a 20x speedup. That translates into roughly a 10x cost saving.
Of course, your compute costs will go up a bit because you have to pay for NVIDIA GPUs, compute costs might double, but you reduce computation time by 20x. So, you end up saving 10x in costs. And such return on investment is not uncommon for accelerated computing. So, I recommend accelerating any work that can be accelerated, and then use GPUs for acceleration to immediately get a return on investment.
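The arithmetic behind that claim can be written out directly. A quick sketch using the figures Huang cites (20x speedup, roughly 2x hourly cost); these are his illustrative numbers, not measured benchmarks:

```python
def accelerated_cost_ratio(speedup, rate_multiplier):
    """Cost of an accelerated job relative to the CPU baseline.

    Runtime shrinks by `speedup`; the hourly rate rises by
    `rate_multiplier` (e.g. paying ~2x for GPU instances)."""
    return rate_multiplier / speedup

# Spark example from the interview: 20x faster at roughly 2x the rate.
ratio = accelerated_cost_ratio(speedup=20, rate_multiplier=2)
print(ratio)  # → 0.1, i.e. about a 10x cost saving
```

The point of the formula is that the rate increase and the time reduction trade off directly: acceleration pays for itself whenever the speedup exceeds the price premium.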
In addition, the generative AI being discussed today is the first wave of AI, in which infrastructure players (like ourselves and all the cloud service providers) put infrastructure in the cloud so developers can use these machines to train models, fine-tune models, protect models, and so on. With demand so high, for every $1 spent here, a cloud service provider can earn $5 in rental revenue, and this is happening all over the world; everything is in short supply. So demand of this kind is enormous.
We have also seen applications, including well-known ones such as OpenAI's ChatGPT and GitHub Copilot, as well as the copilots we use internally, that have increased productivity incredibly. Every software engineer in our company now uses a copilot, whether the one we built for CUDA, for USD (another language we use internally), or for Verilog, C, and C++.
So, I believe the days when every line of code was written by software engineers have come to an end. In the future, every software engineer will have a digital engineer accompanying them, assisting them 24/7. That's the future. Therefore, when I look at NVIDIA, we have 32,000 employees, but around them will be many more digital engineers, possibly 100 times more.
5. Many industries are embracing these changes. Which use cases and industries are you most excited about?
Jensen Huang: In our own company, we use AI for computer graphics. We can no longer do computer graphics without AI. We compute one pixel and infer the other 32. In other words, to a certain extent we "imagine" the other 32 pixels, and they are visually stable and photorealistic, with excellent image quality and performance.
Calculating one pixel requires a lot of energy, while inferring the other 32 pixels requires very little energy and can be done very quickly. Therefore, AI is not just about training models, that's just the first step. More importantly, it's about how to use the models. When you use the models, you save a lot of energy and time.
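A back-of-the-envelope sketch of that energy argument. Only the 1-rendered-to-32-inferred ratio comes from the interview; the relative inference cost below is an assumed placeholder purely for illustration:

```python
RENDERED = 1        # pixels fully rendered (ratio from the interview)
INFERRED = 32       # pixels inferred by the neural network
INFER_COST = 0.02   # assumed energy of inferring a pixel, relative to
                    # rendering one (illustrative, not an NVIDIA figure)

total_pixels = RENDERED + INFERRED
brute_force_energy = total_pixels * 1.0                    # render all 33
inferred_energy = RENDERED * 1.0 + INFERRED * INFER_COST   # render 1, infer 32

print(round(brute_force_energy / inferred_energy, 1))  # → 20.1
```

Under this assumption the frame costs about a twentieth of the brute-force energy; the qualitative point (cheap inference amortizes one expensive render across many pixels) holds for any inference cost well below the rendering cost.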
Without AI, we could not serve the autonomous driving industry; without AI, our work in robotics and digital biology would also be impossible. Almost every tech-bio company now builds on NVIDIA, using our data processing tools to generate new proteins; small-molecule generation, virtual screening, and other fields will likewise be completely reshaped by AI.
6. Let's talk about competition and your competitive barriers. Currently, many public and private companies hope to break your leadership position. How do you view your competitive barriers?
Jensen Huang: First of all, I think several things make us unique. The first thing to remember is that AI is not just about a chip; AI is about an entire infrastructure. Today's computing is not a matter of making a chip that people buy and put into a computer. That model belongs to the 1990s. Today's computers are developed as supercomputing clusters, as infrastructure, as supercomputers: not as a single chip, and not even as a single computer.
So we are actually building entire data centers. If you look at one of our supercomputing clusters, you will find that the software required to manage the system is very complex. There is no "Microsoft Windows" you can just use for these systems. We developed custom software for these superclusters, so the company that designs the chips, builds the supercomputers, and develops this complex software is naturally the same company, which ensures optimization, performance, and efficiency.
Furthermore, AI is essentially an algorithm. We are very good at understanding how algorithms work and how the computing stack distributes calculations, as well as how to run on millions of processors for days while maintaining the stability, energy efficiency, and ability to complete tasks quickly. We excel in this area.
Lastly, the key to AI computing is the installed base. Having a unified architecture across all cloud computing platforms and on-premise deployments is crucial. Whether you are building a supercomputing cluster in the cloud or running AI models on a device, the same architecture should be in place to run all the same software. This is what we call the installed base. And this architectural consistency since 1993 is one of the key reasons we have achieved what we have today.
Therefore, if you were to start an AI company today, the most obvious choice would be to use NVIDIA's architecture, as we are already present on all cloud platforms. No matter which device you choose, as long as it bears the NVIDIA logo, you can run the same software directly.
7. Blackwell trains 4 times faster and infers 30 times faster than its predecessor Hopper. With such a fast pace of innovation, can you maintain this rhythm? Can your partners keep up with your pace of innovation?
Jensen Huang: Our fundamental innovation approach is to ensure that we continuously drive architectural innovation. The innovation cycle for each chip is about two years, at best. We also do mid-term upgrades every year, but the overall architectural innovation happens approximately once every two years, which is already very fast.
We have seven different chips that work together in the entire system. We can introduce a new AI supercomputing cluster every year that is more powerful than the previous generation. This is because we have multiple parts that can be optimized. Therefore, we can deliver higher performance very quickly, and these performance improvements directly translate into a decrease in total cost of ownership (TCO).
Blackwell's performance improvement means that a customer with 1 gigawatt of power available can generate 3 times the revenue. Performance translates directly into throughput, and throughput into revenue.
Therefore, the return on this performance improvement is unmatched; no reduction in chip price could make up for a 3x gap in revenue.
8. How do you view the dependence on the Asian supply chain?
Jensen Huang: The Asian supply chain is complex and highly interconnected. An NVIDIA GPU is not just a chip; it is a complex system built from thousands of components, much like an electric car. The supply chain network in Asia is therefore extensive and intricate. We strive to design diversity and redundancy into every link, so that even if problems arise, we can quickly shift production elsewhere. Overall, even in the event of supply chain disruptions, we have the ability to adjust and ensure continuity of supply. We currently manufacture at TSMC because it is the best in the world, not just slightly better but much better. We have a long history of working with them, and their flexibility and ability to scale are impressive.
Last year, our revenue saw a significant increase, thanks to the rapid response of the supply chain. TSMC's agility and their ability to meet our needs are remarkable. In less than a year, we have greatly increased our production capacity, and we will continue to expand next year and further expand the year after. Therefore, their agility and capabilities are excellent. However, if needed, we can certainly turn to other suppliers.
9. Your company is in a very favorable market position. We have discussed many very good topics. What are you most worried about?
Jensen Huang: Our company currently collaborates with every AI company globally and every data center. I don't know of any cloud service provider or computer manufacturer that we do not collaborate with. Therefore, with such scale expansion, we bear great responsibility. Our customers are very emotional because our products directly impact their revenue and competitiveness. The demand is high, and the pressure to meet these demands is also high.
We are currently in full production of Blackwell and plan to start shipping in the fourth quarter and further expand. The demand is so high that everyone wants to get the product as soon as possible to secure the largest share. This tension and intense atmosphere are unprecedented.
While it is very exciting to create the next generation of computing technology and see innovation across so many applications, we bear great responsibility and feel a lot of pressure. But we are doing our best. We have adapted to this intensity and will continue to work hard.