Wallstreetcn
2024.10.14 11:55

Jensen Huang's latest 10,000-word interview: AGI is coming, AI will completely change productivity

The machine-learning flywheel is what matters most; powerful GPUs alone do not guarantee a company's success in AI

On October 4th, NVIDIA CEO Jensen Huang appeared on the interview program Bg2 Pod with hosts Brad Gerstner and Clark Tang for an extensive conversation.

They mainly discussed topics such as how to scale intelligence toward AGI, NVIDIA's competitive advantages, the importance of inference and training, future market dynamics in the AI field, the impact of AI on various industries, Elon Musk's Memphis supercluster and xAI, OpenAI, and more.

Huang emphasized the rapid evolution of AI technology, especially breakthroughs on the path to Artificial General Intelligence (AGI). He stated that AGI assistants are about to appear in some form and will become more sophisticated over time.

Huang also shared NVIDIA's leadership position in the computing revolution, pointing out that by reducing computing costs and innovating hardware architecture, NVIDIA has a significant advantage in driving machine learning and AI applications. He specifically mentioned NVIDIA's "moat," the ecosystem of software and hardware accumulated over a decade, making it difficult for competitors to surpass through a single chip improvement.

Furthermore, Huang praised xAI and the Musk team for completing the construction of the Memphis supercluster, with its 100,000 GPUs, in just 19 days, calling it an "unprecedented" achievement. This cluster is undoubtedly one of the fastest supercomputers in the world and will play a crucial role in AI training and inference workloads.

Regarding the impact of AI on productivity, Huang optimistically stated that AI will greatly enhance efficiency for businesses, bring more growth opportunities, and not lead to widespread unemployment. At the same time, he called for the industry to strengthen its focus on AI security to ensure that technology development and usage benefit society.

The key points of the entire article are summarized as follows:

  • (AGI assistants) will soon appear in some form... initially very useful but not perfect. Then over time, it will become increasingly perfect.
  • We have reduced the marginal cost of computing by 100,000 times in 10 years. Our entire stack is growing, the entire stack is innovating.
  • People think the reason for designing a better chip is that it has more FLOPS, more bits and bytes... But machine learning is not just software; it's about the entire data pipeline.
  • The flywheel of machine learning is the most important. You have to consider how to make this flywheel faster.
  • Having powerful GPUs alone does not guarantee a company's success in the AI field.
  • Musk's understanding of engineering and construction of large systems and resource allocation is unique... 100,000 GPUs as a cluster... completed in 19 days.
  • AI will not change every job, but it will have a significant impact on how people work. When companies use AI to increase productivity, it usually results in better returns or growth.

Evolution of AGI and AI Assistants

Brad Gerstner: This year's theme is scaling intelligence to AGI. When we did this two years ago, we were doing it in the AI era, two months before ChatGPT. Considering everything that has changed since, it's truly incredible. So I think we can start with a thought experiment and a prediction.

If, colloquially, I imagine AGI as a personal assistant in my pocket — that conversational assistant I'm already used to, one that knows everything about me, has a perfect memory of me, can communicate with me, and can book hotels or make doctor's appointments for me — then, given the speed of change in the world today, when do you think we will have such a personal assistant?

Jensen Huang:

It will soon appear in some form, and over time this assistant will get better and better. That's the wonderful thing about technology as we know it. So I think initially it will be very useful, but not perfect. Then over time it will become more and more perfect, just like all technologies.

Brad Gerstner:

When we look at the speed of change — I think Musk said the only thing that really matters is the speed of change — we do feel that the pace of change has accelerated dramatically. This is the fastest pace of change we have seen on these questions, and we have been working in and around AI for ten years, or even longer. Is this the fastest pace of change you have seen in your career?

Jensen Huang:

This is because we have reinvented computing. Much of this happened because we reduced the marginal cost of computing by a factor of 100,000 in 10 years; Moore's Law would have delivered about 100x. We achieved this in multiple ways. First, we introduced accelerated computing, moving work that is inefficient on CPUs onto GPUs. We achieved it by inventing new numerical precisions. We achieved it through new architectures, inventing the Tensor Core, building NVLink into systems with very fast memory, and scaling up and out with NVLink, working across the entire stack. Basically, everything I described about how NVIDIA works has produced a pace of innovation beyond Moore's Law.

What is truly amazing is that since then we have shifted from hand-coded programming to machine learning, and the magic of machine learning is that it learns very fast. That has been proven. So when we redefined how computation is distributed, we did all kinds of parallelism: tensor parallelism, various kinds of pipeline parallelism. We got good at inventing new algorithms and training methods on top of that, and all of these inventions compound on one another.

Looking back at how the Moore's Law era worked, software was static: pre-compiled, shrink-wrapped, placed in a store. It stayed static while the hardware underneath grew at Moore's-Law speed. Now our entire stack is growing; the entire stack is innovating. So I think we are suddenly seeing scaling.

This is certainly extraordinary. But what we used to talk about was pre-trained models and scaling at that level: how we doubled the model size and, accordingly, doubled the data size, so the required computing power doubled every year. That was a big deal. But now we see post-training scaling, and we see inference scaling. People used to think pre-training was hard and inference was easy; now everything is hard. That makes sense — the idea that all human thinking is one-shot was always a bit absurd. There must be fast thinking, slow thinking, reasoning, reflection, iteration, and simulation. And now it's emerging.

NVIDIA's Competitive Moat

Clark Tang:

I think one of the most misunderstood things about NVIDIA is how deep the NVIDIA moat really is. There is a notion that if someone invents a better chip, they win. But the truth is that it took ten years to build the complete stack, from GPU to CPU to networking, and especially the software and libraries that let applications run on NVIDIA. So when you talk about this, do you think NVIDIA's moat is bigger or smaller today than it was three or four years ago?

Jensen Huang:

Well, I appreciate you recognizing how computing is changing. In fact, people think — and many still do — that the way to design a better chip is to give it more FLOPS, more bits and bytes. Do you understand what I mean? You see their keynote slides, with all these FLOPS and bar charts and things like that. These are all good. I mean, look, horsepower is important. Yes, these things fundamentally matter.

However, unfortunately, that is old thinking. It applies to a world where software is some application running on Windows and software is static, right? Static software means the best way to improve the system is to build faster and faster chips. But we realized that machine learning is not human programming. Machine learning is not just software; it's about the entire data pipeline. In fact, the flywheel of machine learning is what matters most. So how do you think about enabling this flywheel — on one hand, enabling data scientists and researchers to work efficiently within it — a flywheel that starts at the very beginning of the pipeline? Many people don't even realize that it takes AI to curate the data that teaches an AI, and that that AI is itself quite complex.

Brad Gerstner:

Is AI itself improving? Is it also accelerating? Again, when we think about competitive advantage, yes, that's right. It's a combination of all these.

Jensen Huang:

Exactly — it's smarter AI curating the data that produces this. We now even have synthetic data generation and all sorts of different ways of presenting data. So before you even start training, you are already doing enormous amounts of data processing. People might think: oh, PyTorch — that is the beginning of the world and the end of the world. And it is very important.

But don't forget that the flywheel extends before and after PyTorch. The significance of the flywheel is how you must think about the whole thing: how do I design a computing system, a computing architecture, that helps you exploit this flywheel and makes it as efficient as possible? It is not about the speed of one application's training — does that make sense? That is just one step. Every step around the flywheel is difficult. So the first thing you should do is not ask how to make Excel faster or how to make Doom faster; that's the past, isn't it? Now you have to ask: how do I make this flywheel faster? The flywheel has many different steps, and machine learning is not easy, as you all know.

What the OpenAI, X, and Gemini teams are doing is not easy; the problems they are thinking through are very hard. So we decided: look, this is what you should think about — the entire process — and you want to accelerate every part of it. You have to respect Amdahl's Law: if a step is 30% of the total time and I speed it up three times, I haven't really accelerated the whole process. Does that make sense? You really want to build a system that accelerates every step, because only by doing the whole thing can you truly improve the cycle time — the flywheel, which is the learning rate — which ultimately compounds into exponential growth.
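The rule invoked here is Amdahl's Law: the speedup of the whole pipeline is capped by the share of time the accelerated step occupies. A minimal sketch of that arithmetic:

```python
# Amdahl's law: overall speedup when only a fraction of a pipeline is
# accelerated -- the arithmetic behind the 30%/3x example above.

def overall_speedup(accelerated_fraction: float, local_speedup: float) -> float:
    """Whole-pipeline speedup when `accelerated_fraction` of total time
    is sped up by `local_speedup`."""
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / local_speedup)

# Speeding up a step that is 30% of total time by 3x barely helps:
print(overall_speedup(0.30, 3.0))   # ~1.25x overall
# Speeding up *every* step by 3x is what actually moves the flywheel:
print(overall_speedup(1.00, 3.0))   # 3.0x overall
```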

So what I want to say is that our view of what a company really does shows up in the product. Note that I keep talking about this flywheel — the entire cycle. Yes, that's right: we accelerate everything.

Now, a major focus is video. Many people are working on physical AI and video processing. Imagine the front end: terabytes of data entering the system every second. A pipeline has to ingest all of that data and first prepare it for training. Yes — and that whole process has to be accelerated.

Clark Tang:

Today people mostly think about text models. Yes — but in the future there are these video models, and text models like o1 will be used to process huge amounts of data before we even get there.

Jensen Huang:

Yes. Language models will be involved in everything. This industry has poured tremendous technology and effort into training these large language models, and now we use large language models at every step. This is amazing.

Brad Gerstner:

What I hear you saying is that in a composite system, the advantage grows over time — that your advantage today is greater than it was three or four years ago, because you are improving every component, and it compounds. When you think about, say, Intel as a business case study — it had a dominant moat, a dominant position in the stack — perhaps, to simplify, compare your competitive advantage to theirs at the peak of their cycle.

Jensen Huang:

What set Intel apart is that they were perhaps the first company to excel at manufacturing-process engineering and manufacturing — that is, making chips. Designing chips, building them on the x86 architecture, making ever-faster x86 chips: that was their talent, and they fused it with manufacturing.

Our company is a bit different. We realized that, in fact, parallel processing does not require every transistor to be excellent; serial processing does. Parallel processing wants a huge number of transistors that are cost-effective. I'd rather have 10 times more transistors at 20% lower speed.

We also invest enormous effort in constantly reinventing new algorithms, so that when the time comes, the Hopper architecture is two, three, or four times better than when it was purchased, and the infrastructure stays genuinely effective. All the work we do on new algorithms and frameworks benefits every part of our installed base: it helps Hopper, it helps Ampere, it even helps Volta.

I just heard from Sam Altman that they recently decommissioned OpenAI's Volta infrastructure. So we leave behind this installed base, and, as with all computing, the installed base is what matters. NVIDIA is in every cloud, on-premises, and at the edge.

The VILA visual language models are created in the cloud and run perfectly at the edge, on robots, without modification. They are all architecturally compatible. That compatibility matters for large systems just as it does for iPhones and other devices, and I think the installed base is very important for inference.

But what has really benefited us is that, having worked hard to train these large language models on the new architectures, we can think about how to create architectures that will excel at inference when the time comes. So we have been thinking about reasoning models and how to create a very interactive inference experience for them — for your personal agent. You don't want it to go off and think for a while after you finish speaking; you want it to interact with you quickly. So how do we create something like that?

We designed NVLink so that these systems are excellent for training, but when you're done, the inference performance is also excellent. You want to optimize time to first token, and time to first token is actually extremely hard to achieve, because it requires a lot of bandwidth. And if your context is rich, you also need a lot of FLOPS. So you need an unbounded amount of bandwidth and an unbounded amount of FLOPS at the same time to deliver a response in a few milliseconds. That architecture is really difficult to build, and we invented the great Blackwell NVLink for it.
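To see why time to first token stresses both bandwidth and FLOPS, here is a hedged back-of-envelope. All the figures (model size, context length, hardware numbers) are illustrative assumptions, not numbers from the interview:

```python
# Back-of-envelope: time to first token needs memory bandwidth (reading
# the weights) AND compute (prefilling the context). Numbers assumed.

params = 70e9            # model parameters (assumed 70B)
bytes_per_param = 2      # fp16/bf16 weights
prompt_tokens = 8_000    # a "rich context" prompt (assumed)

hbm_bandwidth = 8e12     # aggregate memory bandwidth, bytes/s (assumed)
flops = 1e15             # aggregate sustained FLOP/s (assumed)

# Producing the first token requires at least one full pass over the
# weights: memory-bandwidth bound.
weight_read_s = params * bytes_per_param / hbm_bandwidth

# Prefill over the prompt costs roughly 2 * params FLOPs per token:
# compute bound, so rich context demands lots of FLOPS.
prefill_s = 2 * params * prompt_tokens / flops

print(f"weight read ~{weight_read_s * 1e3:.1f} ms, "
      f"prefill ~{prefill_s * 1e3:.0f} ms")
```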

Brad Gerstner:

Earlier this week I had dinner with Andy Jassy (President and CEO of Amazon), and Andy said: we have Trainium and Inferentia coming. Most people see these, once again, as a problem for NVIDIA. But then he said NVIDIA is an important partner of ours and will continue to be. As far as I can see, the future world runs on NVIDIA.

So when you think about the custom ASICs being built for targeted applications — maybe Meta's inference accelerator, maybe Amazon's Trainium, or Google's TPU — and then you think about the supply shortages you face today, do these factors change this dynamic? Or do they complement the systems they purchase from you?

Jensen Huang:

We are just doing different things. Yes, we are trying to accomplish different things. NVIDIA is trying to build a computing platform for this new world: this machine-learning world, this generative-AI world, this agentic-AI world. We are trying to create something so profound in computing that, after 60 years of development, the entire computing stack has been reinvented — from programming to machine learning, from CPUs to GPUs, from software to AI, from software tools to AI tools. Every layer of the computing and technology stack has changed.

What we want to do is create a ubiquitous computing platform. This is indeed the complexity of our work, the complexity of what we are doing, because if you think carefully about what we are doing, you will find that we are building the entire AI infrastructure, we see it as a computer. I have said before, the data center is now the unit of computation. For me, when I think of a computer, I am not thinking about chips. I am thinking about this thing. This is my mental model, with all the software, all the orchestration, all the machines inside, it is my mission. This is my computer.

Every year, we try to build a new one. Yes, this is crazy; no one has ever done this before. Every year we try to build a completely new one, and every year we deliver two to three times the performance. So every year we cut cost by two to three times, and every year we improve energy efficiency by two to three times. So we ask customers not to buy everything at once, but to buy a little each year, right? The reason is that we want their costs to be averaged down over time. And everything is architecturally compatible, so building these things separately at the speed we move would be very difficult.
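A minimal sketch of that cost-averaging argument, assuming performance per dollar improves 2.5x per year (a point inside the 2-3x range cited above); the budget figures are arbitrary:

```python
# "Buy a little every year": if perf-per-dollar compounds each
# generation, spreading a fixed budget across generations buys far
# more total compute than spending it all on year-0 hardware.
# The 2.5x annual factor is an assumption, not an interview figure.

budget_per_year = 100.0          # arbitrary capex units per year
perf_per_dollar = 1.0            # year-0 baseline
total_perf_staggered = 0.0
for year in range(4):
    total_perf_staggered += budget_per_year * perf_per_dollar
    perf_per_dollar *= 2.5       # next generation arrives

# Versus spending the whole 4-year budget up front on year-0 gear:
total_perf_upfront = 4 * budget_per_year * 1.0
print(total_perf_staggered / total_perf_upfront)   # ~6.3x more compute
```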

Now, the doubly hard part is that we take all of this and, instead of selling it as infrastructure or as a service, we disaggregate it. We integrate it into GCP, AWS, Azure, and X, and everyone's integration is different. We have to integrate all our architecture libraries, all our algorithms, and all our frameworks into their frameworks; we integrate our security systems into their systems; we integrate our networking into their systems, right? Then we basically do ten integrations, and we now do this every year. That is a miracle.

Brad Gerstner:

I mean, you try to do this every year — that's crazy. So what drives you to do this every year?

Jensen Huang:

Yes. When you break it down systematically, the more you break it down, the more amazed everyone is: the entire electronics ecosystem today is dedicated to working with us, ultimately building one computer integrated into all these different ecosystems, with everything coordinating so seamlessly. What we propagate backward are APIs, methodologies, business processes, and design rules; what we propagate forward are methodologies, architectures, and APIs.

Brad Gerstner:

That's just how they are.

Jensen Huang:

For decades they have been working hard, and they keep evolving along with our development. But these APIs have to be integrated together.

Clark Tang:

Some people just need to call the OpenAI API, and it works. That's it.

Jensen Huang:

Yes. Yes, it's a bit crazy. It is one whole thing. This is what we invented: this massive computing infrastructure that the entire planet is collaborating with us on. It integrates everywhere. You can sell it through Dell, you can sell it through HP. It's hosted in the cloud. It's everywhere. People are now using it in robotic systems — robots and humanoid robots — and in self-driving cars. They are all architecturally compatible. Quite crazy.

Brad Gerstner:

This is too crazy.

Jensen Huang:

I don't want you to leave with the impression that I didn't answer the question. In fact, I did. When we lay the foundation, what I mean is the way of thinking: we are just doing something different. Yes, as a company we want to understand the situation, and I am keenly aware of everything around the company and the ecosystem, right?

I know everyone is doing other things, what they are doing. Sometimes this is disadvantageous to us, sometimes not. I am very clear about this, but it doesn't change the company's goal. Yes, the company's sole goal is to build a ubiquitous platform architecture. That's our goal.

We won't try to take share from anyone. NVIDIA is a market maker, not a share taker. If you looked at our company's slides, you would find that this company never talks about market share — not even internally. What we talk about is how we create the next thing.

What is the next problem we can solve in this flywheel? How can we better serve people? How can we shorten a flywheel cycle that used to take about a year to about a month? How fast can it spin?

So we are thinking about all these different things. We are not masters of everything, but we are sure that our mission is singular. The only question is whether that mission is necessary. Does it make sense? Shouldn't all great companies put this at their core: what is it you are really doing?

Of course. The only questions are: is it necessary? Is it valuable? Is it impactful? Does it help people? I'm sure that if you are a developer, a generative-AI startup deciding how to build your company,

one choice you don't have to make is which ASIC to support. If you build on CUDA, you can go anywhere, and you can always change your mind later. We are the gateway to the AI world, aren't we?

Once you decide to join our platform, you can defer all the other decisions. You can always build your own chips later; we don't oppose that, and we won't be angry about it. When I work with GCP and Azure, we show them our roadmap years in advance. They don't show us their roadmaps, and that has never offended us. Does this make sense? If you have a singular goal, and your goal is meaningful, and your mission is precious to you and to others, then you can be transparent. Note that my roadmap is shown transparently at GTC; for our friends at Azure, AWS, and the other companies, the roadmap goes even deeper. We have no problem doing any of this, even as they build their own ASICs.

Brad Gerstner:

I think when people look at the business — you recently said that demand for Blackwell is insane, and that one of the hardest parts of the job is the emotional toll of saying "no" when the world is short of the compute you can produce and supply. But the critics say: wait a moment — this is like Cisco in 2000, when we overbuilt fiber. This will be a boom-and-bust cycle. I remember the dinner we had in January '23. At that dinner, the forecast for NVIDIA was that revenue would reach $26 billion in 2023. You did $60 billion.

Jensen Huang:

Let's just face the facts. This is the biggest forecasting failure in world history. Yes. At least we can admit that.

GPUs are playing an increasingly important role in AI computing

Brad Gerstner:

That's right. We were very excited in November '22, because we had people like Mustafa from Inflection and the folks from Character coming to our office to talk about investing in their companies. They said: if you can't invest in our company, buy NVIDIA, because everyone in the world is trying to get NVIDIA chips to build these world-changing applications. Of course, the Cambrian moment came with ChatGPT. Even so, the 25 analysts were so focused on the crypto winter that they couldn't imagine what was happening in the world. So it ended up much bigger. In very plain English: demand for Blackwell is insane, and it will continue for as far as you can see. Of course, the future is unknown and unknowable. But why are the critics so wrong in thinking this is overbuilding, the way Cisco overbuilt in 2000?

Jensen Huang:

The best way to think about the future is from first principles, right? Okay, so: what are the first principles of what we are doing? First, what are we doing? We are reinventing computing, aren't we? We just said the future of computing will be heavily machine-learned. Almost everything we do, almost every application — Word, Excel, PowerPoint, Photoshop, Premiere, AutoCAD, your favorite applications — was hand-engineered. I assure you that in the future they will be heavily machine-learned. Right? So all these tools will be like that, and most importantly, you will have agents — machines — helping you use them. Okay, so now we know this is a fact, right? We have reinvented computing, and we won't look back. The entire computing stack is being reinvented. Okay. Given that, we said the software will be different: what software can be written will be different, and the way we use software will be different. So let's acknowledge that. These are my ground truths now. Yes.

The question now is what happens next. Look at the computing of the past: there is $1 trillion of installed data centers. Just open the door and look at them. Are these the computers you want for the future? The answer is no. All those CPUs are sitting there, and we know what they can and cannot do. What we do know is that we have $1 trillion of data centers that need modernizing. So, as we speak, if we modernize these old systems over the next four to five years, that is not unreasonable.

So we have one trend: talk to the people who must modernize, and yes, they are modernizing on GPUs. That's it.

Let's run the test again. You have $50 billion of capital expenditure to deploy. Would you rather spend it on option A, building capex for the future, or option B, building capex as in the past?

You already have the capex of the past, right? It's sitting there, and it isn't improving much anyway; Moore's Law has basically ended. So why rebuild it?

Just take the $50 billion and put it into generative AI, right? Now your company gets better. How much of that $50 billion would you invest? Well, I would invest 100% of it, because I already have four years' worth of infrastructure from the past.

So now you are just — I'm just reasoning from their first-principles perspective — that's what they are doing. Smart people are doing smart things. Now, the second part: we have a trillion dollars' worth of capacity to go build.

Call it a trillion dollars' worth of infrastructure — it's about $150 billion — okay. So we have a trillion dollars' worth of infrastructure to build out over the next four to five years. And the second thing we observe is that the way software is written is different, and the way software is used is also different.
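The back-of-envelope behind this argument can be written down directly. A minimal sketch, using the round numbers from the conversation ($1 trillion installed base, a four-to-five-year window); the per-year figure is just the division, not a number from the interview:

```python
# First-principles arithmetic from this passage, as a sketch.
# The $1T installed base and 4-5 year window are the interview's round
# numbers; the per-year spend is simply their ratio (an assumption).

installed_base_usd = 1.0e12   # ~$1T of existing general-purpose data centers
modernize_years = 5           # modernized over the next four to five years

modernization_per_year = installed_base_usd / modernize_years
print(f"~${modernization_per_year / 1e9:.0f}B per year of modernization spend")

# On top of this sits a new layer ("AI factories") whose eventual size
# the interview leaves open, bounding it only loosely in the trillions.
```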

In the future, we will have agents. Our company will have digital employees. In your inbox, you see little icons of people's faces today; in the future, some of those will be icons of AIs, right? And I will send work to them.

I no longer program computers with C++; I program AI with prompts. Right? And this is no different from how I worked with my team this morning.

I wrote a lot of emails before coming here. Of course, I was prompting my team: I describe the background, describe the constraints I know of, describe their task. I leave enough space — I give enough direction so they understand what I need. I make it as clear as possible what the outcome should be, but I leave enough room for ambiguity, a little creative space, so they can surprise me. Right? This is no different from how I prompt an AI today. Yes, this is exactly how I prompt AI. So, on top of the modernized infrastructure, there will be a new infrastructure: AI factories that operate these digital humans, running around the clock.
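As a concrete illustration of "programming AI with prompts" in the structure just described — background, known constraints, the task, and room left for creativity — here is a hedged sketch; the wording and the `complete` call are hypothetical, not any specific API:

```python
# A sketch of prompting in the structure described above: background,
# constraints, task, and deliberate creative slack. Purely illustrative.

prompt = """Background: we are planning next quarter's developer event.

Constraints I know of:
- the budget is fixed at last year's level
- the keynote must fit in 90 minutes

Task: draft three candidate agendas for the keynote.

You have room to be creative about themes and ordering -- surprise me,
but the expected outcome is clear: three agendas I can compare."""

# `complete` stands in for whichever LLM API you use; hypothetical here.
# response = complete(prompt)
```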

We will provide these for companies all over the world. We will have them in factories and in autonomous systems, right? So there is a whole new layer of computing fabric. This whole layer I call the AI factory — something the world must build, but which simply does not exist today.

So the question is how big this is. Unknown for now — it could be tens of trillions of dollars. But the beautiful thing, as we sit here building it, is that the architecture for modernizing the data center and the architecture of the AI factory are the same. That's a good thing.

Brad Gerstner:

Let me get this straight. You have a trillion dollars of old infrastructure that has to be modernized, and at least a trillion dollars of new AI workloads coming. Your revenue this year will reach $125 billion. Someone once told you this company would never be worth more than $1 billion. Sitting here today: if you have only $125 billion of a multi-trillion-dollar TAM, is there any reason your revenue wouldn't be two or three times what it is now?

Jensen Huang:

As you know, it doesn't work that way for everyone: a company is limited by the size of its fish pond — a goldfish pond can only be so big. So the question is: what is our fish pond? That takes a lot of imagination, which is why market makers think about the future and create new fish ponds. It is hard to get big looking backward and trying to grab share. Yes, share takers can only get so big. Of course. Market makers can be very big. Of course.

So I think our company's good fortune is that from the very beginning, we had to create a market to swim in. People didn't realize it at the time, but now they do: we were at the starting point of creating the 3D gaming PC market. We basically invented that market, along with the entire ecosystem, the graphics-card ecosystem — we invented all of it. So inventing a new market in order to serve it later is something very natural for us.

Jensen Huang: I am happy for the success of OpenAI

Brad Gerstner:

As we all know, OpenAI raised $6.5 billion this week at a valuation of $150 billion. We all participated.

Jensen Huang:

Yes, I am really happy for them, really happy that they came together. Yes, they did a great thing, the team did a great job.

Brad Gerstner:

Reportedly, their revenue run rate this year will reach around $5 billion, and next year it could reach $10 billion. If you look at the business today, its revenue is about twice what Google's was at its IPO. They have 250 million weekly average users, which we estimate is twice what Google had at its IPO. And if you look at the multiple — if you believe they do $10 billion in revenue next year — it trades at about 15 times expected revenue, which is about the multiple Google and Meta traded at when they went public. Imagine: a company that had zero revenue and zero weekly average users 22 months ago.

Let's talk about the importance of OpenAI as a partner to you, and the power of OpenAI in driving public awareness and use of AI.

Jensen Huang:

Well, this is one of the most significant companies of our time, a pure AI company pursuing the AGI vision, whatever its definition. I hardly think the definition matters at all, and I don't think timing matters much either. One thing I do know is that AI will have a capability roadmap over time, and that roadmap will be spectacular. Along the way, well before it reaches anyone's definition of AGI, we will be putting it to full use.

All you have to do is, right now, as we speak, go talk to digital biologists, climate tech researchers, material researchers, physicists, astrophysicists, quantum chemists. You can go talk to video game designers, manufacturing engineers, robotics experts. Pick your favorite. Whatever industry you choose, you have to dive deep, talk to important people, ask them if AI has fundamentally changed the way they work. Collect those data points, and then ask yourself how skeptical you want to be. Because they are not talking about the conceptual advantages of AI. They are talking about using AI in the future. Now, agricultural tech, material tech, climate tech, you pick your tech, you pick your science. They are progressing. AI is helping them advance their work.

Now, as we said: every industry, every company, every level, every university. Unbelievable, right? Absolutely. It will change business in some way. We know that. I mean, we know it's that real.

It's happening today. So I think the ChatGPT awakening triggered it, which is absolutely incredible. And I love their pace and their singular goal of driving the field forward. It's really important.

Brad Gerstner:

They have built an economic engine that can fund the next frontier of models. I think a consensus is forming in Silicon Valley that the model layer is being commoditized, with Llama making it very cheap for many people to build models. So early on we had a lot of model companies — Character, Inflection, Cohere, and all the rest on the list.

Many people question whether these companies can reach escape velocity on the economic engine needed to keep funding the next generation. My own feeling is that this is why you see consolidation. OpenAI clearly has the velocity; they can fund their own future. I'm not sure many other companies can. Is that a fair assessment of the state of the model layer — that, as in many other markets, it will consolidate toward the market leaders who can afford it, who have the economic engine and the applications that let them keep investing?

Having a powerful GPU alone does not guarantee a company's success in the field of AI

Jensen Huang:

First of all, there is a fundamental difference between a model and AI. Yes. A model is an essential ingredient — necessary for AI, but not sufficient. AI is a capability, but for what? What is the application? The AI for self-driving cars is related to the AI for humanoid robots but not the same, and that in turn is related to the AI for chatbots but not the same.

So you must understand the taxonomy. Yes, the taxonomy of the stack. At each level of the stack, there will be opportunities, but not every level of the stack provides unlimited opportunities for everyone.

Now, take what I just said and replace the word "model" with "GPU". In fact, this was our company's great insight 32 years ago: there is a fundamental difference between a graphics chip — a GPU — and accelerated computing. And accelerated computing is different again from the work we do in AI infrastructure. They are related but not exactly the same; they are layered on top of one another, and each abstraction layer requires completely different skills.

People who are truly good at building GPUs do not necessarily know how to be an accelerated computing company. Many people make GPUs — we invented the GPU, but we are not the only company making GPUs today, right? GPUs are everywhere, yet those makers are not accelerated computing companies. Many build accelerators that speed up one application, and that is different from being an accelerated computing company. A very specialized AI accelerator, for example — that could be a very successful thing, right?

Brad Gerstner:

That's MTIA (Meta's next-generation AI accelerator chip).

Jensen Huang:

Yes. But it may not be the kind of company that brings impact and capability. So you have to decide what kind of person you want to be. All these different fields may have opportunities. But just like building a company, you have to pay attention to changes in the ecosystem and what will be commoditized over time, recognize what is a feature, what is a product, yes, what is a company. Okay. I just said, well, you can think about this issue in many different ways.

xAI and the Memphis supercomputer cluster have entered the era of "200,000 to 300,000 GPU clusters"

Brad Gerstner:

Of course, there is a newcomer who is rich, wise, and ambitious. That is xAI. Yes, right. And there are reports that you had dinner with Larry Ellison (Oracle's founder) and Musk. They convinced you to give up 100,000 H100 chips. They went to Memphis and built a large coherent supercluster in a few months.

Jensen Huang:

Those are three separate things — don't conflate them, okay? Yes, I had dinner with them.

Brad Gerstner:

Do you think they have the ability to build this supercluster? There are rumors they want another 100,000 H200s to expand it. First, tell us about X and their ambitions and achievements — but also, have we entered the era of 200,000-to-300,000-GPU clusters?

Jensen Huang:

The answer is yes. First of all, give credit where it's due: from the moment of conception, to the data center being ready for NVIDIA to install our equipment, to the moment we powered it up, connected everything, and ran the first training job.

Okay. So the first part is building a huge factory in that short a time — water-cooled, powered, permitted — I mean, it's like Superman. As far as I know, there is only one person in the world who could do this. Musk's understanding of engineering, of building large systems, and of marshaling resources is unique. Yes, it's truly incredible. Of course, his engineering team is also excellent — the software team is great, the network team is great, the infrastructure team is great. Musk understands this deeply.

From the moment we started planning with the engineering, networking, infrastructure-computing, and software teams, all the preparation was done in advance. Then all the infrastructure, all the logistics, the sheer amount of technology and equipment that arrived on the same day — the NVIDIA infrastructure, the computing infrastructure, and all the technology needed for training — everything was up and training in 19 days. What do you say to that? Done.

Step back and think about how long 19 days is — a couple of weeks, right? If you saw it with your own eyes, the amount of technology is incredible. All the wiring and networking — the networking of NVIDIA equipment is very different from that of hyperscale data centers; think about how many cables one node needs — the back of the computer is all cables. Integrating that mountain of technology and all the software together is incredible.

So I am very grateful for what Musk and the X team have done, and I appreciate his recognition of the engineering and planning we did together. But their achievement is unique, unprecedented. To put it in perspective: 100,000 GPUs as one cluster is easily the fastest supercomputer on Earth — and a supercomputer normally takes three years to plan, and then, after the equipment is delivered, another year to get it all running. We are talking about 19 days.

Clark Tang:

What credit does NVIDIA deserve?

Jensen Huang:

Everything is running smoothly. Yes, of course, there are a lot of X algorithms, X frameworks, X stacks, and so on. We say we have a lot of reverse integration to do, but the planning is excellent. Just pre-planning.

Large-scale distributed computing is an important direction for the future development of AI

Brad Gerstner: An N of one is right — Musk is an N of one. But when you answered the question, you said at the start: yes, 200,000-to-300,000-GPU clusters are here. Right. Can that scale to 500,000? Can it scale to a million? Does demand for your product depend on it scaling to two million?

Jensen Huang:

The answer to the last part is no. My feeling is that distributed training has to work. My feeling is that distributed computing will be invented — some form of federated learning, of asynchronous distributed computing, will be discovered.

I am very enthusiastic and optimistic about this. Of course, remember that the scaling law began with pre-training. Now we have moved to multimodality and to synthetic data generation; post-training has scaled incredibly — synthetic data generation, reward systems, reinforcement learning — and now inference-time scaling has arrived: a model doing an incredible 10,000 internal inference steps before it answers your question.

That may not be unreasonable. It may have done tree search. It may have done reinforcement learning on top of that. It may have run some simulations, certainly done a lot of reflection; it may have looked up data and checked information, right? So its context may be quite large. I mean, this kind of intelligence — well, that's what we do, right? So, on capability: this scaling — I just did the math, compounding model size and compute quadrupling every year.
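A minimal sketch of what inference-time scaling can look like in code: spend a budget of model calls on a single question via sampling, critic ranking, and one reflection pass. `generate` and `score` are hypothetical stand-ins (random toys here), not any real model API:

```python
# Inference-time scaling sketch: trade more model calls per question
# for answer quality (sample N drafts, rank with a critic, reflect).
import random

def generate(prompt: str) -> str:
    """Stand-in for a model call; returns a toy draft. Hypothetical."""
    return f"draft-{random.randint(0, 999)} for: {prompt[:30]}"

def score(question: str, candidate: str) -> float:
    """Stand-in for a critic/reward model; random here. Hypothetical."""
    return random.random()

def answer_with_inference_scaling(question: str, budget: int = 16) -> str:
    """Spend `budget` model calls on one question: sample, rank, reflect."""
    candidates = [generate(question) for _ in range(budget)]   # sample N drafts
    best = max(candidates, key=lambda c: score(question, c))   # critic picks one
    # one round of reflection/refinement on the winning draft
    return generate(f"Improve this draft.\nQ: {question}\nDraft: {best}")

print(answer_with_inference_scaling("Plan a three-step experiment."))
```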

On the other hand, demand continues to grow in usage. Do we think we need millions of GPUs? Without a doubt. Yes, now this is certain. So the question is, how do we build it from the perspective of the data center? This largely depends on whether the data center is a few thousand megawatts at a time or 250 megawatts at a time. My feeling is, you will get both at the same time.

Clark Tang:

I think analysts always focus on the current architectural bets, but one of the biggest takeaways from this conversation is that you are thinking about the entire ecosystem many years out. So it's not that NVIDIA is only scaling up to meet future needs — you are not relying on a world of 500,000 or even one million coherent GPU clusters. When distributed training emerges, you will have written the software to enable it.

Jensen Huang:

We developed Megatron seven years ago. Yes — the scaling of these giant training jobs was going to happen, so we invented Megatron. All the model parallelism going on, all the breakthroughs in distributed training, all the batching and everything else exist because we did the early work, and now we are doing the early work for the next generation.
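Megatron-style tensor parallelism, reduced to one picture: a layer's weight matrix is split column-wise across devices, each device computes its shard of the matmul, and the shards are gathered. A toy NumPy sketch (real implementations place shards on separate GPUs and all-gather the results):

```python
# Toy tensor parallelism: split W's columns across devices, compute
# per-device partial matmuls, then concatenate (the "all-gather").
import numpy as np

def tensor_parallel_matmul(x, W, n_devices):
    """Column-split W across `n_devices`, compute shards, gather."""
    shards = np.array_split(W, n_devices, axis=1)   # one shard per device
    partials = [x @ shard for shard in shards]      # runs in parallel on real HW
    return np.concatenate(partials, axis=-1)        # gather the results

x = np.random.randn(4, 512)      # a batch of activations
W = np.random.randn(512, 2048)   # one layer's weights
assert np.allclose(tensor_parallel_matmul(x, W, 8), x @ W)
```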

AI is changing the way we work

Brad Gerstner:

So let's talk about Strawberry and o1. I think it's cool that it's named o1 — after the O-1 visa, which is about recruiting the best and brightest in the world and bringing them to the United States. I know we are both passionate about that. So I love the idea of building a model that thinks, taking us to the next level of scaling intelligence — a tribute to the very people who come to the United States through immigration.

Jensen Huang:

Of course. And alien intelligence.

Brad Gerstner:

Of course. This is led by our friend Noam Brown: the importance of inference-time reasoning as a new vector for scaling intelligence, separate from just building larger models.

Jensen Huang:

This is a big deal. This is a big deal. I think a lot of intelligence cannot be done a priori. Yes — a lot of computation cannot even be reordered. I mean, out-of-order execution can handle some of it ahead of time, but many things can only be done at runtime.

So whether you think about it from a computer-science perspective or an intelligence perspective, too many things require context — the environment, right? — and the quality and type of answer you are looking for. Sometimes a quick answer is enough; it depends on the consequence and impact of the answer, on the nature of its use. So some answers can take an evening, and some can take a week.

Yes, right? So I can totally imagine sending a prompt to my AI and telling it: think about this overnight. Don't tell me right away; think all night, and tomorrow give me your best answer and the reasoning behind it. So I think, from a product perspective, intelligence will be segmented by quality: there will be one-shot versions, of course, and some that take five minutes.

Right? And humans. So, if you will, we will have a huge workforce — some of it digital AIs, some of it biological humans, and I hope some of it even superhuman robots.

Brad Gerstner:

I think, from a business perspective, this is a severely misunderstood thing. You just described a company whose output is equivalent to a company with 150,000 people, but you achieved it with only 50,000 people. That's right. Now, you didn't say I'm going to fire all employees. No. You are still increasing the number of employees in the organization, but the output of the organization will increase significantly.

Jensen Huang:

This is often misunderstood. AI will not do everything, and it will not change every job, but it will have a huge impact on how people work. Let's admit that. AI has the potential to bring incredible benefits, and it has the potential to cause harm, so we must build safe AI. Yes, let's lay that foundation. Okay.

Jensen Huang:

What people overlook is that when companies use AI to increase productivity, it is likely to manifest as better profits or better growth, or both. When this happens, the CEO's next email is likely not about layoffs.

Brad Gerstner:

Of course — it's a growth announcement, because you're growing.

Jensen Huang:

The reason is that we have more ideas than we can explore, and we need people to help us think them through before we automate. The automation part, AI can help us with; obviously it will help us think too, but we still have to figure out what problems to solve. There are tens of trillions of problems we could solve. So the company has to choose which problems to solve, pick those ideas, and find ways to automate and scale them. Therefore, as we become more productive, we will hire more people. People forget this: if you go back in time, we obviously have more ideas today than we did 200 years ago. That's why GDP is larger and there are more jobs, even though we are automating furiously underneath.

Brad Gerstner:

This is a very important point about this period: almost all human productivity, almost all human prosperity, is a byproduct of automation — of the technology of the past 200 years. I mean, you can look at Adam Smith, at Joseph Schumpeter's creative destruction, at the chart of per-capita GDP growth over the past 200 years — and now it is accelerating.

Yes, and that brings me to this question. If you look at the 1990s, US productivity growth was about 2.5% to 3% per year. Then around 2010 it slowed to about 1.8%, and the past 10 years have been the slowest decade of productivity growth on record — productivity meaning output per fixed quantity of labor and capital.

Many people are debating the reasons for this. But if the world is really as you describe, that we are harnessing and creating intelligence, then are we on the edge of a sharp expansion of human productivity?

Jensen Huang:

This is our hope. This is our hope. Of course, we live in this world, so we have direct evidence.

We have direct evidence — whether isolated cases or individual researchers who, with AI, can explore science at previously unimaginable scale. That is productivity, one hundred percent. Or: we design chips at an incredible rate — the complexity of the chips and computers we build grows exponentially while the company's employee base does not. That is another measure of productivity, right?

The software we develop gets better and better because we use AI and supercomputers to help us, while our headcount grows only roughly linearly. Another manifestation of productivity.

So I can drill into it; I can sample many different industries; I can check it in person. Yes, exactly.

Of course, we might be overfitting. But the art of it is to generalize from what we observe and ask whether it will show up in other industries.

Without a doubt, intelligence is the most valuable commodity the world has ever known, and now we are going to mass-produce it. All of us have to get good at what happens when you are surrounded by AIs that do things extremely well — far better than you can. When I think back, this has been my life: I have 60 direct reports.

They are world-class in their fields, and they are much better than me — much better. I have no trouble interacting with them, no trouble "programming" them, no trouble prompting them. So I think what people will learn is that they are all going to be CEOs — CEOs of AI agents. Having creativity, some knowledge, and the ability to reason and break down problems is what lets you program these AIs to help you achieve goals, just as I do. That's called running a company.

Multi-party Efforts Needed for AI Security

Brad Gerstner:

Now, you mentioned something: coordination and safe AI. You mentioned the tragedy happening in the Middle East. There is a lot of autonomous capability, a lot of AI, being used all over the world. So let's talk about bad actors, safe AI, and coordination with Washington. How do you feel today? Are we on the right track? Is there a sufficient level of coordination? I think Mark Zuckerberg once said the way to beat bad AI is to make good AI better. How would you describe your view of how we ensure a positive net benefit for humanity, rather than ending up in a dystopian world?

Jensen Huang:

The discussion about safety is really important and good. The abstract, conceptual view of AI as one giant neural network is not so useful, because, as we all know, AI and large language models are related but not the same thing. I think many things are being done very well. First, open-source models, so that the entire research community, every industry, and every company can engage with AI and learn how to harness this capability for their applications. Very good.

Second, people underestimate how much technology is devoted to inventing AI that keeps AI safe: AI to curate data, to carry information, to train; AI created to coordinate other AI; synthetic data generation to expand AI's knowledge and make it hallucinate less; AIs for vectorizing, graphing, and otherwise informing other AIs; guard AIs that monitor other AIs. This whole system of safe AI being created goes largely uncelebrated, right?

Brad Gerstner:

So we have built it.

Jensen Huang:

Right — we are building all of it. Yes, across the industry: methodologies, red teams, processes, model cards, evaluation systems, benchmarking systems — all of these are being built at an incredible pace, and largely uncelebrated. Do you understand? Yes.

Brad Gerstner:

And with no government regulation saying you have to do this. Today, the people building these AIs are taking the critical issues seriously and coordinating around best practices. Exactly.

Jensen Huang:

So this is not yet fully appreciated or understood. Someone — everyone — needs to start talking about AI as what it is: an AI system, an engineered system, carefully designed, built from first principles, thoroughly tested, and so on. Remember that AI is a capability that gets applied. It is necessary to regulate important technology, but not to over-regulate; most regulation should target specific applications. All the different bodies that already regulate applications of technology must now regulate applications that incorporate AI.

So, I believe — and don't misunderstand me — do not ignore the mass of regulation for AI that must be activated around the world, and do not rely on a single universal, galactic AI council to do it all. All these different institutions, all these different regulatory bodies, were established for a reason. Going back to first principles — that's what I would do.

The opposition between open source and closed source is wrong

Brad Gerstner:

You have launched a very important, very large, very powerful open-source model.

Jensen Huang:

Nemotron.

Brad Gerstner:

Yes, and Meta has clearly made significant contributions to open source. When I read Twitter, I see a lot of open-versus-closed debate. How do you view open source — can your own open-source models keep up with the frontier? That's the first question. The second: does having both open-source and closed-source models create a healthy tension for commercial operations, and is that your view of the future? Do the two create a healthy tension for safety?

Jensen Huang:

Open source versus closed source is related to safety, but it is not only about safety. For example, there is absolutely nothing wrong with closed-source models: they are the engines of the economic model necessary to sustain innovation. I completely agree with that. What I think is wrong is framing closed and open as opposites.

Because openness is a necessary condition for many industries to be activated. If we didn't have open source, how would all these different scientific fields activate AI? They must develop AI in their specific domains, and they use open-source models to create domain-specific AI. The two are related — not, I repeat, the same. Having an open-source model doesn't mean you have an AI; you need that open-source model to create one. So financial services, healthcare, transportation — that whole list of industries and scientific fields — have now been enabled because of open source.

Brad Gerstner:

Incredible. Do you see a great demand for your open-source model?

Jensen Huang:

Our open-source models? First, look at Llama downloads — obviously, yes. Mark and the work they have done is incredible, beyond imagination. It has completely activated and drawn in every industry, every scientific field.

Okay, of course. The reason we built Nemotron is synthetic data generation. Intuitively, one AI sitting in a loop generating data to teach itself sounds fragile: how many times can you go around that infinite loop? That loop is questionable. The image in my mind is like locking a super-smart person in a padded room for a month — what comes out is probably not a smarter person. But you can have two or three people sit together — different AIs with different distributions of knowledge — doing quality assurance back and forth, and all three get smarter.

So the idea of letting AI models exchange, interact, pass things back and forth, and debate — reinforcement learning, synthetic data generation, and so on — makes intuitive sense. And our model Nemotron 340B is the best reward model in the world: it is the best critic.

Interesting. It is a great model for improving other models. So, however good someone else's model is, I would recommend using Nemotron 340B to enhance and improve it. We have seen it make Llama better, and it makes all the other models better.
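A hedged sketch of using a reward model as a critic in the way described: sample several responses from the model being improved, rank them with the reward model, and keep best/worst pairs as training signal. `policy_generate` and `reward_score` are hypothetical stand-ins, not actual Nemotron APIs:

```python
# Reward-model-as-critic sketch: rank sampled responses and keep
# preference pairs that can be used to improve the generating model.
import random

def policy_generate(prompt: str) -> str:
    """Stand-in for the model being improved. Hypothetical."""
    return f"response-{random.randint(0, 999)}"

def reward_score(prompt: str, response: str) -> float:
    """Stand-in for a reward model used as critic (e.g. Nemotron 340B)."""
    return random.random()

def build_preference_pair(prompt: str, n: int = 8):
    """Sample n responses, rank them by reward, and return the
    (best, worst) pair, e.g. as a preference example for tuning."""
    responses = [policy_generate(prompt) for _ in range(n)]
    ranked = sorted(responses, key=lambda r: reward_score(prompt, r))
    return ranked[-1], ranked[0]

chosen, rejected = build_preference_pair("Explain NVLink in one paragraph.")
```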

Brad Gerstner:

From delivering the first DGX-1 in 2016, this has truly been an incredible journey. Your path is both improbable and remarkable — it's amazing you even survived the early days. You delivered the first DGX-1 in 2016, and we reached the Cambrian moment in 2022.

So I have a question I've often wanted to ask you: with 60 direct reports, how long can you keep doing your current job? You are everywhere; you are driving this revolution. Are you having fun? Is there anything else you would rather be doing?

Jensen Huang:

A question to cap the last hour and a half — and the answer is: I'm having a great time. I can't imagine anything I'd rather be doing. Let's see. I don't think we should leave the impression that the work is always fun. My work is not always fun, and I don't expect it to be. Did I ever expect it to always be fun? I think it has always been important.

Yes, I don't take myself too seriously. I take my work very seriously. I take our responsibilities very seriously. I take our contributions and our moments very seriously.

Is it always fun? No. But have I always enjoyed it? Yes. Like everything, whether it's family, friends, or children. Is it always fun? No. Do we always enjoy it? Absolutely.

So how long can I do it? The real question is: how long can I stay relevant? That is the most important thing, and the answer comes down to how I will keep learning. Today I am more optimistic — and I don't say this only because of our topic today. I am more optimistic about my relevance and my ability to keep learning because of AI. I use it every day — I don't know about you, but I believe you all do too. I use it almost every day.

There is no piece of research of mine that doesn't involve AI — not a single question. Even when I know the answer, I double-check it with AI. And surprisingly, the next two or three questions I ask reveal things I didn't know. Pick your topic. I think of AI as a mentor.

AI is an assistant, AI is a partner that can brainstorm with me and check my work. Guys, this is completely revolutionary. I am an information worker; my output is information. So I think its contribution to society is remarkable.

So I think, if that's the case — if I can stay relevant and keep contributing — then I know this work is important enough that I want to pursue it, and my quality of life is incredible. So I will.

Brad Gerstner:

You and I have been working in this field for decades, and I can't imagine missing this moment. It's the most important moment of our careers. We are very grateful for this partnership.

Jensen Huang:

Looking forward to the next ten years.

Brad Gerstner:

A partnership of minds — yes, you make everything smarter. Thank you. I think it's really important that you are part of the leadership, right, the leadership that will carry all of this forward optimistically and safely. So thank you.

Jensen Huang:

It's been a real pleasure being with you all. Really. Thank you.