Beyond the GTC keynote: a summary of Jensen Huang's key points for Wall Street and the internet this week

Wallstreetcn
2026.03.22 06:54

Jensen Huang dropped a bombshell at GTC26: NVIDIA's order visibility has surpassed $1 trillion, and growth is still accelerating. He asserted that AI has entered its third inflection point, the era of intelligent agents, in which "each engineer will manage 100 intelligent agents." Tokens will become a new kind of salary, and an engineer who does not consume tokens amounts to wasted productivity. Meanwhile, a $50 trillion physical AI blue ocean is waiting to be ignited.

During the recently concluded GTC26, NVIDIA CEO Jensen Huang laid out the grand blueprint of the world's most valuable company through an in-depth interview and a financial analyst Q&A. From "$1 trillion in order visibility" to "agents are the future personal computers," Huang is not just selling chips; he is reshaping the distribution logic of the global IT industry.

Wall Street Insights has compiled the key quotes as follows:

On Tokens and Value: If one of your engineers with a $500,000 annual salary spends only $5,000 a year on tokens, I would go crazy. If that $500,000 engineer does not consume at least $250,000 worth of tokens, I would be deeply uneasy. Even free chips are not cheap enough if they cannot keep pace with the state of the art and the speed at which we operate.

On Agents and the Future: Every engineer will have 100 agents. In the past, we wrote code; in the future, we will write ideas, architectures, and specifications. Agent systems are systems that complete work, and they are helping our software engineers get their jobs done.

On Token Economics: Computers were just tools in the past; future computers are manufacturing devices. People buy these computers to produce tokens, and the effectiveness of producing these tokens is crucial. You are simultaneously buying the most expensive computers and producing the lowest-cost tokens.

On Market Demand and Growth: We have over $1 trillion in strong demand visibility for Blackwell and Rubin, across orders and demand forecasts. Our growth rate is actually accelerating. Every software company, every company, needs an OpenClaw strategy.

On Competition and Architecture: Anyone who says "my chips are 30% cheaper" is just proving they don't understand AI. "You keep breathing air until you run out of it. After that, we will breathe compressed liquid air, but until then, how is the air? It's free, and we've been using it for a long time." (Note: this refers to the limits of copper cable technology.)

Financial Bombshell: $1 Trillion Order Wall

Jensen Huang dropped a shocking number during the analyst meeting: NVIDIA's order demand visibility for the Blackwell and Rubin architectures has exceeded $1 trillion.

  • The Logic Behind Growth: This number is not inflated but is based on confirmed purchase orders and factory construction pipelines. Huang emphasized that NVIDIA's advantage lies in its delivery cycle being much faster than companies developing their own ASIC chips, even achieving "orders placed and shipped in the same quarter."

  • Gross Margin Moat: In response to doubts about "value being siphoned off by NVIDIA," Huang bluntly stated, "TSMC's wafers are the most expensive in the world, but they have the highest value, so I'm willing to pay." He believes that customers are not buying expensive computers but rather the lowest-cost token productivity in the world.

The Third Inflection Point: From Large Models to "Agentic Systems"

Jensen Huang believes that AI has gone through two stages: generative and reasoning, and is now at the third inflection point—Agentic Systems.

  • Tokens Become the New Salary: In the future, companies hiring engineers will provide not just laptops but also a Token budget. If an engineer with a salary of $300,000 does not consume Tokens, they are wasting productivity.

  • The Birth of Personal AI Computers: Systems represented by the open-source project OpenClaw are defined by Huang as "the first personal artificial intelligence computer in human history." It has memory, scheduling, skills, and APIs, serving as the operating system for the future IT industry.

New Hardware Landscape: The "Marriage" of Vera Rubin and Groq

NVIDIA is no longer just a GPU company but an "AI factory" company.

  • Disaggregated Inference: This is the core of the Dynamo operating system. By breaking down inference tasks, chips with different performance levels can each play their role.

  • Groq's Role: NVIDIA's acquisition and integration of Groq (LPX series) is not to replace GPUs but to utilize its extremely low-latency SRAM architecture to handle the "final step" of autoregressive inference.

  • Trinity Architecture: NVIDIA is the only company in the world that can simultaneously optimize HBM (High Bandwidth Memory), LPDDR5, and SRAM, making its "liquid-cooled rack-mounted" complete delivery appear far superior to competitors' single-point chips.

Physical AI: A $50 Trillion Blue Ocean

Huang is particularly optimistic about Physical AI, believing its ultimate scale will surpass that of digital AI.

  • Reshaping Traditional Industries: This is a $50 trillion sector that has been a technological desert for the past 20 years. From robotic surgery and autonomous driving to smart base stations, Physical AI must adhere to the laws of physics at the edge.

  • Robot Explosion in 3-5 Years: Jensen Huang predicts that robots will be ubiquitous in the next 3 to 5 years. While China has significant advantages in hardware supply chains like motors and rare earths, NVIDIA will provide the brain (training, simulation, and three onboard computers).

Industry Symbiosis: NVIDIA is the "Best Salesperson" for Cloud Service Providers

In response to the threat of cloud giants developing their own chips, Huang appears extremely confident:

  • Traffic Drives Sales: "AWS, Google, and Microsoft have the largest booths at GTC because they want to sell services to my CUDA developers." NVIDIA directs developers to the cloud through the CUDA ecosystem, essentially serving as a customer acquisition engine for cloud service providers.
  • Irreplaceability: 40% of the business comes from non-cloud giant sectors (regional clouds, enterprise private clouds), where these customers are buying a "full-stack platform" rather than "chips." Without NVIDIA's complete solutions, these markets would be completely unreachable.

Below are the full transcripts of two interviews, assisted by AI translation:

Interview on "All-In Podcast"

Host (Jason Calacanis):

This week is a special episode; we made an exception to postpone our regular show. Usually, we only make exceptions for three people: President Trump, Jesus, and Jensen Huang. You can rank them yourself. Jensen, your performance this year has been incredible, and this event is fantastic; every industry, every tech company, and every AI company is here.

Host (Chamath Palihapitiya):

One of the most significant announcements last year was NVIDIA's acquisition of Groq. When you decided to buy Groq, did you realize how unbearable Chamath would become?

Jensen Huang:

I had a feeling (laughs). We are his friends, dealing with him every week, and I know how to handle him during those six weeks of closing.

Jensen Huang:

In fact, many of our strategies were publicly showcased at GTC (NVIDIA's GPU Technology Conference) long ago. Two and a half years ago, I introduced the operating system for AI factories, called Dynamo. As you know, the dynamo was invented by Siemens to convert mechanical power into electricity, driving the last industrial revolution. I think it's the perfect name for the "factory operating system" of the next industrial revolution.

Inside Dynamo, the core technology is Disaggregated Inference. Jason, I know you are technically strong; I'll let you explain it to the audience.

Host (Jason Calacanis):

Thank you for not just taking the mic directly.

Jensen Huang:

Disaggregated inference means that the inference processing pipeline is extremely complex; it is one of today's most complex computing problems. It is large-scale and involves mathematical operations of many shapes and sizes. Our idea is to break the processing apart so that one part runs on certain GPUs while the rest runs on other GPUs. That made us realize that even disaggregated computing makes sense.

This thinking led us to acquire Mellanox. Today, NVIDIA's computing is distributed across GPUs, CPUs, switches, and network processors. Now that we have added Groq, we place the right workloads on the right chips. We have evolved from a GPU company into an AI factory company.

Host (Chamath Palihapitiya):

You mentioned on stage that high-value reasoning users should pay attention to this. You said that 25% of data center space should be allocated to this combination of Groq LPUs and GPUs. Can you tell us how the industry views this new form of computing called prefill-decode disaggregation?

Jensen Huang:

Stepping back, as we adopt this technology, we are shifting from "large language model processing" to "agentic processing." When you run an agent, you access working memory, long-term memory, use tools, and put immense pressure on storage. Agents also collaborate with each other.

Therefore, there are various types of models within the data center. We created the Vera Rubin architecture to handle these extremely diverse workloads. As a result, NVIDIA's total addressable market (TAM) has increased by 33% to 50%. A significant portion of this will be storage processors (BlueField), Groq processors, CPUs, and network processors. Together, these constitute the computers of the AI revolution, or "agents."

Host (David Friedberg):

What about embedded applications? For example, my daughter's teddy bear at home wants to chat with her. Will it have a custom chip inside, or will there be a broader set of tools developed for the edge side?

Jensen Huang:

On a large scale, we believe this issue involves three computers:

  1. The first is for training AI models.

  2. The second is for evaluation. For instance, robots and autonomous vehicles must be evaluated in a virtual laboratory (Omniverse) that adheres to the laws of physics.

  3. The third is the robotic computer on the edge. It could be a car, a robot, or a small teddy bear.

We are also doing something very important: transforming telecom base stations into part of AI infrastructure. This is a $2 trillion industry, and in the future, radio base stations, factories, and warehouses will become extensions of AI.

Host (Brad Gerstner):

Jensen, last year you were ahead of the curve predicting that "inference will not just grow a thousandfold." Now is it going to grow a millionfold or a billionfold? At the time, people thought you were exaggerating because everyone was focused on training. But now the demand for inference has exploded.

Some say your inference factory costs as much as $40-50 billion, while custom chip (ASIC) solutions only cost $25-30 billion, and you might lose market share because of this. What do you think about that? Why would people pay double the premium?

Jensen Huang:

The core logic is: you should not equate "the price of a factory" with "the cost of a Token." I can prove that a $50 billion factory can generate the lowest cost Token for you.

Because our efficiency is 10 times higher. Of the $50 billion budget, $20 billion goes to land, electricity, and facilities, which must be spent regardless of which chips you use. The remaining chip-price difference is not a large share of the overall cost. And if my data center's throughput is 10 times that of others, then even if their chips were free, they could not compete with us.
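Huang's "even if their chips were free" claim is just arithmetic on amortized cost per token. Here is a minimal sketch using only the figures he states ($50 billion total, $20 billion of it facilities, 10x throughput); the `cost_per_token` helper, the baseline token output, and the five-year amortization window are invented for illustration, not NVIDIA data:

```python
# Illustrative sketch of the "even if their chips were free" argument.
# All numbers are assumptions taken from the interview or invented here,
# not real financials.

def cost_per_token(facility_cost, chip_cost, tokens_per_year, years=5):
    """Amortized dollar cost per token over an assumed service life."""
    total_cost = facility_cost + chip_cost
    return total_cost / (tokens_per_year * years)

FACILITY = 20e9      # land, power, buildings: spent regardless of chip choice
BASE_TOKENS = 1e15   # hypothetical annual token output of the cheaper factory

# Huang's factory: $30B of chips on top of facilities, 10x the throughput
nvidia_style = cost_per_token(FACILITY, 30e9, 10 * BASE_TOKENS)

# Competitor whose chips cost literally nothing, at 1x throughput
free_chips = cost_per_token(FACILITY, 0, BASE_TOKENS)

assert nvidia_style < free_chips  # higher sticker price, lower cost per token
print(f"pricier factory: ${nvidia_style:.2e}/token, free chips: ${free_chips:.2e}/token")
```

On these assumed numbers, the pricier factory produces tokens at a quarter of the competitor's cost even though the competitor's chips are priced at zero, because the facility spend dominates once throughput differs by 10x.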

Chamath Palihapitiya:

You manage the world's most valuable company, with revenue potentially exceeding $350 billion next year. How do you decide "what to do"? How do you gain intuition on where to double down and where to retreat?

Jensen Huang:

That is the job of a CEO: to define vision and strategy. I am inspired by the excellent scientists and technical experts in the company, but I must shape the future.

My standard is: is this thing outrageously difficult? If it’s easy, we should back off, because there will be many competitors for easy things. I am looking for things that have never been realized, are extremely difficult, and can leverage NVIDIA's "superpowers." We know this will bring pain and hardship, but without pain, there are no great inventions.

Brad Gerstner:

Can you talk about a few long-term businesses? For example, space data centers, autonomous driving, or digital biology?

Jensen Huang:

Physical AI is a huge category. This is a $50 trillion industry that has been almost a technological desert until now. We started laying the groundwork 10 years ago, and now it is at an inflection point, bringing us nearly $10 billion in revenue each year.

Digital biology is about to have its "ChatGPT moment." We are beginning to understand how to represent genes, proteins, and cells. The healthcare industry will undergo tremendous changes in the next five years. Agriculture will be the same.

Jason Calacanis:

NVIDIA initially grew from gamers and enthusiasts. You mentioned the revolution of "agents," especially the application of open-source agents on the desktop. What does this mean for you?

Jensen Huang:

There have been three turning points in the past two years:

  1. Generative AI: ChatGPT made everyone aware of the existence of AI.

  2. Reasoning: Models like O1 allow AI not only to answer questions but also to reason.

  3. Agents: first enterprise tools like Claude Code, and now OpenClaw, which brings AI agents into the public eye.

The importance of OpenClaw lies in its redefinition of the computing model. It has memory, a file system, skills, resource scheduling, and APIs. This is essentially the first personal artificial intelligence computer in human history. It is open-source and can run anywhere.

Host (Brad Gerstner):

Does this paradigm shift render current AI regulatory legislation meaningless? How should politicians respond to this rapid change?

Jensen Huang:

We need to stand before decision-makers and inform them of the truth about the technology: it is not biological, not extraterrestrial, and has no consciousness; it is just computer software.

We cannot let "apocalyptic" narratives influence policy. As a nation, the greatest security risk is not AI itself, but when we stagnate out of fear while other countries adopt this technology. What I worry about is the speed of AI adoption in the United States.

Host (Brad Gerstner):

Regarding the previous controversy with the Department of Defense and Anthropic, if you were a board member, what advice would you give them to change the public's fear?

Jensen Huang:

Anthropic's technology is excellent, and we are also one of their major clients. Warning about the potential of technology is a good thing, but "warning" does not equal "intimidation."

As technology leaders, we must speak more cautiously and humbly. Spreading "catastrophic" rhetoric without any evidence can cause more harm than people realize. Now that technology is so crucial to social structure and national security, our words are vital.

Host (Brad Gerstner):

The support rate for AI among the American public is only 17%. We once destroyed the nuclear energy industry out of fear, and now China is building 100 nuclear reactors while we have zero.

Back to agents, how has productivity improved internally at NVIDIA? Everyone is discussing return on investment (ROI). Do you think we will see revenue grow exponentially like intelligence?

Jensen Huang:

Just look at the audience in this venue; 99% of what you see is AI. From generative to reasoning, computational demand increased 100-fold; from reasoning to agents, it increased another 100-fold. In two years, computational demand has increased 10,000-fold.

People will pay for "information," but are more willing to pay for "work." Chatbots are nice, but agents that can help me get work done are truly valuable. Agents are helping our engineers complete their tasks. We are not yet at the stage of true large-scale deployment; future growth will be on a million-fold scale.

Host (Jason Calacanis):

NVIDIA has 43,000 employees, of which 38,000 are engineers. What would you think if one of your engineers, earning $500,000 a year, only spent $5,000 a year on tokens?

Jensen Huang:

I would go crazy! If a $500,000-a-year engineer doesn't consume at least $250,000 worth of tokens in a year, I would be very worried.

It's like a chip designer saying he only needs paper and a pen, and doesn't need CAD tools. This is a paradigm shift, just like LeBron James spending $1 million a year on body maintenance. We need to give knowledge workers "superpowers."

Host (Jason Calacanis):

What will the productivity of these "all-star" employees look like in the next two to three years?

Jensen Huang:

The thoughts of "this is too hard" and "this takes too long" will no longer exist. Just like after the Industrial Revolution, no one would say "that building is too heavy." Gravity and scale will no longer be issues; what remains is creativity.

In the past, we wrote code; in the future, we will write "ideas," architectures, and specifications. Each engineer will manage 100 agents in the future.

Host (David Friedberg):

Last Sunday night, I spent 90 minutes using agents to replace an entire software architecture that would have originally required a lot of manpower. This sense of acceleration is unprecedented.

Jensen Huang:

That's why OpenClaw is so incredible. Some people think the enterprise software industry will be destroyed, but my view is the opposite: this industry has been limited by the "number of employees at the office." In the future, there will be 100 times as many agents tapping into SQL databases, Photoshop, and Blender. These tools are the last means for humans to control outcomes.

Host (Chamath Palihapitiya):

Regarding open source, some open-source models in China are very powerful. Do you think the endgame of AI is decentralized?

Jensen Huang:

We need "model as product (proprietary)" and "model as open source" to coexist. Most consumers, myself included, don't want to fine-tune a model; I would just use ChatGPT or Claude. But vertical industries need to control their domain knowledge, which must rely on open-source models. NVIDIA is also strongly supporting the open-source ecosystem, as this allows startups to have world-class vertical capabilities from day one.

Host (Brad Gerstner):

President Trump wants American industry to lead and hopes American AI will go global. Currently, NVIDIA's market share in certain markets (like China) has dropped from 95% to 0%. What is the situation now?

Jensen Huang:

President Trump wants us to return to the battlefield. We have applied for licenses for many Chinese companies requesting procurement and have received approvals. We are restarting the supply chain for shipments.

If the U.S. cannot dominate the AI technology stack (from chips to systems), it is a significant loss for national security. I hope the U.S. computing stack occupies 90% of the global market share.

Host (Jason Calacanis):

Are you worried about supply chain risks from global conflicts, the situation in Taiwan, and even helium supplies in the Middle East?

Jensen Huang:

In the Middle East, we have 6,000 families, and we support them 100%; we are staying in Israel 100%. Regarding Taiwan, there are three things we need to do:

  1. Achieve U.S. domestic industrialization as quickly as possible (such as the factory in Arizona).

  2. Diversify the supply chain (South Korea, Japan, Europe).

  3. Maintain patience and restraint.

Host (Jason Calacanis):

In terms of autonomous driving, your strategy is an open-source platform like Android, while Tesla is more like iOS. How do you view this chess game?

Jensen Huang:

Our goal is for every car company in the world to be able to manufacture autonomous vehicles. We provide three computers (training, simulation, onboard) and have developed the safest operating system. Even clients like Musk and Tesla, who have strong in-house development capabilities, will buy our training computers. We are happy to provide solutions at any level; we are not here to replace anyone but to solve problems.

Host (Brad Gerstner):

But big clients like Google and Amazon are also developing their own chips (TPU, Trainium). They are both customers and competitors; how do you respond?

Jensen Huang:

We are the only AI company in the world that collaborates with every AI company. I don't look at what they are developing, but I show them everything I am developing.

Confidence comes from two points:

  1. Buying NVIDIA products is still the most economical choice at present.

  2. We are the only cross-platform architecture (cloud, local, in-vehicle, and even space).

Many people do not realize that 40% of NVIDIA's business comes from building complete AI infrastructure, not just selling chips. In fact, NVIDIA's market share is increasing because Anthropic and Meta are using NVIDIA, and the explosion of open-source models is also built on NVIDIA.

Host (Brad Gerstner):

Analysts seem to doubt your growth potential. They predict you will grow 30% next year, 20% the year after, and only 7% by 2029. They feel that the "law of large numbers" will limit you.

Jensen Huang:

They just do not understand the scale and breadth of AI. The CPU market for data centers used to be only $25 billion a year, and our current scale is completely different. What NVIDIA does is not just chips, but AI infrastructure, and this market is much larger than people think.

Host (Chamath Palihapitiya):

Can you talk about space data centers?

Jensen Huang:

We are already in space. The challenge is heat dissipation (which can only rely on radiation), but this is not insurmountable. Currently, our chips are installed on satellites to process images. Instead of sending data back to Earth, it is better to process it directly in space.

Host (Jason Calacanis):

How will AI make a real impact in the medical field?

Jensen Huang:

Three directions:

  1. AI Biology: Predicting biological behavior to assist in drug development.

  2. AI Assistants: Assisting doctors in diagnosis.

  3. Physical AI/Robotic Surgery: Future ultrasound and CT machines will have built-in intelligent agents.

Host (Jason Calacanis):

Robotics has gone through a "lost twenty years," and now Musk's Optimus and China's robotics are developing rapidly. How far are we from "robot chefs" or "robot butlers"?

Jensen Huang:

About 3 to 5 years. China is very strong in the hardware ecosystem of motors, rare earths, magnets, etc., and the global robotics industry will rely on that supply chain. Ultimately, robots will become the greatest driving force for human prosperity. They can not only solve labor shortages but also allow everyone to start their own business through robots. We can even "embody" in robotic dogs through virtual reality, chatting with children or walking dogs while on business trips.

Host (Brad Gerstner):

The CEO of Anthropic predicts that AI models and agents will generate $1 trillion in revenue by 2030. What do you think?

Jensen Huang:

I think he is too conservative. Anthropic's performance will far exceed that number. Because every enterprise software company will become a "value-added reseller" of these AI models in the future.

Host (Brad Gerstner):

So what is the "moat" for these companies?

Jensen Huang:

Deep specialization. Don't just create a general horizontal platform; instead, delve deeply into a specific vertical field and inject your expertise into the agents. Whoever connects with customers first will gain the data flywheel.

Host (Jason Calacanis):

You said three years ago: "You won't be replaced by AI, but by those who use AI." Do you have a different view on employment now?

Jensen Huang:

I'm not a doomsayer. While the number of drivers may decrease, "mobile assistants" will increase. Just like how the increase in autopilot in airplanes has led to more pilots. My advice to young people is: Become an expert in using AI. This requires artistry and knowing how to guide AI without completely constraining it.

Host (Brad Gerstner):

You once advised young people at Stanford to experience "pain and hardship." What do you suggest they study now? Will an English major have a brighter future than a computer science major?

Jensen Huang:

Strong skills in science, mathematics, and language are still crucial. Because language is the programming language of AI.

Look at the example of radiology: Ten years ago, some predicted that AI would make radiologists disappear. What happened? The demand for radiologists surged. Because AI made scans faster, hospitals could handle more patients and generate higher revenues.

A growing country needs more teachers and more experts, but each of them will have the "superpowers" granted by AI.

Host (Jason Calacanis):

Jensen, congratulations on your achievements. This has been a very positive and inspiring conversation.

Jensen Huang:

Thank you. We don't need to panic; we have the autonomy to choose how to create the future.

NVIDIA GTC26 Financial Analyst Q&A

Host:

Good morning, everyone. I hope you enjoyed yesterday's presentation. Although it ran a bit long, I think it was a fitting summary. Now we will turn the time over to you and focus on your needs and questions. First, I'll hand the floor to Jensen Huang.

Jensen Huang:

As I mentioned yesterday, AI has recently experienced three turning points: the first is generative AI, the second is reasoning, and we are now at the third turning point—Agentic Systems.

These systems have the capability of "autonomous action," which is why they are called agents. You can set goals for them; they no longer just answer questions but execute tasks. One of the most popular applications is software writing. In your company, as well as in mine, engineers are using agentic systems all day long.

In the past, engineers were given a laptop when they joined; now they receive a laptop and Token. The Token budget has now become a tangible thing. If you hire an engineer with a salary of $300,000 a year, but he doesn't consume any Tokens at work, you have to ask what he is doing all day.

The future computer is no longer just a tool but a production device. Just like ASML's lithography machines, they produce products that can be sold. This is no different from the generators (Dynamo) that produced electricity long ago. Energy efficiency and production efficiency determine your income.

Every software company, every enterprise now needs an "OpenClaw strategy," just like we had to have a Linux strategy, an internet strategy, and a mobile cloud strategy back in the day.

Jensen Huang:

I want to update you on order visibility. A year ago, I said that by 2026 we would have strong visibility of $500 billion in Blackwell and Rubin shipments.

Now it is March 2026, and we have over $1 trillion (1 Trillion Plus) in strong demand visibility for the Blackwell plus Rubin architectures. This includes confirmed demand forecasts and purchase orders.

Please note that this $1 trillion refers only to Blackwell and Rubin. I am not including Groq, standalone CPUs, or other new products, because I want the comparison with last year's figure.

Moreover, this number will continue to grow through the end of 2027. We have inventory and supply pipelines, and we can even complete orders and shipments within the same quarter. This is something companies that make ASICs (custom chips) cannot do, because their delivery cycles are too long.

Jensen Huang:

Last year (2025) was the "year of inference." We made it clear that there is only a slight connection between the price of computers and the cost of tokens. People buy computers to produce tokens. You bought an expensive computer, but it produces tokens at a very fast rate, resulting in you having the lowest-cost tokens.

This is also why we can maintain our gross margins. Each generation of our products offers higher value—specifically, "the number of tokens produced per second per watt." Customers prefer to buy the next generation of products at a higher price rather than buy old products at a low price. Installing Vera Rubin is smarter than continuing to buy Grace Blackwell because the value is higher.

Jensen Huang:

In 2025, we also expanded platform support. Anthropic and Meta have become our new partners. Current data from API inference service providers shows that open models have become the second most popular category of AI models globally, second only to OpenAI's. And NVIDIA is the best platform in the world for running open models.

We also work closely with cloud service providers (CSPs). We have CUDA in their clouds, which attracts all developers. We are the best sales force for cloud service providers, so you will see AWS, Google, Microsoft, and Oracle having the largest booths in our exhibition hall because they want to sell services to our developers.

Additionally, 40% of our business comes from non-CSP areas, such as regional clouds and industrial enterprises. Without NVIDIA's full-stack platform, you cannot reach this 40% market because they are buying a "platform," not just "chips."

Q&A Session

Questioner (Melius Research, Ben Wright):

Jensen, the biggest concern is: is this investment worth it? Can the revenue growth of cloud service providers cover these huge expenses? When can we expect to see their revenue revisions?

Jensen Huang:

I really wish those AI companies were already public so you could see what I see. Historically, no startup has ever been able to add $1 billion or $2 billion in revenue every week like they are now.

The $2 trillion IT software industry is integrating OpenAI, Anthropic, and open models. The future IT industry will become the "distributor" of these models. I estimate that the IT industry will grow from the current $2 trillion to $8 trillion.

All IT companies will either lease or produce Tokens in the future. Their business models will shift from software licensing to Token leasing. Although this will introduce sales costs (COGS), the value provided will be much higher. The revenue growth rate of OpenAI and Anthropic is like "growing a complete IT company in a month."

Questioner (Cantor Fitzgerald, CJ Muse):

What kind of changes will Physical AI bring to your business?

Jensen Huang:

Currently, the growth rates of digital AI and physical AI are about the same. But in a few years, physical AI will reach a turning point; it must operate locally, at the edge of factories. Since the global $70 trillion industry involves physical atoms (rather than digital bits), physical AI will eventually account for 70% of our business.

In the future, computers will run 24/7. I want engineers earning $2,000 a day to spend a $1,000 token budget daily. I want them to manage an entire fleet of intelligent agents working for them.

Questioner (Bernstein, Stacy Rasgon):

Will Reuben be released with Groq? How is the evolution of inference workloads?

Colette Kress:

LPX (related to Groq) is expected in the second half of this year.

Jensen Huang:

Vera Rubin will ship earlier than Groq. Regarding computing architecture, there is a distinction between low latency (CPU) and high throughput (GPU). Groq is an extreme low-latency architecture, with almost the entire chip being SRAM. It is not flexible but very fast.

We will integrate Groq with Vera Rubin, using Groq to handle the final stages of autoregressive inference for language models. For free or standard-level inference, Vera Rubin is unbeatable. But for extremely high-end, very intelligent models, adding Groq will significantly enhance throughput.

It's like the iPhone or the automotive industry; as the market expands, there will be stratification: from the free tier to the geek tier (high-end tier at $50 per million Tokens).

Questioner (Bank of America, Vivek Arya):

In a $1 trillion market, what percentage do products like CPU and storage account for? Will Groq eat into the demand for HBM (High Bandwidth Memory)?

Jensen Huang:

We are the only company that can optimize across HBM, LPDDR5, and SRAM memory types. If all $1 trillion in orders included Groq, the scale would become $1.25 trillion. Storage is the second largest expense, with CPUs accounting for about 5%. Vera Rubin addresses the computational needs of agents: they not only reason but also query memory, use tools, and run browsers.

We have harmoniously integrated all these functions into a liquid-cooled rack architecture, no longer a "Frankenstein."

Questioner (Goldman Sachs, Jim Schneider):

Token costs have been declining; will this trend level off?

Jensen Huang:

Token costs will continue to decline. At the same time, the "intelligence" of each Token will continue to rise. Evaluating AI factories must consider the "number of Tokens produced per watt." Any comparison that does not account for power consumption is misleading. We will continuously push the Pareto Frontier — that is, enabling factories to produce more and smarter Tokens at the same cost. This is one of the hardest problems in computer science.
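The "tokens per watt" yardstick Huang invokes here can be made concrete with a toy comparison. The `tokens_per_joule` helper and both factory figures below are invented for illustration and do not describe any real hardware:

```python
# Toy comparison showing why chip price alone is a misleading metric:
# what matters for an AI factory is tokens produced per unit of energy.
# All figures are invented for illustration.

def tokens_per_joule(tokens_per_second, watts):
    """Energy efficiency of an AI factory: tokens per joule of power draw."""
    return tokens_per_second / watts

# Factory A: pricier chips, much higher throughput at a given power draw
factory_a = tokens_per_joule(tokens_per_second=1_000_000, watts=100_000)

# Factory B: "30% cheaper" chips, lower throughput at the same power draw
factory_b = tokens_per_joule(tokens_per_second=200_000, watts=100_000)

assert factory_a > factory_b
print(f"A: {factory_a:.1f} tok/J, B: {factory_b:.1f} tok/J")  # A: 10.0, B: 2.0
```

On these assumed numbers, Factory A delivers 5x the tokens per joule; a comparison that quotes only chip price omits exactly this axis.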

Questioner (Evercore ISI, Mark Lipacis):

What does the hybrid architecture of SSMs (State Space Models) and Nemotron 3 mean for you?

Jensen Huang:

The beauty of NVIDIA's architecture is that it supports everything: Transformers, diffusion models, SSMs, and so on. Groq cannot run diffusion models, but we can. Nemotron 3 is designed to handle ultra-long contexts. We aim to advance AI technology, not just compete.

Questioner (UBS, Timothy Arcuri):

Some are concerned that NVIDIA is taking too much value from the ecosystem, making margins unsustainable. What do you think?

Jensen Huang:

If you continue to provide multiple times the productivity improvement, customers will be happy to work with you. It's like TSMC's wafers being the most expensive in the world but also the most valuable, so I'm willing to pay. The same goes for ASML. Those who say "my chip is 30% cheaper" simply do not understand AI. They do not grasp the overall economics of AI factories.

Questioner (Redburn, Tim Schultzander):

Your employee growth is slow, but the workload is increasing rapidly; how do you balance this?

Jensen Huang:

I have 60 direct reports. Our company structure reflects our product architecture. To update a generation of products every year, you must fully own the entire software stack, storage, and network. You cannot achieve annual updates by piecing together other people's technologies. We own everything from the chips to the operating system (Dynamo), which lets old software run perfectly on new systems from day one.

Questioner (Last Questioner):

How will the demand for training evolve?

Jensen Huang:

Training has evolved from pre-training (kindergarten) to post-training (skill acquisition). Post-training requires reinforcement learning, tool usage, etc., and its computational intensity may be millions of times that of pre-training. In the future, pre-training will mainly use "synthetic data." I hope that in the future, 99% of computing power will be used for inference, as inference is the process of converting tokens into economic benefits.

This is why NVIDIA made a comprehensive bet on inference last year. Inference is "thinking" and "working," how could it be easy? Inference will only become increasingly difficult.

Jensen Huang:

Thank you all for coming to GTC.