The announcements highlight surging AI infrastructure demand, major architecture upgrades, and expansion into autonomous vehicles and enterprise AI software, reinforcing NVIDIA's central role in the AI computing ecosystem.
Key Points
- $1 Trillion AI System Demand
  - Jensen Huang said NVIDIA expects about $1 trillion in purchase orders for Blackwell and Rubin AI systems through calendar year (CY) 2027.
  - Around 60% of that demand is expected to come from hyperscale cloud companies (large cloud providers).
  - This doubles the prior projection of $500 billion through CY 2026.
- Revenue Impact
- Groq-Based Rack in Rubin Platform
  - A Groq 3 LPX rack using high-speed SRAM instead of HBM will be introduced in Q3 2026 as part of the Vera Rubin platform.
  - The system reportedly delivers up to a 35× improvement in tokens-per-watt compared with Blackwell systems (a worked comparison follows this list).
  - Without Groq acceleration, Rubin provides roughly a 10× efficiency improvement over Blackwell.
- Ampere GPU Pricing
- Graphics Technology
  - NVIDIA unveiled DLSS 5, the next generation of its AI-powered graphics rendering technology.
  - Side-by-side demonstrations showed significant improvements in visual realism and performance.
- Future Architecture Roadmap
  - NVIDIA previewed its Feynman architecture, expected to launch around CY 2028.
  - It will include a new CPU called Rosa.
- Enterprise AI Software
- Autonomous Vehicle Partnerships
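To make the tokens-per-watt comparison concrete, here is a minimal Python sketch. The baseline throughput and power figures are hypothetical placeholders chosen for illustration; only the roughly 10× (Rubin vs. Blackwell) and 35× (Rubin with the Groq-based rack) multipliers come from the announcements summarized above.

```python
def tokens_per_watt(tokens_per_second: float, watts: float) -> float:
    """Efficiency metric: AI tokens processed per unit of electrical power."""
    return tokens_per_second / watts

# Hypothetical Blackwell-class rack baseline (illustrative numbers only,
# not figures from the announcements).
baseline_tps = 1_000_000     # tokens per second
baseline_power_w = 120_000   # rack power draw in watts

blackwell = tokens_per_watt(baseline_tps, baseline_power_w)

# The ~10x and ~35x multipliers are the figures reported above, applied
# here at constant power purely for illustration.
rubin = blackwell * 10        # Rubin alone vs. Blackwell
rubin_groq = blackwell * 35   # Rubin with the Groq-based rack vs. Blackwell

print(f"Blackwell baseline:  {blackwell:.2f} tokens/W")
print(f"Rubin (~10x):        {rubin:.2f} tokens/W")
print(f"Rubin + Groq (~35x): {rubin_groq:.2f} tokens/W")
```

At the same power draw, a 35× gain in tokens-per-watt means a rack processes 35 times as many tokens; equivalently, the same token volume would require about 1/35th of the energy.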
Technical Vocabulary
- Blackwell – A recent NVIDIA GPU architecture designed for large-scale AI workloads, particularly training and running advanced AI models in data centers.
- Rubin (Vera Rubin Platform) – NVIDIA's next-generation AI computing architecture following Blackwell, intended for future AI infrastructure and large data-center systems.
- Feynman Architecture – A future NVIDIA chip platform planned for around 2028, expected to succeed the Rubin generation.
- Ampere – An earlier NVIDIA GPU architecture widely used in AI, cloud computing, and data-center workloads.
- Tokens-per-Watt – A metric that measures AI efficiency by calculating how many AI text tokens can be processed per unit of electrical power.
- Inference – The stage where a trained AI model produces outputs or predictions, such as generating text or answering questions.
- Training – The process where AI models learn patterns from large datasets to build a model capable of making predictions or generating outputs.
- HBM (High Bandwidth Memory) – A high-speed stacked memory technology used in advanced GPUs to handle the extremely large data flows required for AI computing.
- SRAM (Static Random Access Memory) – A very fast type of memory typically used for caches or specialized processors; faster but more expensive and lower in capacity than other memory types.
- Rack – A physical unit in a data center that holds multiple servers, GPUs, storage, and networking components in a standardized frame.
- Hyperscalers – Very large cloud computing companies that operate massive global data centers and purchase large amounts of AI hardware.
- DLSS (Deep Learning Super Sampling) – NVIDIA technology that uses AI to generate higher-resolution graphics from lower-resolution renders, improving game performance while maintaining image quality.
- NemoClaw – NVIDIA software designed to make OpenClaw AI systems secure and ready for enterprise deployment.
- CY (Calendar Year) – A financial reporting term referring to the period from January to December of a given year.
- Physical AI – Artificial intelligence systems used in real-world machines such as robots, autonomous vehicles, or industrial equipment.
------------------------------
Carlos Salas
Portfolio Manager & Freelance Investment Research Consultant
------------------------------