Technology and Innovation Community

GenAI: Inference Era is Here

    Posted 4 days ago

    Please see below some insights on inference as a new secular trend within the GenAI megatrend, along with a table summarizing key players. I hope this is helpful; do not hesitate to comment or provide any corrections.

    The New Inference Era
    The AI value chain is undergoing a structural rotation from training-centric compute (2023–2025) to inference-dominant workloads (2025–2028). This is not a cyclical slowdown but a change in the economic center of gravity.
    What just ended (Training Era):
    • AI economics were dominated by one-off, capital-intensive training runs.
    • Performance mattered more than cost: Nvidia's general-purpose GPUs (H100) won decisively.
    • Result: Nvidia data center revenue exploded ($15B → $115B in two years).
    • Ecosystem beneficiaries: TSMC (fabs), SK Hynix (HBM), advanced packaging, power & cooling.
    • But training is episodic, not linear. Each new model is a discrete CapEx cycle with diminishing marginal gains due to data saturation.
    What's starting now (Inference Era):
    • Inference is now the majority of workloads, driven by:
      • Agentic workflows (one task → hundreds/thousands of model calls)
      • Continuous, always-on AI usage instead of single queries
    • Inference economics are brutally price-sensitive:
      • $0.01 vs $0.001 per query = 10× annual cost difference at scale
      • Switching costs are minimal → commoditization risk
    • This favours:
      • Specialized silicon (ASICs, TPUs, LPUs) over expensive general-purpose GPUs.
      • Distributed, latency-optimized infrastructure (edge + on-prem).
    • Nvidia doesn't disappear, but it faces a margin and expectations reset as the "Nvidia tax" becomes untenable at inference scale.
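    To make the per-query price sensitivity above concrete, here is a minimal Python sketch of the arithmetic. The workload figures (1,000 model calls per agentic task, 10,000 tasks per day) and the helper name annual_inference_cost are illustrative assumptions, not data from this post; only the two per-call prices come from the bullets above.

```python
from decimal import Decimal


def annual_inference_cost(price_per_call, calls_per_task, tasks_per_day, days_per_year=365):
    """Annual spend in dollars for a workload billed per model call.

    price_per_call is passed as a string so Decimal keeps the cents exact.
    """
    return Decimal(price_per_call) * calls_per_task * tasks_per_day * days_per_year


# Hypothetical workload, chosen only for illustration.
CALLS_PER_TASK = 1_000   # "one task -> hundreds/thousands of model calls"
TASKS_PER_DAY = 10_000   # assumed enterprise-scale volume

expensive = annual_inference_cost("0.01", CALLS_PER_TASK, TASKS_PER_DAY)
cheap = annual_inference_cost("0.001", CALLS_PER_TASK, TASKS_PER_DAY)

print(f"At $0.01 per call:  ${expensive:,.0f} per year")
print(f"At $0.001 per call: ${cheap:,.0f} per year")
print(f"Cost ratio: {expensive / cheap:.0f}x, absolute gap: ${expensive - cheap:,.0f} per year")
```

    The ratio stays at 10× whatever the volume; what scale changes is the absolute gap in dollars, which is why a per-call price difference that looks trivial for single queries becomes a deciding factor for agentic, always-on workloads.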
    Capital flows rotate; they don't disappear
    Training-era profits don't vanish; they redistribute:
    • From GPUs → custom silicon designers
    • From centralized clusters → packaging, optics, cooling, power, and edge infrastructure
    • From variable renewables → baseload nuclear and gas



    ------------------------------
    Carlos Salas
    Portfolio Manager & Freelance Investment Research Consultant
    ------------------------------