GPUs & AI Accelerators — Demand vs Supply (V2)

Snapshot — the group at a glance

This group makes the engines of artificial intelligence: GPUs (graphics processing units, chips originally built for video games that turned out to be ideal for the parallel math behind neural networks) and AI accelerators (custom chips, also called ASICs, or application-specific integrated circuits, purpose-built for one job: training and running AI models). The unit of demand is raw compute, measured in FLOPS (floating-point operations per second: how many math calculations a chip can do each second). Every chatbot answer, every training run, every AI agent ultimately rents time on one of these chips. The designers are NVIDIA (NVDA) and AMD (AMD) for merchant GPUs, Broadcom (AVGO) and Marvell (MRVL) for custom accelerators, plus the in-house chips of Google (TPU, the Tensor Processing Unit) and Amazon (Trainium). None of them runs its own factory: they design the chips and pay TSMC (Taiwan Semiconductor Manufacturing Company, the contract chip-maker) to manufacture them.

Metric	Value
Rough 2026 annual spend on AI data-center accelerators (approximate, not live-verified)	~$400B+ est.
Approximate annual growth rate of the accelerator market (not live-verified)	~40-60% est.
Merchant chip designers at scale (NVDA, AMD, AVGO, MRVL; Intel trailing)	~4-5
The two physical bottlenecks that cap how many chips can be built	CoWoS + HBM
Lead times; next-gen parts reported sold out years ahead	6-12 months

On the signals in the grounding material, demand currently exceeds supply: chips are reported sold out years in advance and lead times run 6-12 months. Supply is limited not by money but by two physical chokepoints — TSMC's advanced packaging (CoWoS) and high-bandwidth memory (HBM) — which take years and rare know-how to expand. In money terms, the market currently pays a large multiple of current sales for the leaders (commonly high-single-digits to ~20x revenue est.); arithmetically, a price set at that multiple already prices in years of continued fast growth. Whether the demand-over-supply gap stays open long enough to justify that price is for the reader to judge — this sheet states the facts and the arithmetic, not a verdict.

The product & how money is made

The product is a finished accelerator package: a logic die (the compute brains) bonded together with several stacks of HBM (high-bandwidth memory — fast memory placed right next to the chip so data doesn't starve the compute cores) on top of a silicon interposer (a base layer that wires the pieces together), then assembled using advanced packaging. One modern training chip is itself a small assembly of separately manufactured parts.

The unit sold is the chip — but the way it is actually consumed is as compute: FLOPs delivered over time. Buyers care about performance per chip, performance per watt of electricity, and performance per dollar, because a data center is constrained by power and budget as much as by chip count.
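The three ratios buyers compare can be sketched in a few lines of Python. This is a minimal illustration only: the chip labels and every number below are hypothetical placeholders, not vendor specifications.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Accelerator:
    name: str          # hypothetical label, not a real product
    pflops: float      # assumed peak throughput, in petaFLOPS
    watts: float       # assumed power draw per chip, in watts
    price_usd: float   # assumed purchase price per chip, in dollars

    def perf_per_watt(self) -> float:
        """TeraFLOPS delivered per watt of electricity."""
        return self.pflops * 1000 / self.watts

    def perf_per_dollar(self) -> float:
        """TeraFLOPS delivered per dollar of purchase price."""
        return self.pflops * 1000 / self.price_usd


def chips_that_fit(power_budget_w: float, cash_budget_usd: float,
                   chip: Accelerator) -> int:
    """Chip count allowed by whichever budget (power or cash) runs out first."""
    by_power = int(power_budget_w // chip.watts)
    by_cash = int(cash_budget_usd // chip.price_usd)
    return min(by_power, by_cash)


# Two made-up chips: one faster per chip, one more efficient per watt.
chip_a = Accelerator("chip_a", pflops=2.0, watts=1000.0, price_usd=40_000.0)
chip_b = Accelerator("chip_b", pflops=1.2, watts=500.0, price_usd=30_000.0)

for chip in (chip_a, chip_b):
    count = chips_that_fit(1_000_000.0, 30_000_000.0, chip)
    print(chip.name, chip.perf_per_watt(), chip.perf_per_dollar(), count)
```

With these made-up inputs the cash budget, not the power budget, caps the chip count for both parts, which is the sense in which a data center is constrained by budget as much as by chip availability.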

How the money is made, in plain cash terms:

NVIDIA / AMD: design the chip, pay TSMC to build it, sell the finished part (increasingly the whole server rack and networking around it) at a markup. Cash in = price per chip times units; cash out = wafer and packaging costs paid to TSMC, memory bought from the HBM makers, plus heavy R&D (research and development) to design next year's part.
Broadcom / Marvell: do not sell their own branded GPU. They are hired by big cloud companies (Google, Amazon, Meta, others) to co-design a custom accelerator for that customer. They earn engineering fees up front plus a margin on every chip the customer then buys. Lower price per chip than NVIDIA, but multi-year programs that are harder for a customer to switch away from.
The moat that turns into cash: NVIDIA's is CUDA, its software platform (the programming toolkit that most AI developers already know). Because rewriting software for another chip is costly and slow, NVIDIA can charge premium prices. Custom-ASIC vendors compete on lower total cost for a single customer's known, repeated workload.
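The two cash models above can be written out as simple arithmetic. The sketch below is illustrative only: the function names are invented for this example and every input (unit counts, prices, costs, fees) is a hypothetical round number, not a figure from any company's filings.

```python
def merchant_cash(units: int, price: float, wafer_and_packaging: float,
                  hbm: float, rnd: float) -> float:
    """Merchant GPU model: cash in = price x units; cash out = per-chip
    manufacturing and memory costs plus a fixed R&D bill."""
    cash_in = units * price
    cash_out = units * (wafer_and_packaging + hbm) + rnd
    return cash_in - cash_out


def gross_margin(price: float, wafer_and_packaging: float, hbm: float) -> float:
    """Share of each chip's price left after paying TSMC and the HBM maker."""
    return (price - wafer_and_packaging - hbm) / price


def custom_asic_cash(engineering_fee: float, units: int,
                     margin_per_chip: float) -> float:
    """Custom-ASIC model: an up-front engineering fee plus a margin on
    every chip the customer goes on to buy."""
    return engineering_fee + units * margin_per_chip


# Hypothetical inputs for illustration only.
print(merchant_cash(1_000_000, 30_000, 6_000, 4_000, 5_000_000_000))
print(round(gross_margin(30_000, 6_000, 4_000), 3))
print(custom_asic_cash(500_000_000, 2_000_000, 3_000))
```

The shape, not the numbers, is the point: the merchant model earns a large spread per chip but carries the R&D bill itself, while the custom model earns less per chip but collects part of its cash before any chip ships.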

Demand — how much the world will want this

Where demand is today (grounded): The scan describes accelerator silicon as "THE bottleneck" of the AI build-out: "every dollar of AI capex starts with accelerator silicon" (capex = capital expenditure, the money companies spend on long-lived equipment like chips and buildings). NVIDIA's Blackwell and next-generation Rubin parts are described as "sold out years ahead." Shipments are currently limited by supply, not by demand: buyers would take more chips than can be made.

Source: 500-stocks scan, "GPU & AI Accelerators" sub-section (/Users/ravf/projects/work/.claude/worktrees/sector-hub/research/investments/500-stocks/02-semiconductors.html).

Who the buyers are: a small number of very large spenders — the big cloud platforms (Microsoft, Google, Amazon, Meta, Oracle), a wave of "neoclouds" (GPU-rental specialists like CoreWeave that buy chips and rent out compute by the hour), AI labs (OpenAI, Anthropic and others, usually buying through a cloud), plus governments and large enterprises. This concentration matters: a handful of customers' capex budgets drive most of the demand, so the group's revenue is geared to a few balance sheets.

Forward demand (forecast — AGI lens): Reasoning from the premise that AGI is arriving, demand has two compounding legs. Training demand (the compute used to build a model) rises because each model generation has tended to need roughly an order of magnitude more compute, and recursive self-improvement (AI helping design and train the next AI) could multiply this further. Inference demand (the compute needed to actually run finished models for billions of users and autonomous agents) eventually exceeds training in this view, because it scales with every query and every agent action, not just with the occasional training run. The scan calls inference "the recurring revenue layer of the AGI stack." This paragraph is a forecast, not a contracted fact.

✓ VERIFIED — the following figures were confirmed from primary sources after initial publication:

NVIDIA data-center revenue hit $75.2 billion in Q1 FY2027 (quarter ended April 2026), up 92% year-over-year; total revenue $81.6B, Q2 guidance $91B (NVIDIA Q1 FY27 press release, May 28 2026)
The four largest US hyperscalers (Amazon, Microsoft, Alphabet, Meta) spent a combined ~$410B in 2025 and are projected to spend ~$715B combined in 2026 — a ~74% year-over-year increase (OfficeChai, citing company guidance, 2026)
Gartner forecasts worldwide data-center systems spending at $788B in 2026, up 55.8% year-over-year — the fastest-growing IT segment, driven by AI infrastructure (Gartner via TechEdge AI, May 20 2026)
GPU cloud pricing (May 2026): H100 on-demand $1.50–$3.50/hr (neocloud) vs $6–12/hr (hyperscaler); H200 $2.60–$4.50/hr; B200 $5–6/hr; B300 $6.80/hr; neocloud pricing 3–6× cheaper than hyperscaler on-demand; reserved contracts offer 20–40% discounts (Spheron GPU Cloud Pricing, May 2026)
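The arithmetic behind these figures can be checked with a short Python sketch. It uses only the numbers quoted in the list above; the helper names are invented for this example, and the price-ratio helper simply reports the widest possible spread between two quoted ranges, which is one of several reasonable ways to compare them.

```python
def yoy_growth(previous: float, current: float) -> float:
    """Year-over-year growth as a fraction (0.74 means +74%)."""
    return (current - previous) / previous


def implied_prior(current: float, growth: float) -> float:
    """Prior-period value implied by a current value and its growth rate."""
    return current / (1 + growth)


def price_ratio_range(cheap: tuple, expensive: tuple) -> tuple:
    """Narrowest and widest ratio between two (low, high) price ranges."""
    narrowest = expensive[0] / cheap[1]
    widest = expensive[1] / cheap[0]
    return narrowest, widest


def reserved_price(on_demand: float, discount: float) -> float:
    """Hourly price after a reserved-contract discount."""
    return on_demand * (1 - discount)


# Hyperscaler capex: ~$410B (2025) to ~$715B (2026), in $B.
print(round(yoy_growth(410, 715), 3))          # 0.744, i.e. the ~74% quoted

# Gartner: $788B in 2026, up 55.8%, implies the 2025 base in $B.
print(round(implied_prior(788, 0.558), 1))

# H100 on-demand: neocloud $1.50-$3.50/hr vs hyperscaler $6-$12/hr.
print(price_ratio_range((1.50, 3.50), (6.0, 12.0)))

# A 20-40% reserved discount applied to a $3.50/hr on-demand rate.
print(reserved_price(3.50, 0.20), reserved_price(3.50, 0.40))
```

The quoted "3-6x cheaper" figure sits inside the range this produces for the H100; the exact multiple depends on which points of the two price ranges are compared.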

Remaining caveat: some market-size and growth-rate figures not listed above are directional estimates from general knowledge (model cutoff ~early 2026), not live-verified. Company-specific financials in the Players table are from the most recent public filings or earnings. For SEC-verified deep dives on individual companies, see Stock Reports.

Source: AGI demand logic and "inference dwarfs training" framing from the 500-stocks scan (GPU and AI-Inference sub-sections). Market-size and growth figures: general knowledge, not live-verified est.

Supply — how much can be made, and what limits it

The binding constraint in the grounding material is not chip design or money — it is two physical inputs:

CoWoS advanced packaging (Chip-on-Wafer-on-Substrate — TSMC's method for bonding the logic die and HBM stacks onto one interposer). The scan calls this "the #1 supply constraint in the AI chip industry": TSMC is "tripling capacity but still cannot keep up," and "every B200/B300 and competitor chip needs advanced packaging." This is described as the tightest valve in the pipeline.
HBM memory. Every training GPU needs roughly 4-8 HBM stacks. The scan notes HBM demand "growing 100%+ annually" est. and HBM commanding "~5x the ASP per bit vs commodity DRAM" est. (ASP = average selling price; DRAM = ordinary computer memory). HBM is made by only three companies worldwide (SK Hynix, Samsung, Micron), so it is a chokepoint outside the chip designers' control.

Source: 500-stocks scan, "Advanced Packaging (CoWoS)" and "Memory: HBM, DRAM & NAND" sub-sections. HBM growth/ASP magnitudes flagged est. (quoted from the scan, not live-verified).

Why supply can't just be bought: leading-edge fab capacity (fab = chip fabrication plant), CoWoS lines, and HBM production all take years to build and depend on scarce engineering know-how and a small set of equipment suppliers. Capital is available; the physical capacity and skills are the limiting factor, which is why lead times stay at 6-12 months even with heavy spending across the industry.
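One way to see why extra money does not add chips is to treat output as the minimum of its physical inputs. The sketch below is a toy model under stated assumptions: all capacity figures are hypothetical, and the default of 8 HBM stacks per chip is simply the top of the "roughly 4-8" range given above.

```python
def buildable_accelerators(logic_dies: int, cowos_slots: int,
                           hbm_stacks: int, stacks_per_chip: int = 8) -> int:
    """Finished accelerators are capped by the scarcest input."""
    complete_hbm_sets = hbm_stacks // stacks_per_chip
    return min(logic_dies, cowos_slots, complete_hbm_sets)


def binding_input(logic_dies: int, cowos_slots: int,
                  hbm_stacks: int, stacks_per_chip: int = 8) -> str:
    """Name of the input that currently sets the cap."""
    limits = {
        "logic dies": logic_dies,
        "CoWoS packaging": cowos_slots,
        "HBM": hbm_stacks // stacks_per_chip,
    }
    return min(limits, key=limits.get)


# Hypothetical annual capacities, for illustration only.
base = dict(logic_dies=10_000_000, cowos_slots=6_000_000, hbm_stacks=40_000_000)
print(buildable_accelerators(**base), binding_input(**base))

# Doubling logic dies changes nothing while HBM is the cap.
more_dies = dict(base, logic_dies=20_000_000)
print(buildable_accelerators(**more_dies))

# Doubling HBM lifts output only until CoWoS becomes the cap.
more_hbm = dict(base, hbm_stacks=80_000_000)
print(buildable_accelerators(**more_hbm), binding_input(**more_hbm))
```

In this toy model, relieving one bottleneck only moves the cap to the next one, which is why both CoWoS and HBM expansions matter for when supply can catch up.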

Market-share structure (who controls supply):

NVIDIA: roughly 80-90% est. of the merchant AI-accelerator market (merchant = chips sold to outside buyers, as opposed to a company's own in-house parts); effectively sets the pace.
AMD: the #2 merchant GPU by most accounts, with a low-but-growing share est.; the main alternative to NVIDIA.
Broadcom + Marvell: together account for most of the custom-ASIC slice (Google TPU, Amazon Trainium, others), described as the fastest-growing non-NVIDIA path est.
Intel: present but trailing in AI accelerators; not a share leader here est.
Upstream choke: a single foundry (TSMC) and three HBM makers gate physical supply for almost everyone above.

Source: company line-up from the 500-stocks scan; share percentages from general knowledge, not live-verified est.

The gap — demand vs supply

On the observable signals in the grounding material, the product is in short supply (demand exceeds supply): lead times of 6-12 months, next-generation parts reported sold out years ahead, CoWoS and HBM capacity expanding fast yet still reported unable to keep up, and pricing power described as strong (HBM alone reported at ~5x commodity memory ASP est.). These signals are the opposite of a market with spare capacity.

Signal	What it shows	Read
Lead times	6-12 months	Short (demand > supply)
Forward bookings	Blackwell / Rubin reported sold out years ahead	Short
CoWoS packaging	Tripling capacity, reported still unable to keep up	Short (tightest valve)
HBM memory	Demand +100%/yr est., ~5x ASP premium est.	Short
Pricing trend	Premium pricing reported holding	Short

Source: 500-stocks scan GPU, CoWoS and HBM sub-sections; HBM growth/ASP figures flagged est.

When could it flip to oversupply (forecast): the gap closes if either side moves. Supply side: if TSMC's CoWoS expansion and the three HBM makers' multi-year expansions all come online together, raw capacity could overshoot during a demand wobble. Demand side: the risk is concentration. Most demand comes from a few hyperscalers' (the largest cloud operators') capex budgets, so a pause in their spending (a "digestion" period after over-ordering) could create a glut quickly, as has happened in prior chip cycles. Reasoning from AGI arriving, the structural demand trend points up for years; on that view the near-term flip risk is a cyclical air-pocket from over-ordering rather than a permanent end of demand. Both timing and magnitude are unknown; this is a forecast, not a contracted fact.
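The two flip paths can be illustrated with a toy compounding model. Everything here is an assumption made for illustration: the starting index levels (demand 100, supply 70) and the growth rates are hypothetical and are not forecasts from the scan.

```python
from typing import Optional, Sequence


def first_oversupply_year(demand: float, supply: float,
                          demand_growth: Sequence[float],
                          supply_growth: Sequence[float]) -> Optional[int]:
    """Return the first year in which supply meets or exceeds demand,
    or None if that never happens within the years provided."""
    for year, (dg, sg) in enumerate(zip(demand_growth, supply_growth), start=1):
        demand *= 1 + dg
        supply *= 1 + sg
        if supply >= demand:
            return year
    return None


YEARS = 5

# Supply-side path: capacity compounds faster than demand (hypothetical rates).
steady = first_oversupply_year(100.0, 70.0, [0.40] * YEARS, [0.60] * YEARS)

# Demand-side path: one "digestion" year of flat demand, then growth resumes.
digestion = first_oversupply_year(100.0, 70.0,
                                  [0.0] + [0.40] * (YEARS - 1), [0.60] * YEARS)

# No flip: supply and demand grow at the same rate, so the gap persists.
no_flip = first_oversupply_year(100.0, 70.0, [0.40] * YEARS, [0.40] * YEARS)

print(steady, digestion, no_flip)
```

Under these made-up inputs a single flat year of demand brings the flip forward by two years, which is the sense in which a spending pause by a few large buyers can matter more than the pace of capacity expansion.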

The players — who captures the money

Company	What it makes here	Exposure to this product	Rough size est.	Position / edge
NVIDIA (NVDA)	Merchant GPUs + full racks + networking	Dominant; data-center compute is the large majority of revenue	Multi-trillion mkt cap est.	~80-90% share est.; CUDA software lock-in
AMD (AMD)	Merchant GPUs (MI-series) + CPUs	Mixed; AI GPU is a fast-growing minority of revenue	Hundreds of $B est.	#2 merchant GPU; the main alternative
Broadcom (AVGO)	Custom AI ASICs + networking chips	Diversified; AI is a large and growing slice, not the whole	~$1T+ mkt cap est.	Co-designs Google/others' custom chips; networking strength
Marvell (MRVL)	Custom AI ASICs + data-center connectivity	More concentrated on data center than AVGO; AI is a major driver	Tens of $B est.	#2 in custom ASIC; Amazon Trainium and others
Intel (INTC)	CPUs; AI accelerators (Gaudi) trailing	Small slice of AI accelerator demand	~$100B est.	Not a share leader in AI compute
Alphabet (GOOGL), Amazon (AMZN)	In-house chips (TPU, Trainium)	Tiny % of their revenue; built to cut their own NVIDIA bill	Multi-trillion est.	Buyers who became makers; demand pull-back risk for merchants

Source: company list from the 500-stocks scan GPU sub-section; sizes, shares and revenue-mix all general knowledge, not live-verified est.

The price of exposure

In plain money terms, here is what an owner pays today for a claim on the future demand-over-supply gap, stated as arithmetic rather than a judgment:

What $1 of current sales costs: for the leaders, the market value is commonly a large multiple of this year's revenue, roughly high-single-digits to ~20x sales est. depending on the name and the day. Arithmetically, that means somewhere between just under $10 and about $20 of market value per $1 of revenue the company books this year. A multiple at that level only "pays back" in revenue terms if sales keep growing for several years; if revenue merely held flat, the buyer would be paying up front for as many years of revenue as the multiple itself (roughly 8 to 20). Whether that is worth it is the reader's call.
Money-in / money-out shape — capital-light, not capital-heavy: the chip designers here are fabless (they own no factories). The large capex — fabs, CoWoS lines, HBM plants — is borne by TSMC and the memory makers, not by NVIDIA, AMD, AVGO or MRVL. As a result these designers convert sales into owner cash at high rates: high gross margins, modest capital needs, and meaningful free cash flow (cash left after running and reinvesting in the business). In cash terms they are generators rather than sinks — the opposite of the data-center and power layers of the AI stack.
The trade-off this creates: a high sales multiple combined with capital-light economics means most of the market value rests on the forecast, not on assets that could be sold off. There is little tangible book value (the net value of physical assets) to anchor the price; the price reflects the expectation that the gap stays open. If demand has a cyclical air-pocket, the cash engine can stay intact while the multiple compresses — these names have had 60%+ peak-to-trough drawdowns before (per the project's mega-cap crash-readiness notes on NVDA). These are stated as facts and arithmetic; the reader weighs them.
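The multiple arithmetic above can be made concrete with a small Python sketch. It is a simplification under stated assumptions: it counts revenue rather than profit (matching the "in revenue terms" framing above), the growth rates are hypothetical, and the function names are invented for this example.

```python
from typing import Optional


def years_of_revenue_to_cover(multiple: float, growth: float,
                              max_years: int = 100) -> Optional[int]:
    """Years of cumulative revenue needed to equal the market value,
    where market value = multiple x this year's revenue."""
    revenue = 1.0        # this year's revenue, normalised to $1
    cumulative = 0.0
    for year in range(1, max_years + 1):
        cumulative += revenue
        if cumulative >= multiple:
            return year
        revenue *= 1 + growth
    return None


def value_change(revenue_growth: float, old_multiple: float,
                 new_multiple: float) -> float:
    """Fractional change in market value when revenue and the multiple both move."""
    return (1 + revenue_growth) * new_multiple / old_multiple - 1


# Flat revenue: a 20x multiple is 20 years of sales paid up front.
print(years_of_revenue_to_cover(20, 0.0))

# Hypothetical 40% annual growth shortens that considerably.
print(years_of_revenue_to_cover(20, 0.40), years_of_revenue_to_cover(10, 0.40))

# Multiple compression: revenue flat, multiple falls from 20x to 8x.
print(round(value_change(0.0, 20, 8), 2))
```

The last line shows how a 60% drawdown can come from the multiple alone, with revenue unchanged: the cash engine stays intact while the price falls.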

Source: fabless / cash-generation structure is well-known filing fact; sales-multiple ranges and crash-drawdown note are general knowledge / project notes, not live-verified est.

What to deep-dive next

Factual pointers for where a company-level deep-dive would be most informative, by type of exposure (this is a pointer to where the information is richest, not a recommendation):

Purest plays on this product: NVDA and AMD. Merchant GPU revenue is the dominant (NVDA) or fastest-growing (AMD) part of the story, so each stock is highly sensitive to accelerator demand and pricing. NVDA is the most direct read on the demand-over-supply gap and on CUDA lock-in; AMD is the most direct read on whether a #2 can take share.
Most supply-relevant / structurally informative: the upstream chokepoints — TSMC (CoWoS) and the three HBM makers (SK Hynix, Samsung, Micron). They are not in this group's ticker list but they gate everyone's supply; understanding them is what tells you when the gap could close. Within the group, AVGO and MRVL are the read on the custom-ASIC alternative that hyperscalers are funding to reduce their NVIDIA spend.
Diversified businesses where this is one slice: AVGO (AI is large but sits alongside broad semis and software), Intel (AI accelerator is a small, trailing slice), and the hyperscalers GOOGL / AMZN (their in-house chips are a small share of revenue but a real demand pull-back risk to the merchants).

Sources & confidence

Primary grounding (hard, dated to the scan): 500-stocks semiconductor scan, "GPU & AI Accelerators," "AI Inference & Edge Chips," "Advanced Packaging (CoWoS)" and "Memory: HBM, DRAM & NAND" sub-sections — file: /Users/ravf/projects/work/.claude/worktrees/sector-hub/research/investments/500-stocks/02-semiconductors.html. Used for: company line-up, the CoWoS/HBM bottleneck, lead times, "sold out years ahead," HBM growth/ASP, and the AGI demand framing.
Prior report: the requested "AI Chips Deep Dive" at /Users/ravf/projects/work/.claude/worktrees/sector-hub/research/investments/reports/research/industries/ai-chips-deep-dive.html does not exist on disk or in the Google Drive research folder, so no V1 numbers could be folded in. If that file is restored later, its dated figures should be merged here.
Well-known filing facts (hard, but re-confirm before deciding): the fabless structure of NVDA/AMD/AVGO/MRVL and their high-margin, high-free-cash-flow profile; NVIDIA's CUDA platform; the existence of Google TPU and Amazon Trainium custom programs.
General knowledge, NOT live-verified (every est. tag): total accelerator market size (~$400B+), growth rate (~40-60%), all market-share percentages (NVDA ~80-90%, etc.), the HBM ~100%/yr growth and ~5x ASP magnitudes, the high-single-digits to ~20x sales multiple range, and company market caps. Live web retrieval was unavailable when this was written; cutoff is roughly early 2026. Treat these as approximate.
Plainly which is which: HARD = the bottleneck mechanics, the reported lead times and sold-out status, and the cash-generative fabless economics. APPROXIMATE / NOT LIVE-VERIFIED = every dollar figure, growth rate, share percentage, valuation multiple, and market cap tagged est. (the figures listed under VERIFIED in the Demand section are the exception).

Source: as listed above. Forecasts (forward demand, oversupply-flip timing) are explicitly labelled forecasts in the Demand and Gap sections.