How switches work, Ethernet vs InfiniBand, NVIDIA's vertical integration threat, and what AGI means for this industry. | 2026-04-16
When thousands of GPUs train an AI model, they need to constantly exchange data with each other. The network connecting them is often the bottleneck — GPUs sit idle waiting for data to arrive. This makes networking one of the most critical (and expensive) components of AI infrastructure. The central investment question: as AI scales, how fast does the networking market grow, and who captures the value? Arista (ANET), Broadcom (AVGO), and NVIDIA (NVDA) are all competing for this market with very different strategies.
Key stocks: ANET (Arista — Ethernet switch vendor), AVGO (Broadcom — switch ASICs), NVDA (InfiniBand + Spectrum-X), LITE/COHR (optical transceivers that plug into switches), MRVL (custom networking ASICs).
A network switch is a physical box that connects computers together so they can send data to each other. Think of it like a postal sorting facility: data packets arrive at the switch, the switch reads the destination address on each packet, and forwards it out the correct port to reach the right computer.
Without switches, you would need a separate cable from every computer to every other computer. With 1,000 computers, that's ~500,000 cables. A switch lets you plug all 1,000 computers into a relatively small number of switches, and the switches figure out how to route the data.
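As a sanity check on the cable count, the pairwise-connection formula in a throwaway sketch:

```python
def full_mesh_cables(n: int) -> int:
    """Direct cables needed to connect every pair of n computers: n*(n-1)/2."""
    return n * (n - 1) // 2

print(full_mesh_cables(1_000))  # 499500 — roughly the ~500,000 quoted above
```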
The physical object:
A data center switch looks like a flat metal box, about the size of a pizza box, that sits in a standard server rack. The front has rows of ports — typically 32 to 64 — where fiber optic cables plug in. Each port can handle 100, 400, or 800 gigabits per second. A single high-end switch can move over 50 terabits per second of total throughput — roughly the equivalent of streaming two million 4K movies simultaneously (at ~25 Mbps each).
Packet forwarding itself happens entirely in hardware, on the ASIC. Software handles the control plane (building the forwarding tables, managing the switch), but the per-packet data path is pure hardware — that's why it's so fast.
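To make the hardware/software split concrete, here is a toy Python model of the forwarding logic the ASIC implements in silicon. The class and MAC addresses are illustrative; a real ASIC performs this lookup in nanoseconds per packet, in hardware:

```python
class ToySwitch:
    """Software model of a learning Ethernet switch.
    The real data path runs in ASIC hardware at line rate;
    this just shows the forwarding logic."""

    def __init__(self, num_ports: int):
        self.num_ports = num_ports
        self.mac_table: dict[str, int] = {}  # MAC address -> port (the forwarding table)

    def receive(self, src_mac: str, dst_mac: str, in_port: int) -> list[int]:
        # Learning step: remember which port the source MAC lives on.
        self.mac_table[src_mac] = in_port
        # Forwarding step: known destination -> send out exactly one port;
        # unknown destination -> flood out every other port.
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        return [p for p in range(self.num_ports) if p != in_port]

sw = ToySwitch(num_ports=4)
print(sw.receive("aa:aa", "bb:bb", in_port=0))  # [1, 2, 3] — bb:bb unknown, flood
print(sw.receive("bb:bb", "aa:aa", in_port=2))  # [0] — aa:aa was learned on port 0
```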
A modern data center might have 100,000+ servers. You can't plug them all into one switch (the biggest switches have ~64 ports). So data centers use a layered architecture called leaf-spine: every server plugs into a leaf switch (typically at the top of its rack), and every leaf connects to every spine switch, so any two servers are at most two switch hops apart.
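A back-of-envelope sizing sketch for a two-tier leaf-spine fabric, assuming a simplified non-blocking design where each leaf splits its ports half down to servers and half up to spines (the function and numbers are illustrative, not a vendor formula):

```python
def leaf_spine_sizing(servers: int, ports_per_switch: int = 64):
    """Rough two-tier leaf-spine sizing at 1:1 (non-blocking) oversubscription:
    each leaf uses half its ports for servers and half for spine uplinks."""
    down = up = ports_per_switch // 2      # 32 down, 32 up on a 64-port leaf
    leaves = -(-servers // down)           # ceiling division
    spines = up                            # one uplink from each leaf to each spine
    max_servers = down * ports_per_switch  # (p/2) downlinks * p leaves = p^2/2 hosts
    return leaves, spines, max_servers

leaves, spines, cap = leaf_spine_sizing(2_000)
print(leaves, spines, cap)  # 63 32 2048 — one pod of 64-port switches tops out at 2,048 servers
```

This is why very large data centers add a third tier (a "super-spine") — a two-tier fabric of 64-port switches caps out around two thousand servers per pod.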
North-South Traffic
Data entering or leaving the data center — user requests coming in from the internet, responses going back out. This is what traditional web services generate.
East-West Traffic
Data moving between servers inside the data center. This is what AI training generates — thousands of GPUs constantly exchanging gradient updates. In AI clusters, east-west traffic dominates: 80-90%+ of all data stays inside the cluster.
This distinction matters because AI is dramatically shifting the traffic pattern. Traditional web workloads (serving search results, loading Instagram) are mostly north-south. AI training is almost entirely east-west. The more east-west traffic there is, the more spine switches you need, the higher-bandwidth your links need to be, and the more money gets spent on networking.
A natural question: is a switch a hardware product or a software product? The answer: it's a hardware box, but the value increasingly lives in the software.
| Component | What It Does | Who Makes It | Cost Share |
|---|---|---|---|
| Switch ASIC | The brain. Reads packet headers, does forwarding lookups, makes switching decisions. All in hardware at line rate. | Broadcom (dominant), NVIDIA (Spectrum), Marvell, Intel | 30-40% |
| TCAM memory | Ternary Content-Addressable Memory. Stores forwarding/routing tables for ultra-fast lookups. Specialized memory that can search all entries simultaneously. | Various memory vendors | 10-15% |
| Packet buffer memory | Temporarily holds packets when output ports are congested. Important for handling traffic bursts. | HBM or SRAM vendors | 5-10% |
| Optical transceiver ports | The physical ports where fiber optic cables plug in. Convert between electrical signals (inside the switch) and light (on the fiber cable). | Lumentum (LITE), Coherent (COHR), InnoLight, etc. | 25-35% |
| PCB, chassis, fans, PSU | Circuit board, metal enclosure, cooling, power supply. Standard electronics manufacturing. | Contract manufacturers | 10-15% |
The switch ASIC handles the fast path (forwarding packets). But you also need software to build and maintain the forwarding tables, run routing protocols, configure the switch, monitor traffic, and expose APIs for automation.
This software is called the Network Operating System (NOS). It's where Arista's competitive advantage lives.
Arista EOS
Why hyperscalers prefer it: programmable, reliable, modern architecture.
Cisco IOS/NX-OS (legacy)
Why cloud customers left: complexity, bugs, vendor lock-in.
So is it hardware or software?
The switch is physically a hardware product — you buy a box. But the ASIC inside is a commodity that anyone can buy from Broadcom. What differentiates Arista from a generic "white box" switch is EOS — the software. This is why Arista's gross margin is ~64%, which is remarkably high for a hardware company and looks more like a software margin. The software is where the value and the moat live.
That said, you still need to design the hardware, manage the supply chain, qualify the optics, test the whole system. It's not pure software. The business model is: sell hardware boxes at healthy margins, provide ongoing software support and subscriptions. Increasingly, Arista is also selling software-only licenses (CloudVision, DANZ) that recur annually.
Ethernet is a networking protocol — a set of rules for how computers package and send data to each other. It was invented at Xerox PARC in 1973 and has been the dominant local networking standard for over 50 years.
Ethernet is defined by the IEEE 802.3 standard. It's an open standard — anyone can build Ethernet equipment, and products from different vendors work together. This is in stark contrast to InfiniBand (discussed below), which is effectively controlled by NVIDIA.
| Year | Speed | Name | Context |
|---|---|---|---|
| 1980 | 10 Mbps | Ethernet | Original Xerox/Intel/DEC specification |
| 1995 | 100 Mbps | Fast Ethernet | Enough for basic web browsing |
| 1999 | 1 Gbps | Gigabit Ethernet | Standard for home/office networks today |
| 2006 | 10 Gbps | 10 GigE | First generation of data center Ethernet |
| 2010 | 40 Gbps | 40 GigE | Data center spine links |
| 2015 | 25/100 Gbps | 25/100 GigE | Server-to-switch (25G), spine (100G) |
| 2018 | 200/400 Gbps | 200/400 GigE | AI-era data center networking |
| 2024 | 800 Gbps | 800 GigE | AI backend networks, now shipping |
| ~2026 | 1.6 Tbps | 1.6T Ethernet | Next generation, in development |
The key takeaway: Ethernet keeps doubling in speed every 3-4 years. It started as a 10 Mbps office protocol and is now running at 800 Gbps in AI data centers — an 80,000x increase in speed. This relentless pace of improvement is part of why Ethernet keeps winning: it's a moving target that's hard for alternatives to outrun.
Why Ethernet's ubiquity matters:
Every network engineer in the world knows Ethernet. Every switch, router, NIC, server, and operating system supports it. There are thousands of vendors, mature tooling, abundant talent, and decades of operational experience. This installed base and ecosystem is an enormous moat. For any competing technology (like InfiniBand), "being slightly better technically" is not enough — you have to be dramatically better to overcome the switching costs and ecosystem advantage of Ethernet.
InfiniBand is a different networking protocol, designed specifically for high-performance computing (HPC). It was created in the early 2000s by a consortium including Intel, IBM, Sun, and others. The original goal was to replace the PCI bus inside computers, but it evolved into a network interconnect for supercomputers.
The key company: Mellanox Technologies, an Israeli company that became the dominant InfiniBand vendor. NVIDIA acquired Mellanox in 2020 for $6.9 billion. This acquisition gave NVIDIA control over InfiniBand — the leading high-performance networking technology.
| Property | Ethernet | InfiniBand |
|---|---|---|
| Governance | Open standard (IEEE) | NVIDIA-controlled (IBTA) |
| Ecosystem | Thousands of vendors | Essentially NVIDIA only |
| Latency | ~1-2 microseconds | ~0.5-0.6 microseconds |
| RDMA support | RoCE v2 (add-on, not native) | Native, built-in from day one |
| Congestion management | ECN-based (reactive) | Credit-based (proactive) |
| Adaptive routing | Limited (ECMP hashing) | Built-in, dynamic load balancing |
| Price | Competitive (many vendors) | Premium (monopoly pricing) |
| Scalability | Proven at massive scale (100K+ nodes) | Historically limited to ~10K nodes |
| Interoperability | Multi-vendor | NVIDIA hardware only |
| Cost per port | Lower (commodity) | Higher (premium) |
RDMA (Remote Direct Memory Access) is the single most important technical concept in this whole debate. Here's what it means:
In normal networking, when Computer A wants to send data to Computer B, the application copies the data into an operating-system buffer, the kernel's TCP/IP stack processes it and hands it to the network card, and the receiving side runs the same steps in reverse — NIC to kernel buffer to application memory.
That's multiple memory copies and multiple context switches between application and operating system. Each one adds latency and burns CPU cycles.
With RDMA, the sending network card writes data directly into a registered region of the receiving application's memory.
Zero CPU involvement. Zero memory copies. Zero OS overhead. The network cards talk directly to application memory, bypassing the entire operating system. This dramatically reduces latency and frees the CPU to do other work.
Why RDMA matters for AI training:
During AI training, GPUs need to exchange gradient updates thousands of times per second. Each exchange involves reading a chunk of GPU memory, sending it across the network, and writing it into another GPU's memory. With RDMA, this happens directly — GPU memory to network to GPU memory — without bothering the CPU or operating system. This can reduce communication time by 30-50% compared to traditional TCP/IP networking. When you have 10,000 GPUs and communication time is the bottleneck, that difference is enormous.
InfiniBand has had RDMA built in since day one. Ethernet added it later as an extension called RoCE v2 (RDMA over Converged Ethernet). RoCE v2 works, but it requires careful network configuration (lossless Ethernet with Priority Flow Control) that traditional Ethernet doesn't need. It's an afterthought bolted onto Ethernet, whereas in InfiniBand, RDMA is the native mode of operation.
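A toy latency model of the difference — every per-operation cost below is an illustrative assumption, not a measurement:

```python
# Illustrative per-operation costs for a small message (assumptions, not benchmarks).
COPY_US = 0.4      # one memory copy, microseconds
SYSCALL_US = 1.0   # one user/kernel context switch, microseconds
WIRE_US = 0.5      # time on the wire

def tcp_path_us() -> float:
    # App->kernel copy and kernel->NIC copy on send, the reverse on receive,
    # plus a syscall on each side.
    return 4 * COPY_US + 2 * SYSCALL_US + WIRE_US

def rdma_path_us() -> float:
    # NICs read/write application memory directly: no copies, no syscalls.
    return WIRE_US

print(f"TCP-ish: {tcp_path_us():.1f} us, RDMA-ish: {rdma_path_us():.1f} us")
```

Under these made-up numbers the software overhead dwarfs the wire time — which is the whole point of RDMA.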
To understand why networking is so important for AI, you need to understand one operation: all-reduce.
The all-reduce step is a collective communication operation that combines (sums) gradient tensors across all GPUs, so that every GPU ends up with the same averaged result. In a cluster of N GPUs, the total data exchanged scales with the model size multiplied by the number of GPUs. For a model like GPT-4 with on the order of a trillion parameters, training on 25,000 GPUs, the all-reduce traffic might reach hundreds of terabytes per training iteration. And there are millions of iterations.
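The traffic claim can be sketched with the standard ring all-reduce factor of 2(N-1)/N; the model size and GPU count below are illustrative:

```python
def ring_allreduce_bytes(grad_bytes: float, n_gpus: int) -> tuple[float, float]:
    """Per-GPU and cluster-wide traffic for one ring all-reduce.
    Each GPU sends and receives ~2x the gradient size
    (the 2*(N-1)/N ring factor), regardless of cluster size."""
    per_gpu = 2 * (n_gpus - 1) / n_gpus * grad_bytes
    total = per_gpu * n_gpus
    return per_gpu, total

# Illustrative case: 70B parameters with fp16 gradients (~140 GB), 1,024 GPUs.
per_gpu, total = ring_allreduce_bytes(1.4e11, 1024)
print(f"per-GPU: {per_gpu/1e9:.0f} GB, cluster-wide: {total/1e12:.0f} TB per all-reduce")
```

Even this modest example moves hundreds of terabytes across the fabric per all-reduce; real frontier runs shard the work with hybrid parallelism, but the order of magnitude is the point.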
The network is the bottleneck
GPUs are incredibly fast at computation — a single H100 delivers on the order of 1,000-2,000 teraflops at the low precisions used for training. But that computation is useless if the GPU is sitting idle, waiting for gradient data to arrive from other GPUs. In large training runs, 30-50% of total training time can be spent waiting for network communication. Every microsecond of extra network latency, multiplied across millions of iterations and thousands of simultaneous transfers, compounds into days or weeks of extra training time and millions of dollars in electricity and GPU rental costs.
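A rough cost-of-idle calculation under assumed rental rates (all numbers illustrative):

```python
GPUS = 10_000
RATE_PER_GPU_HOUR = 2.00  # illustrative GPU rental rate, dollars
RUN_DAYS = 30
COMM_STALL = 0.35         # fraction of time spent waiting on the network (mid-range of 30-50%)

total_cost = GPUS * 24 * RUN_DAYS * RATE_PER_GPU_HOUR
wasted = total_cost * COMM_STALL
print(f"run cost ${total_cost/1e6:.1f}M, of which ${wasted/1e6:.1f}M is network wait")
```

At these assumed rates, a third of a $14M training run is spent renting GPUs that are waiting for packets — which is why shaving network latency is worth real money.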
This is why NVIDIA pushes InfiniBand and Spectrum-X so hard, and why hyperscalers spend billions on networking. The network isn't a commodity utility — it's a core performance lever for AI training.
| Cluster Size | Approximate Network Bandwidth Needed | Number of Switches | Network Cost |
|---|---|---|---|
| 1,000 GPUs | ~400 Tbps aggregate | ~50-80 | ~$20-40M |
| 10,000 GPUs | ~4 Pbps aggregate | ~500-800 | ~$200-400M |
| 100,000 GPUs | ~40 Pbps aggregate | ~5,000-8,000 | ~$2-4B |
Networking is typically 10-15% of the total cost of an AI cluster. For a $10B GPU cluster, that's $1-1.5B spent on switches, cables, and optics. This is a big market.
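The table's rows follow a simple linear rule of thumb — roughly $20-40K of networking per GPU — which a few lines reproduce:

```python
def network_cost_range(n_gpus: int, low_per_gpu: int = 20_000, high_per_gpu: int = 40_000):
    """Linear rule of thumb implied by the table: ~$20-40K of network per GPU."""
    return n_gpus * low_per_gpu, n_gpus * high_per_gpu

for n in (1_000, 10_000, 100_000):
    lo, hi = network_cost_range(n)
    print(f"{n:>7,} GPUs: ${lo/1e6:,.0f}M - ${hi/1e6:,.0f}M")
```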
This is the central competitive question in data center networking today. Two networking technologies are fighting to connect AI clusters:
Team InfiniBand (NVIDIA)
Upside: lowest latency, native RDMA, proven performance on NVIDIA GPU clusters, turnkey integration with the rest of NVIDIA's stack.
Downside: vendor lock-in to NVIDIA. Premium pricing. Limited scale.
Team Ethernet (Arista, Broadcom, the world)
Upside: open standard, multi-vendor competition, lower cost per port, proven at 100K+ node scale, vast tooling and talent ecosystem.
Downside: higher latency. RDMA is bolt-on (RoCE v2), not native. Congestion management is harder.
| Customer | AI Network Choice | Why |
|---|---|---|
| NVIDIA (DGX systems) | InfiniBand | They own it. Optimized for their GPUs. Maximum control. |
| Meta | Ethernet | Builds own network. Uses Arista switches + Broadcom ASICs. Open ecosystem. |
| Microsoft Azure | Both | InfiniBand for biggest AI clusters (NVIDIA DGX), Ethernet for everything else. |
| Google | Custom (Ethernet-based) | Builds own switches and TPU interconnect. Doesn't buy from NVIDIA or Arista. |
| Amazon AWS | Custom (EFA) | Elastic Fabric Adapter — proprietary, Ethernet-based. Nitro networking. |
| Oracle Cloud | InfiniBand | Differentiated on InfiniBand for HPC customers (RDMA). |
| xAI (Elon Musk) | Spectrum-X Ethernet | 100K GPU Colossus cluster runs on NVIDIA-supplied Spectrum-X Ethernet. |
| CoreWeave | InfiniBand | NVIDIA partner, uses NVIDIA's full stack including networking. |
The pattern: NVIDIA's own customers and partners buy NVIDIA networking (InfiniBand or Spectrum-X). The largest independent hyperscalers build on open Ethernet. The question is which camp grows faster.
The trend is shifting toward Ethernet for AI
Three forces are pushing the market toward Ethernet: (1) hyperscalers' refusal to hand NVIDIA control of the network on top of the GPU, (2) the Ultra Ethernet Consortium — Broadcom, Arista, AMD, Microsoft, Meta, and others — upgrading Ethernet's congestion control and RDMA support specifically for AI workloads, and (3) Ethernet's cost and ecosystem advantages at very large scale.
Bottom line: InfiniBand is technically superior for small-to-medium AI clusters. But Ethernet is "good enough" and getting better fast, and the world doesn't want NVIDIA to own the network too. The structural forces favor Ethernet long-term.
Here's where it gets interesting — and where the threat to Arista gets real.
NVIDIA saw the Ethernet trend coming. Their response: if the world is going to use Ethernet for AI instead of InfiniBand, NVIDIA will make its own Ethernet networking product. That product is Spectrum-X.
Spectrum-X is NVIDIA's complete Ethernet networking platform for AI, consisting of three components:
| Component | What It Is | What It Replaces |
|---|---|---|
| Spectrum-4 switch ASIC | NVIDIA's own 51.2 Tbps Ethernet switch chip. Competitive with Broadcom's Tomahawk 5. | Broadcom Tomahawk (used by Arista) |
| BlueField-3 DPU | Data Processing Unit — a smart NIC that offloads networking, security, and storage from the CPU. Sits in each server. | Standard NICs (Mellanox ConnectX) |
| NVIDIA networking software | AI-optimized congestion control, adaptive routing, and telemetry. Designed specifically for GPU-to-GPU communication patterns. | Arista EOS / standard Ethernet software |
NVIDIA's pitch: "Spectrum-X delivers 1.6x the effective AI performance of traditional Ethernet at the same cost." They claim this by optimizing the entire stack — switch ASIC, NIC, and software — specifically for AI traffic patterns (many-to-many GPU communication, bursty traffic, large messages).
Why this threatens Arista:
If you're a company building an AI cluster and you're already buying NVIDIA GPUs, NVIDIA now says: "Buy our switches too. They work better with our GPUs because we optimize the entire stack end-to-end." This is the same vertical integration playbook that Apple uses (we make the chip AND the software AND the hardware, so they all work together perfectly).
If Spectrum-X succeeds, Arista loses the most valuable part of the networking market — AI backend networks. Arista would still sell switches for non-AI workloads (enterprise, cloud, campus), but the AI premium growth driver would belong to NVIDIA.
Early signs are mixed: NVIDIA has said Spectrum-X is becoming a multibillion-dollar product line, with wins among its cloud partners, while the largest independent hyperscalers have so far kept building on open Ethernet from Arista and Broadcom-based white boxes.
Behind Arista's switches lies another company with enormous market power: Broadcom.
Broadcom designs the switch ASICs — the custom chips that do the actual packet forwarding inside the switch. Their two main product lines:
| Product Line | Use Case | Latest Generation | Throughput | Key Feature |
|---|---|---|---|---|
| Tomahawk | High-bandwidth, low-latency switching for data center fabrics | Tomahawk 5 (2022) | 51.2 Tbps | Maximum bandwidth per chip. Used in spine switches and AI backend networks. |
| Jericho | Deep-buffer routing for WAN and peering | Jericho3-AI (2023) | 38.4 Tbps | Large packet buffers + routing. Jericho3-AI adds AI traffic optimization features. |
| Trident | Feature-rich switching for enterprise/campus | Trident 5 | 12.8 Tbps | Rich feature set (ACLs, QoS, monitoring). Not used in AI clusters. |
Broadcom's switch ASIC market share in data centers is estimated at 70-80%+. Arista, Cisco, and most other switch vendors all buy Broadcom ASICs. The main alternatives are NVIDIA's Spectrum (used only in NVIDIA switches) and Marvell's Teralynx (smaller market share).
The supply chain:
Broadcom designs the ASIC → TSMC manufactures it → Broadcom sells it to Arista → Arista combines it with their EOS software, ports, optics, and chassis → Arista sells the complete switch to Microsoft, Meta, etc. At each step, value is added. Broadcom captures 30-40% of the switch BOM. Arista captures the rest through system integration and software. This is why Arista and Broadcom are symbiotic — Arista needs Broadcom's chips, Broadcom needs Arista's market access.
| Company | Role | AI Networking Strategy | Market Cap | Networking Revenue |
|---|---|---|---|---|
| Arista (ANET) | Ethernet switch vendor | Partner with Broadcom. Best NOS software. AI-optimized features in EOS. Targets $3.25B AI revenue by 2026. | ~$120B | $9B total |
| NVIDIA (NVDA) | GPU + networking | Vertical integration. InfiniBand for loyal customers. Spectrum-X Ethernet to capture the Ethernet shift. Own the full stack. | ~$4.4T | ~$15B networking |
| Broadcom (AVGO) | Switch ASIC + NIC vendor | Sell ASICs to everyone. Tomahawk 5 and Jericho3-AI for AI. Also building custom AI chips (XPUs) for hyperscalers. | ~$1.1T | ~$15B networking |
| Cisco (CSCO) | Legacy network vendor | Silicon One chip. Trying to stay relevant. Acquired for AI/ML networking startups. Still dominant in enterprise but losing cloud/DC share to Arista. | ~$250B | ~$14B switching |
| Juniper (HPE) | Enterprise/SP networking | Acquired by HPE for $14B (2024). Focused on enterprise and service providers. Not a major AI networking player. | (now HPE) | ~$5B |
| Marvell (MRVL) | ASIC + NIC vendor | Teralynx switch ASIC. Custom ASICs for hyperscalers. Smaller but growing networking business. | ~$80B | ~$2B networking |
Let's tie this back to the AGI thesis that frames all our investment analysis.
If you believe AGI is coming (and we do), here's what it means for networking: training clusters keep scaling from tens of thousands toward hundreds of thousands (and eventually millions) of GPUs, per-GPU bandwidth rises with every generation, and east-west traffic grows even faster than compute itself. Networking spend should compound at least as fast as GPU spend, keeping networking at 10-15%+ of total cluster cost.
The offsetting risk: NVIDIA is trying to vertically integrate from GPU all the way through the network switch. If they succeed, Arista loses the AI networking market to NVIDIA's Spectrum-X. The key question is whether hyperscalers buy NVIDIA's full stack or insist on open Ethernet with vendor choice.
History suggests that open standards eventually win over proprietary alternatives, especially when the biggest customers (hyperscalers) have a strong incentive to avoid vendor lock-in. The pattern repeats across tech: Ethernet itself beat Token Ring and FDDI, TCP/IP beat proprietary protocols like SNA and DECnet, Linux displaced proprietary Unix, and x86 displaced proprietary RISC in the data center.
Most likely scenario: Ethernet wins the volume, NVIDIA keeps InfiniBand/Spectrum-X for its premium DGX customers, and Arista captures a large share of the Ethernet AI networking market. The total networking TAM grows fast enough that both Arista and NVIDIA do well, but Arista's growth rate depends on how quickly Ethernet displaces InfiniBand in AI clusters.
| Metric | Value | Source / Context |
|---|---|---|
| Total data center switching market (2025) | ~$15-18B | Includes Ethernet + InfiniBand, switches only (not optics or cables) |
| AI backend networking TAM (2025) | ~$5-8B | Switches + NICs + optics specifically for AI GPU clusters |
| AI networking TAM (2028E) | ~$15-25B | Growing 30-40% CAGR as AI clusters scale |
| Arista AI-specific revenue (FY2026 target) | $3.25B | ~30% of total projected revenue |
| NVIDIA networking revenue (FY2025) | ~$15B | InfiniBand + Spectrum-X + ConnectX NICs |
| Broadcom networking revenue (FY2025) | ~$15B | Switch ASICs + custom XPUs + NICs |
| InfiniBand share of AI backend networking | ~50-60% | Declining as Ethernet gains share |
| Ethernet share of AI backend networking | ~40-50% | Growing, especially among hyperscalers |
| Average switch ASP (high-end 400G/800G) | $50-150K | Per switch, depending on port count and speed |
| Network cost as % of AI cluster | 10-15% | Switches + optics + cables + NICs |
Every port on a switch needs an optical transceiver — a small module that converts electrical signals to light for transmission over fiber optic cables. These are the components made by Lumentum (LITE) and Coherent (COHR).
As switch speeds increase (400G → 800G → 1.6T), optical transceivers get more complex and expensive. A single 800G transceiver costs $500-1,500. A large AI cluster might need 50,000-100,000 transceivers. At $1,000 each, that's $50-100M just in optics — a significant cost.
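The arithmetic, as a one-liner (counts and unit price are the article's ranges, not vendor quotes):

```python
# Rough optics bill for a large AI cluster, using the article's mid-range figures.
transceivers = 100_000  # ports needing an optical module
unit_cost = 1_000       # dollars per 800G module, mid-range assumption
print(f"optics bill: ${transceivers * unit_cost / 1e6:.0f}M")  # optics bill: $100M
```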
The next frontier is co-packaged optics (CPO), where the optical transceiver is integrated directly onto the switch ASIC package rather than being a separate pluggable module. CPO promises lower power consumption and higher density, but it's still in early development. NVIDIA has invested $2B in Lumentum specifically for CPO development. If CPO succeeds, it could change the economics of optical transceivers (fewer discrete modules, but higher ASIC cost).
ANET (Arista): The leading merchant Ethernet switch vendor. Benefits enormously if Ethernet wins over InfiniBand for AI. The threat is NVIDIA Spectrum-X taking the AI segment. Our fair value estimate of $400 assumes Arista captures its share of AI networking but doesn't dominate it (NVIDIA takes some). At ~$380, the stock is roughly fairly valued — not a screaming buy, but a high-quality business with a real AI tailwind. Would be more interesting at $250-300.
AVGO (Broadcom): The hidden monopoly. Makes the switch ASICs that go inside Arista's (and most other) switches. Also designing custom AI chips (XPUs) for Google and Meta. Broadcom benefits regardless of who wins the switch vendor battle — they supply the silicon to almost everyone. This is the most picks-and-shovels play in networking.
NVDA (NVIDIA): Owns both InfiniBand and Spectrum-X. Networking is ~6-7% of NVIDIA's revenue but growing fast. NVIDIA's vertical integration strategy (GPU + NIC + switch + software) is the existential threat to Arista and Broadcom in the AI segment. But networking is a small tail on a very large GPU dog — NVIDIA's stock price is driven by GPU demand, not networking.
LITE/COHR (Lumentum, Coherent): Make the optical transceivers that plug into every switch port. They benefit from the raw growth in port count regardless of who makes the switch. The CPO transition could reshape their business model. Both stocks have run up substantially on AI optics demand.
The big picture: AI is creating a massive structural increase in networking demand. The total networking TAM for AI could reach $15-25B by 2028. The question isn't whether the market grows — it's who captures it. Broadcom is probably the safest networking bet (they supply everyone). Arista is a great business but faces the NVIDIA threat. NVIDIA's networking revenue is growing fast but is a small part of their overall story.
Sources: Company filings (ANET, NVDA, AVGO 10-Ks), IEEE 802.3 standards, Ultra Ethernet Consortium specifications, Dell'Oro Group market data, industry analyst reports. Market sizes are approximate. Report date: April 2026.