High Bandwidth Memory
HBM supply is concentrated among SK Hynix, Samsung, and Micron. The bottleneck is driven by low yields on 12-layer HBM3E and by TSMC CoWoS packaging capacity limits. Demand from NVIDIA Blackwell and AMD MI325X continues to outpace capacity expansions into 2025.
Overview
High Bandwidth Memory (HBM) is a stacked DRAM architecture that provides significantly higher bandwidth than traditional GDDR memory, essential for data-intensive workloads in AI training and inference. The bottleneck arises primarily from supply constraints on HBM3E, the most advanced variant as of 2024, which supports data rates up to 9.6 Gbps per pin and is available in configurations up to 12-high stacks, offering up to 36 GB per stack.
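The headline bandwidth follows directly from the interface width. A minimal sanity check of the per-stack arithmetic, assuming the standard 1024-bit HBM bus (the bus width is an assumption here, not stated above; the pin rate is from the text):

```python
# Per-stack bandwidth check for HBM3E.
# Assumption: standard 1024-bit HBM interface per stack.
PIN_RATE_GBPS = 9.6      # Gbps per pin, HBM3E upper bound (from the text)
BUS_WIDTH_BITS = 1024    # bits per stack interface (assumed, standard for HBM)

bandwidth_gbps = PIN_RATE_GBPS * BUS_WIDTH_BITS   # gigabits/s per stack
bandwidth_gbs = bandwidth_gbps / 8                # gigabytes/s per stack

print(f"{bandwidth_gbs:.1f} GB/s per stack")      # 1228.8 GB/s (~1.2 TB/s)
```

That roughly 1.2 TB/s per stack, times eight stacks on a flagship accelerator, is why GDDR cannot substitute at the high end.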
The core technical constraint is low manufacturing yields for these 12-layer HBM3E stacks. Stacking involves through-silicon vias (TSVs) and micro-bumps to interconnect multiple DRAM dies vertically, a process prone to defects such as voids, cracks, or alignment errors during thermal compression bonding. Industry reports indicate yields for 12-layer stacks are below 50% in early production, compared to over 80% for simpler 8-layer HBM3, due to increased stacking height amplifying warpage and stress issues.
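The yield gap between 8-layer and 12-layer stacks can be illustrated with a simplified compounding model: if each bonding step succeeds independently with probability p, an n-high stack yields roughly p^(n-1). This is an illustrative sketch, not any vendor's actual yield model, and the assumed per-bond rate is hypothetical; the sub-50% figures cited above reflect additional warpage and stress losses that compound faster than this independence assumption predicts.

```python
# Simplified independent-bond yield model: an n-high stack requires
# n-1 successful die-attach steps, so yield ~ p ** (n - 1).
def stack_yield(p_per_bond: float, layers: int) -> float:
    """Estimated stack yield given per-bond success probability."""
    return p_per_bond ** (layers - 1)

p = 0.97  # assumed per-bond success rate (hypothetical, for illustration)
print(f"8-high:  {stack_yield(p, 8):.2f}")   # 0.81
print(f"12-high: {stack_yield(p, 12):.2f}")  # 0.72
```

Even under this optimistic model, four extra layers cost roughly ten points of yield; the real-world drop from over 80% to below 50% shows how much the non-independent failure modes (warpage, cumulative stress) dominate at greater stack heights.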
Compounding this is the dependency on TSMC's Chip-on-Wafer-on-Substrate (CoWoS) packaging. CoWoS integrates HBM stacks alongside logic dies (e.g., GPUs) on an interposer, and its capacity is limited. HBM must be delivered in volumes synchronized with CoWoS schedules: excess inventory risks degradation, while supply mismatches disrupt packaging yields. TSMC's CoWoS-L and CoWoS-R variants are scaling, but monthly output was capped at around 35,000-40,000 wafers in 2024, insufficient for AI demand surges.
Why It Matters
This HBM bottleneck disrupts the entire semiconductor supply chain by constraining AI accelerator production, which relies on HBM for competitive performance. NVIDIA's H100 and upcoming Blackwell GPUs, as well as AMD's MI300X and MI325X Instinct accelerators, specify HBM3E for their memory subsystems, with configurations like 192 GB per Blackwell GPU requiring eight HBM3E stacks of 24 GB each.
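The per-GPU memory figure decomposes cleanly. A minimal check, assuming 24 Gb (3 GB) DRAM dies, the common HBM3E die density (the die density is an assumption, not stated in this report):

```python
# Decompose the 192 GB per-GPU figure from the text.
# Assumption (not from the source): 24 Gb (3 GB) DRAM dies,
# the common HBM3E die density.
TOTAL_GB = 192
STACKS = 8
DIE_GB = 3  # one 24 Gb die = 3 GB

per_stack_gb = TOTAL_GB // STACKS        # 24 GB per stack
dies_per_stack = per_stack_gb // DIE_GB  # 8 dies -> an 8-high stack
twelve_high_gb = 12 * DIE_GB             # a 12-high stack would hold 36 GB

print(per_stack_gb, dies_per_stack, twelve_high_gb)  # 24 8 36
```

The same dies in a 12-high stack give 36 GB, or 288 GB across eight stacks, which is why stack height, and hence 12-layer yield, is the lever for next-generation capacity.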
Affected parties include hyperscalers (e.g., Microsoft, Google) procuring these GPUs for AI data centers, which face delays in cluster deployments. NVIDIA and AMD encounter production shortfalls, leading to tight allocations that favor the largest customers and inflate system costs; HBM can account for 40-50% of an accelerator's bill-of-materials cost. Some chip designers fall back to lower-spec HBM3 or GDDR alternatives, compromising bandwidth and performance, as seen in some AMD MI300 variants.
Upstream, it pressures DRAM makers to prioritize HBM over commodity DDR, straining capex for specialized fabs. Downstream, TSMC's packaging queue lengthens, delaying diverse products from Apple silicon to networking ASICs. Broader impacts include slowed AI innovation timelines and increased lead times for servers, with supply chain ripple effects to PCB assemblers and logistics. Without relief, it risks widening the gap between AI compute demand and supply through 2025.
Key Players
SK Hynix holds approximately 50% share of HBM supply in 2024, emerging as the leader through aggressive investments in HBM3E; it supplies the majority of NVIDIA's HBM needs, with facilities in South Korea and Indiana (U.S.) ramping 12-layer production. Samsung, with around 40% share, competes closely, providing HBM to both NVIDIA and AMD; its yields trail SK Hynix slightly, but it benefits from scale in memory stacking. Micron, at 10-15% share, is the North American outlier, qualifying HBM3E for NVIDIA and expanding U.S. production under CHIPS Act funding.
NVIDIA, as the primary consumer, drives ~70% of HBM demand via its data center GPUs; its Blackwell platform alone could consume over half of 2025 HBM output. AMD follows with its MI-series accelerators, securing allocations from Samsung and SK Hynix. TSMC, while not an HBM producer, controls the packaging chokepoint as the sole advanced CoWoS provider for these GPUs, dictating integration schedules.
Beneficiaries include equipment suppliers like Applied Materials and Lam Research for stacking tools, and interposer material providers. Hyperscalers indirectly benefit from prioritized access but face higher costs.
Current Status
The HBM bottleneck is intensifying in late 2024, with demand from NVIDIA's Blackwell ramp (shipments expected in H2 2024) and AMD's MI325X outpacing supply growth. SK Hynix reports 12-layer HBM3E yields improving to around 60%, but yield still limits output; full qualification with NVIDIA occurred in Q3 2024. Samsung and Micron lag in volume, with Micron targeting 10% share by end-2025.
Capacity expansions are underway: SK Hynix plans to triple HBM output to 200,000 wafers per year by 2025 via new lines in Hwaseong and Indiana, and Samsung targets similar growth. TSMC is expanding CoWoS to roughly 90,000 wafers per month by 2025 through CoWoS-L scaling and new SoIC capacity, but AI demand forecasts (e.g., 2-3x GPU shipment growth) suggest deficits will persist.
No near-term easing is evident; allocations remain NVIDIA-dominant, with AMD reportedly facing shortages. Longer term, HBM4 (expected in 2026) promises relief via higher densities, but 2025 remains constrained. Uncertainty remains around yield ramps and geopolitical risks to South Korean supply.
Last verified: 2/15/2026
Severity Assessment
This constraint is significantly impacting AI accelerator supply and is expected to persist through 2025.