Data Center Architecture and Analysis Framework


MECE Framework for Data Center Architecture

A MECE (Mutually Exclusive, Collectively Exhaustive) decomposition rooted in physical-scale progression — drilling from the macro campus level (kilometer-scale) down to micro-components (millimeter-scale). Each layer has well-defined input/output boundaries, and each maps to a distinct investment thesis and due-diligence focus area.


Framework Overview

Key Inter-Layer Boundaries

| Boundary | Demarcation Criterion |
| --- | --- |
| L1 / L2 | Outside vs. inside the building shell; the high-voltage grid point-of-interconnection (typically a campus substation) marks the divide |
| L2 / L3 | Mechanical/electrical rooms vs. white space; the CDU interface between the primary-side facility-water loop and the secondary-side technology-cooling loop |
| L3 / L4 | Rack enclosure / rack manifold vs. compute-tray backplane; scale-out switch ports define the network boundary |
| L4 / L5 | PCB / motherboard vs. packaged die / discrete device; the smallest independently procurable and replaceable BOM unit |

L1 — Macro Resources & Campus Layer

Central question: "How large an AI factory can this site support?"

| Sub-Dimension | Key Components / Resources | Typical Scale | Primary Bottleneck (2026) |
| --- | --- | --- | --- |
| Energy Supply | HV grid interconnection (138 / 230 / 345 kV), on-site substation, microgrid (natural-gas CCGT / SMR nuclear PPA) | GW-class campuses require 100–300 MVA main transformers | Transformer lead times of 2–4 years; grid-interconnection queues of 4–7 years |
| Water Resources | Cooling-water rights, closed-loop make-up water, WUE targets | A 1 GW wet-cooled campus consumes ≈ 35 billion gallons per year | Hard constraint in arid regions (Arizona, Middle East, Inner Mongolia) |
| Wide-Area Networking | Long-haul dark fiber, submarine-cable landing stations, inter-campus DCI | Cross-campus training demands > 400 Tbps | Scarcity of viable dark-fiber routes |
| Civil / Structural Shell | 6–9 m clear height, 500+ lbs/sqft floor loading, slab-on-grade, seismic & flood resilience | A 1 GW campus spans 400–1,000 acres | Structural-steel and concrete supply-chain lead times |
| Climate & Siting | Annual mean temperature, wet-bulb temperature (drives WUE / PUE), renewable-energy resource availability | Northern Europe / Pacific Northwest are optimal | Climate-advantaged regions often face power and connectivity constraints |

Key Investment & Due-Diligence Questions:

  • Has grid interconnection been secured? Is the Interconnection Service Agreement (ISA) executed?
  • Does the site offer expansion optionality (Phase 2 / 3 land reserves)?
  • Are water-allocation quotas sufficient? Has the environmental impact assessment been approved?
  • Can building specifications accommodate next-generation GPUs (6 m+ clear height, 500+ lbs/sqft floor loading)?

L2 — Facility & Mechanical-Electrical Layer

Central question: "How much standardized power and cooling can this facility sustainably deliver?"

| Sub-Dimension | Key Components | Legacy Paradigm | AI-Era Evolution |
| --- | --- | --- | --- |
| Primary Distribution Chain | MV switchgear → transformer → LV main distribution board | 1–2.5 MVA transformers | 3.15 / 5 / 10 MVA+; emerging trend of MV-direct-to-rack distribution |
| UPS / Energy Storage | Centralized double-conversion UPS, VRLA battery rooms, BESS | 2N VRLA, 5–10 min runtime | Modular lithium-ion; select deployments eliminate UPS entirely (e.g., Microsoft Fairwater) |
| Emergency Generation | Diesel gensets, fuel systems, exhaust routing | Diesel N+1, 72-hour fuel reserve | Natural-gas CCGT / on-site microgrids; BESS as diesel replacement |
| Primary-Side Cooling | Chillers, cooling towers, dry coolers, large-bore piping | Centrifugal chillers + cooling towers | Dry-cooler arrays + closed-loop warm-water systems (45 °C supply); chiller elimination |
| ATS / STS Transfer | Utility-to-generator changeover, UPS-to-bypass transfer | Mechanical ATS, 60–200 ms transfer | Solid-state STS, 2–4 ms transfer; MV-class ATS at ≈ 19 s |
| BMS / DCIM | Building management, environmental monitoring, capacity management | BACnet / Modbus layered architecture | Digital twin (NVIDIA Omniverse DSX) + AI-driven operations (DeepMind, Phaidra) |

Key Investment & Due-Diligence Questions:

  • What are the age and lifecycle status of UPS units, gensets, and chillers?
  • What is the supply-chain obsolescence / spare-parts risk for critical equipment (PLC controllers, breaker models)?
  • Can the cooling infrastructure transition from air-cooled to liquid-cooled (primary-side water temperature, pipe pressure ratings, CDU tie-in points)?
  • What is the maturity of the BMS / DCIM stack (AIOps readiness, data-migration complexity post M&A)?

L3 — Cluster & Rack Layer (Pod)

Central question: "How does a rack row (Pod) convert L2's standardized power and cooling into GPU-ready supply?"

| Sub-Dimension | Key Components | Legacy Paradigm | AI-Era Evolution |
| --- | --- | --- | --- |
| Rack-Level Power Distribution | Busway, rack PDU, sidecar power shelf | 800–2,500 A busway, 54 VDC | 3,000–6,300 A busway, 800 VDC HVDC, sidecar rectifiers |
| Rack-Level Backup Power | Rack BBU, short-duration energy storage | Rare | Rack BBU as standard (dampens GPU synchronous switching transients) |
| Secondary-Side Liquid Cooling | CDU, manifold, QD (quick-disconnect) fittings, make-up water unit, leak detection | Non-existent / optional | DLC as standard; 1.5 L/min/kW flow rate; 200+ QDs per rack |
| Scale-Out Networking | ToR switch, spine / core switches, fiber tray, MMR | 25 / 100 GbE, copper + limited fiber | 400G / 800G / 1.6T InfiniBand or Ethernet; CPO evolution |
| Rack Enclosure | 19" / 21" rack, doors, cable management | 42U, USD 2–5K | Oberon / Kyber / ORv3 chassis; USD 50–200K per empty enclosure |

The Subtle L3 / L4 Boundary

  • Scale-up network (intra-GPU-domain): NVLink copper backplane physically resides in L4 (inside the rack); however, the NVSwitch tray — as a standalone unit — straddles the L3/L4 boundary.
  • Scale-out network (inter-rack, inter-Pod): Fiber and switches above the ToR clearly belong to L3.
  • Dividing line: The OSFP / QSFP port on the compute tray — inside the port is L4; outside the port is L3.

Key Investment & Due-Diligence Questions:

  • Does the busway ampacity support next-generation density (120–600 kW per rack)?
  • Does the liquid-cooling system meet the availability requirements of AI training workloads (leak detection, redundancy design)?
  • Does Scale-out bandwidth prevent GPU under-utilization (risk of 33% idle loss)?
  • Rack compatibility: which of Oberon / ORv3 / Kyber are supported? What is the retrofit cost?

L4 — Node & Server Layer

Central question: "How is the compute capacity of a single tray / server constituted?"

| Sub-Dimension | Key Components | 2022 Baseline | 2026 Frontier |
| --- | --- | --- | --- |
| Core Compute | GPU / AI accelerator, host CPU | H100 (700 W), Xeon / EPYC | B300 (1,400 W), Rubin (2.3 kW), Grace ARM host |
| Scale-Up Network | NVLink copper backplane, NVSwitch | NVLink 4.0 / 900 GB/s | NVLink 6 / 3.6 TB/s, Kyber vertical backplane |
| Near-Node Storage | HBM (on-package), NVMe SSD, E1.S | HBM3, 80 GB | HBM3e 288 GB, HBM4 1 TB |
| I/O & DPU | SmartNIC, DPU, PCIe bus | ConnectX-7 (400G), BlueField-3 | ConnectX-9 (1.6T), BlueField-4 |
| Board-Level Thermal | Cold plate, heat sink, fans | Air-cooled + partial liquid cooling | D2C cold plate, 100% liquid-cooled, 100 μm micro-channel |
| Board-Level Power | VRM / power IC, on-board BBU | 12 V bus, multi-stage DC-DC | 800 V → 12 V single-stage 64:1 LLC, GaN / SiC devices |

Key Investment & Due-Diligence Questions:

  • What is the liquid-cooling readiness of the server tray (full liquid vs. hybrid)?
  • What is the NVLink domain size (NVL8 → NVL72 → NVL576)?
  • Is HBM supply locked in? What is the hedging mix across SK Hynix / Samsung / Micron?
  • What is the OEM certification status of cold plates and quick-disconnect fittings?

L5 — Micro-Component Layer (Vertical Drill-Down)

Governing principle: L5 is not a flat layer but rather a sub-tree that can be drilled into from any module in L3 or L4. Stopping rule: Drill down until the level at which an independently investable public-market target exists — typically 3–4 levels deep.


L5-A: Optical Transceivers (Drill-Down from L3 Scale-Out Network)

L3  Scale-Out Network
 └─ L4  Optical Transceiver (400G / 800G / 1.6T / 3.2T OSFP-XD)
       ├─ Optical Die:  Laser (VCSEL / EML / CW SiPh), Photodetector, Modulator
       ├─ Electrical Die:  DSP, Driver, TIA, CDR
       ├─ Optical Packaging:  MPO connector, lens, waveguide
       └─ Evolution:  CPO (Co-Packaged Optics), LPO (Linear-Drive Pluggable Optics)

L5-B: Liquid-Cooling System (Drill-Down from L3 Secondary-Side Cooling)

L3  Secondary-Side Liquid-Cooling Loop
 ├─ CDU (Coolant Distribution Unit)
 │     ├─ Plate heat exchanger
 │     ├─ Variable-frequency pump package
 │     ├─ Sensors / filters / degassing unit
 │     └─ Control PLC
 ├─ Manifold
 ├─ QD Quick-Disconnect Fittings (200+ per rack)
 ├─ Cold Plate
 │     ├─ Copper micro-channel machining (100 μm precision)
 │     ├─ Sealing / anti-corrosion coating
 │     └─ Blind-mate / side-entry structure
 └─ Coolant (treated water, dielectric fluid, two-phase refrigerant)

L5-C: HBM (Drill-Down from L4 Near-Node Storage)

L4  HBM Stack
 ├─ DRAM core die (8 / 12 / 16 layers)
 ├─ Logic base die
 ├─ TSV (Through-Silicon Via)
 ├─ Micro-bump / Hybrid Bonding
 └─ Process:  MR-MUF (SK Hynix), TC-NCF (Samsung)

L5-D: Power Components (Drill-Down from L2–L3 Distribution Chain)

L3  800 VDC Rack Power
 ├─ Rectifier (AC → 800 VDC)
 │     ├─ SiC power devices
 │     └─ Digital control IC
 ├─ 800 V → 12 V LLC Converter (64:1)
 │     ├─ GaN power devices
 │     ├─ Magnetic components
 │     └─ Controller IC
 ├─ Rack BBU
 │     ├─ Li-ion cells (LFP / NMC)
 │     └─ BMS / protection board
 └─ Busway

L5-E: Connectors & Copper Interconnects (Drill-Down from L3/L4 Boundary)

L4  NVLink Copper Backplane  +  L3  Scale-Out Copper Interconnects
 ├─ High-speed copper cable (112G / 224G SerDes)
 ├─ Connector (Paladin HD, high-density OSFP)
 ├─ PCB laminate (Megtron-7 / 8)
 └─ NVLink Spine Cartridge

Electrical Architecture & Power Density Evolution

From Dual-Utility-Feed 2N / 2(N+1) System-Level Redundancy

Utility normal ────────┬─── Utility outage ──────────────────────────→ Time
                       │
                       │ 0 ms ────── 10–30 s ────── Hours ────── Days
                       │   ↑              ↑              ↑
                       │  UPS assumes     Genset starts   Genset sustains
                       │  load seamlessly  and runs until  operations
                       │                   utility returns
  • UPS covers 0 seconds to 10–30 seconds. Its mission is to bridge the gap seamlessly — the load perceives zero interruption when the utility drops. However, its batteries last only 5–15 minutes; beyond that, they are depleted.
  • Diesel generator covers tens of seconds onward through days. Its mission is long-duration utility substitution.
  • Utility feed is the primary power source under normal conditions.

Per Schneider Electric White Paper 75, availability ranks as follows: N (standalone) < isolated redundant < parallel redundant (N+1) < distributed redundant < 2N / 2(N+1) system-level redundant (highest). An N+1 system achieves roughly 99.98–99.99% availability; a 2N system reaches 99.995%+; and a 2(N+1) configuration approaches 99.9999% (six nines). Uptime Institute surveys consistently show that power-related failures top the list of severe / major outage causes, with UPS and distribution faults accounting for roughly 30–40% of all downtime events.
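As a rough illustration of why the configurations rank this way, the sketch below estimates system availability with simple k-of-n reliability math. The per-unit availability figure and the independence assumption are illustrative simplifications (not from White Paper 75); real rankings also hinge on common-mode and whole-path failures.

```python
from math import comb

def k_of_n_availability(n_units, k_required, unit_avail):
    """Probability that at least k_required of n_units independent units are available."""
    return sum(
        comb(n_units, i) * unit_avail**i * (1 - unit_avail)**(n_units - i)
        for i in range(k_required, n_units + 1)
    )

UNIT = 0.999   # assumed availability of a single UPS module (illustrative)
N = 4          # units required to carry the full load

configs = {
    "N":      k_of_n_availability(N, N, UNIT),
    "N+1":    k_of_n_availability(N + 1, N, UNIT),
    # 2N and 2(N+1): two independent paths; the load is lost only if both paths fail.
    "2N":     1 - (1 - k_of_n_availability(N, N, UNIT)) ** 2,
    "2(N+1)": 1 - (1 - k_of_n_availability(N + 1, N, UNIT)) ** 2,
}

for name, avail in configs.items():
    print(f"{name:7s} availability ≈ {avail:.6%}")

# Note: this toy model counts only independent module failures, so it understates the
# real advantage of 2N-style designs, which also tolerate whole-path faults
# (utility feed, distribution, human error) that a single-path N+1 system cannot survive.
```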

N = the minimum number of units required to serve the full load.

Suppose a facility needs 4 × 500 kW UPS to carry 2 MW of load — then N = 4.

| Tier | Unit Count | Meaning | Analogy |
| --- | --- | --- | --- |
| N | 4 | Exactly sufficient; zero redundancy | 4 people carrying 4 sacks of rice — lose one and the whole thing collapses |
| N+1 | 5 | 1 standby unit added | 5 people carrying 4 sacks — one can fall and the team still copes |
| 2N | 4 + 4 = 8 | An entire duplicate system | Two independent squads of 4, mirroring each other |
| 2(N+1) | (4+1) + (4+1) = 10 | Two systems, each with its own +1 spare | Two squads of 5 (each with 1 backup) |

2N: Dual-System Mirroring

Two fully independent power chains (Path A + Path B), each independently capable of carrying 100% of the load.

Utility A → Transformer A → UPS Bank A (4 units) → PDU-A ┐
                                                          ├──→ Dual-corded servers (two PSUs)
Utility B → Transformer B → UPS Bank B (4 units) → PDU-B ┘
  • During normal operation, each path carries 50% of the load (leaving 50% headroom per path).
  • If an entire path fails — including its utility feed, transformer, UPS bank, and distribution panel — the other path instantly assumes 100%.
  • Servers are dual-corded (dual PSUs) and handle the switchover internally.

Key attribute: tolerates full-chain failure, including human error (tripping the wrong A-path breaker), fire, flooding, or any event that renders an entire path inoperable.

2(N+1): Dual-System Mirroring with Per-Path +1 Redundancy

On top of 2N, each path adds one standby UPS internally.

Utility A → Transformer A → UPS Bank A (4+1 = 5 units) → PDU-A ┐
                                                                 ├──→ Dual-corded servers
Utility B → Transformer B → UPS Bank B (4+1 = 5 units) → PDU-B ┘
  • If 1 UPS fails within Path A, Path A self-heals internally (the remaining 4 units still carry full load).
  • If all of Path A goes down, Path B takes over.
  • If 1 UPS fails in Path A and simultaneously 1 UPS fails in Path B, the system remains unaffected.

The Critical Distinction

2N's hidden vulnerability: When Path A is taken offline for maintenance, Path B effectively degrades to N (no redundancy). At that point, a single UPS failure on Path B triggers a facility-wide outage.

2(N+1) solves exactly this problem: With Path A offline, Path B still operates as N+1 (redundant) — a double layer of insurance.

Real-world operations example — annual UPS maintenance on Path A:

  • 2N system: Path A must be de-energized; Path B now shoulders the entire load with zero redundancy. If any single UPS on Path B fails during the window → full facility outage. The ops team spends the night on high alert.
  • 2(N+1) system: Path A de-energized; Path B remains N+1. A single UPS failure on Path B is absorbed internally — the facility is unaffected. The ops team operates with composure.

This is why financial-core data centers and hyperscaler mission-critical clusters typically deploy 2(N+1), while standard enterprise Tier IV facilities consider 2N sufficient.

Cost Perspective

Using a 2 MW facility's UPS as an example (each unit 500 kW, ≈ RMB 800K / ≈ USD 110K):

| Configuration | UPS Count | UPS CapEx | Multiple of N |
| --- | --- | --- | --- |
| N | 4 units | RMB 3.2M | 1.0× |
| N+1 | 5 units | RMB 4.0M | 1.25× |
| 2N | 8 units | RMB 6.4M | 2.0× |
| 2(N+1) | 10 units | RMB 8.0M | 2.5× |

Note: This covers UPS hardware alone. Switchgear, batteries, floor space, and cooling must also scale accordingly, so total 2N investment runs roughly 2.2–2.5× that of N, and 2(N+1) roughly 2.7–3×.
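The unit counts and CapEx multiples above reduce to a few lines of arithmetic; a minimal sketch (unit price from the 2 MW example, everything else illustrative):

```python
# Illustrative unit-count and CapEx arithmetic for the 2 MW example above.
LOAD_KW = 2000
UNIT_KW = 500
UNIT_PRICE_RMB = 800_000          # per 500 kW UPS module (from the text)

n = -(-LOAD_KW // UNIT_KW)        # ceiling division: minimum units for full load -> N = 4

configs = {
    "N":      n,
    "N+1":    n + 1,
    "2N":     2 * n,
    "2(N+1)": 2 * (n + 1),
}

base_capex = configs["N"] * UNIT_PRICE_RMB
for name, count in configs.items():
    capex = count * UNIT_PRICE_RMB
    print(f"{name:7s} {count:2d} units  RMB {capex/1e6:.1f}M  {capex/base_capex:.2f}x of N")
```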

Standards & Availability Benchmarks

China's GB 50174-2017 (Code for Design of Data Centers) classifies facilities into Grades A, B, and C. Grade A (fault-tolerant) requires dual utility feeds, 2N-redundant architecture, diesel-generator backup, and UPS battery runtime of no less than 15 minutes. Grade B (redundant) recommends dual utility feeds with N+1 UPS redundancy. Grade C (basic) permits single-feed power. Field measurements show that domestic Grade-A facilities take approximately 19 seconds for the 10 kV HV ATS to transfer from utility to genset and roughly 16 seconds for the return transfer.

The Uptime Institute Tier system defines clear availability-to-downtime mappings: Tier I ≈ 99.671% availability (≈ 28.8 hours of downtime per year), Tier II ≈ 99.741% (≈ 22 hours), Tier III ≈ 99.982% (≈ 1.6 hours), and Tier IV ≈ 99.995% (≈ 26 minutes).

Uptime Institute's 2024–2025 survey reports that 53% of operators experienced downtime between 2021 and 2024, though only 9% qualified as "severe" (a historic low). Human error was a contributing factor in 66–80% of incidents, with 58% attributable to failure to follow standard operating procedures. 54% of respondents reported that their most recent major outage cost exceeded $100K, and roughly 20% exceeded $1M.

Tier IV facilities cost approximately twice as much to build as Tier III. Per-rack lifecycle TCO runs around $120,000 (CapEx and OpEx each accounting for roughly 50%). The core decision calculus: at a downtime cost of $8,000–$15,000 per minute, determine whether the incremental redundancy investment of Tier IV is justified by the avoided outage losses.
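A hedged sketch of that decision calculus: the per-minute downtime cost comes from the figure above, while the Tier IV CapEx premium, the evaluation horizon, and the expected downtime figures are assumptions for illustration only.

```python
# Break-even check: does Tier IV's extra redundancy pay for itself via avoided outages?
downtime_cost_per_min = 10_000       # USD, mid-range of the $8k-$15k figure above
extra_capex = 30_000_000             # USD, assumed Tier IV premium over Tier III for one site
service_life_years = 10              # assumed evaluation horizon

# Illustrative expected downtime, per the Uptime-style mapping above
downtime_min_tier3 = 1.6 * 60        # ~1.6 hours/year
downtime_min_tier4 = 26.3            # ~26 minutes/year

avoided_min_per_year = downtime_min_tier3 - downtime_min_tier4
avoided_loss = avoided_min_per_year * downtime_cost_per_min * service_life_years

print(f"Avoided outage losses over {service_life_years} years: ${avoided_loss/1e6:.1f}M "
      f"vs. incremental CapEx of ${extra_capex/1e6:.0f}M")
# Whether the premium is justified depends entirely on these inputs; higher per-minute
# costs, longer horizons, or more frequent outages flip the balance toward Tier IV.
```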


Battery Energy Storage Systems (BESS): Lithium-Ion Dominance & Emerging Technologies

0 ms ──── 30 s ──── 5–15 min ──── Hours ──── Days
   UPS interval         Legacy UPS limit       Genset interval

Traditional VRLA batteries cannot perform the genset's job because of two hard constraints:

  • Low energy density: 35–40 Wh/kg — storing hours of energy would require a battery room larger than the data hall itself.
  • Short cycle life: 500–1,200 cycles — daily charge/discharge for peak-shaving would exhaust the batteries within 2 years.

LFP lithium-ion technology shatters both constraints:

  • Energy density 4–5× greater (140–190 Wh/kg)
  • Cycle life 5–10× longer (3,000–5,000+ cycles)
  • Price collapse ($108/kWh — already below the full-lifecycle cost of a diesel-genset system)

The implication: for the same footprint, stored energy jumps from "5 minutes" to "80 minutes or even several hours" — landing squarely in the time window that diesel gensets previously monopolized.

The New Three-Tier Functional Allocation

| Time Horizon | Legacy Solution | Lithium-Ion BESS Solution |
| --- | --- | --- |
| 0–30 s | UPS (VRLA) | BESS |
| 30 s – 80 min | Diesel genset | BESS (same system continues to supply load) |
| 80 min – Days | Diesel genset | Diesel genset / gas turbine / utility restoration |

Microsoft's Stackbo deployment — a 16 MWh / 24 MW system — embodies this logic: 80 minutes at full load means:

  • Sub-50 ms response (replacing the UPS)
  • Full-load sustain for 80 minutes (covering the genset's near-term role)
  • Utility restoration probability within 80 minutes is > 95% (Nordic grid statistics), drastically reducing genset dispatch frequency

BESS can respond within 50 milliseconds of an outage — more than sufficient to bridge the 10–30 second gap before generators start. More consequentially, BESS is evolving toward diesel-genset displacement: Microsoft deployed 16 MWh of BESS (24 MW peak) at Stackbo, Sweden, delivering 80 minutes of full-load backup; Google installed 2.75 MW / 5.5 MWh batteries at St. Ghislain, Belgium, as a diesel-genset replacement; and as of 2025, Google has deployed more than 100 million lithium battery cells (hundreds of MWh) across its global data center fleet.

BESS as an Active Revenue Generator — The Truly Disruptive Element

A diesel genset sits idle 99% of the year, runs a monthly test cycle, and is a pure cost center. A lithium-ion BESS earns its keep between outages — this is why the ROI comes in at 3–8 years. It is not "buying insurance" but rather "buying insurance that moonlights as a profit center." Diesel has no equivalent attribute.

  • Peak shaving / valley filling: Charge during off-peak tariff windows, discharge during peak hours — saving 10–20% on annual electricity costs.
  • Demand-charge management: Avoid triggering the grid's peak-demand billing tier — a single avoidance event can save millions of dollars.
  • Grid frequency regulation (FCR / FFR): Participate in ancillary-service markets, earning revenue by responding to grid frequency deviations within seconds.
  • Energy arbitrage: Intraday electricity price differentials reach up to 10× — pure arbitrage income.
  • Renewable integration: Store excess midday solar generation for evening dispatch.

The more precise framing, therefore, is not "lithium batteries serve as both UPS and genset" but rather: lithium-ion BESS absorbs 100% of the UPS function + the genset's short-to-medium-duration (< 2 hour) backup function + adds an entirely new energy-management revenue stream. The diesel genset is compressed to a backstop role for long-duration outages (> 2 hours) only.
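To see how the "moonlighting" revenue adds up, here is a rough annual-value sketch. The Stackbo-scale energy and power ratings come from the text; the tariff spread, cycling behavior, round-trip efficiency, demand-charge savings, and ancillary-service rates are illustrative assumptions, not sourced figures.

```python
# Rough annual-value sketch for a data-center BESS that works between outages.
energy_mwh = 16                    # usable BESS energy (Stackbo-scale example above)
power_mw = 24                      # discharge power rating
cycles_per_day = 1                 # one peak-shaving cycle per day (assumption)
price_spread = 60                  # USD/MWh peak-vs-off-peak spread (assumption)
round_trip_eff = 0.88              # assumed round-trip efficiency
demand_charge_savings = 500_000    # USD/yr avoided peak-demand charges (assumption)
fcr_revenue_per_mw = 20_000        # USD per MW-year of frequency regulation (assumption)

arbitrage = energy_mwh * cycles_per_day * 365 * price_spread * round_trip_eff
ancillary = power_mw * fcr_revenue_per_mw
total = arbitrage + demand_charge_savings + ancillary

print(f"Arbitrage / peak shaving: ${arbitrage:,.0f}/yr")
print(f"Ancillary services:       ${ancillary:,.0f}/yr")
print(f"Indicative total:         ${total:,.0f}/yr")
```

Under these assumptions the system returns on the order of a million dollars per year before counting its backup role, which is the arithmetic behind the 3–8 year ROI range cited above.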

Analogy:

  • Legacy model: Bodyguard (UPS) + long-term stand-in (diesel genset)
  • New model: Swiss-army-knife bodyguard (BESS — also moonlights as a financial advisor between incidents) + emergency reserve (diesel / gas, deployed only in extreme scenarios)

The Most Aggressive Architectures — Two Emerging Scenarios

1. AI Training Clusters (Microsoft Fairwater et al.)

  • UPS eliminated entirely (training workloads can checkpoint and resume)
  • Minimal BESS retained for sub-second protection
  • Diesel or gas turbine provides long-duration backstop
  • Overall backup-power investment drops substantially

2. Self-Powered Hyperscale Campuses (xAI Memphis)

  • On-site combined-cycle gas power plant (grid-independent)
  • BESS handles frequency regulation and short-duration backup
  • Diesel gensets replaced by gas turbines
  • Essentially: "the campus becomes its own utility"

GPU Compute Hardware Is Rewriting Power Density Rules

Average data center rack power density evolved from ~4 kW in 2011 to ~12 kW by 2024 (AFCOM data), but AI accelerators are pushing high-end requirements into an entirely different order of magnitude. NVIDIA GPU power draw is scaling aggressively across generations: A100 (400 W) → H100 (700 W) → B200 (1,000–1,200 W) → GB200 Superchip (2,700 W for 2 GPUs + 1 Grace CPU).

NVIDIA GB200 NVL72 is the current benchmark for high-density systems: a single rack integrates 72 Blackwell GPUs + 36 Grace CPUs with 13.5 TB HBM3e memory, delivering 1.44 exaFLOPS (sparse FP4), consuming 120–140 kW per rack, and weighing 1.36 metric tons. Coolant enters at 2 L/s (25 °C) and exits with a ≈ 20 °C temperature rise. GPUs and CPUs are liquid-cooled; NICs and storage are still air-cooled by 40 mm fans. NVIDIA's next-generation Vera Rubin NVL144 (2026) is projected to push per-rack power to ~600 kW.
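The coolant figures can be cross-checked against the basic heat balance Q = ṁ·c_p·ΔT. The sketch below assumes water-like coolant properties, which is an approximation for the treated-water loop.

```python
# Sanity check: how much heat can 2 L/s of coolant carry away with a ~20 degC rise?
flow_l_per_s = 2.0
density_kg_per_l = 1.0        # water-like coolant (assumption)
cp_j_per_kg_k = 4186          # specific heat of water (assumption for the treated coolant)
delta_t_k = 20.0

mass_flow = flow_l_per_s * density_kg_per_l               # kg/s
heat_kw = mass_flow * cp_j_per_kg_k * delta_t_k / 1000    # kW

print(f"Heat carried by the liquid loop: ~{heat_kw:.0f} kW")
# ~167 kW of loop capacity vs. a 120-140 kW rack: the liquid loop covers the
# liquid-cooled heat load with margin (NICs and storage stay air-cooled, per the text).
```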

The Fundamental Equation Behind Every Distribution Challenge

All data center power-distribution challenges reduce to one identity: P = U × I

  • P = Power (watts)
  • U = Voltage (volts)
  • I = Current (amps)

At a given power level, lower voltage means higher current. And high current introduces three compounding problems:

Problem 1: Conductor cross-section becomes physically untenable

Higher current demands thicker copper; cost and weight scale non-linearly. At 48 V, the GB200 NVL72 requires 3,800 amps — copper busbars of that rating physically cannot fit inside a standard rack.

| Current | Typical Conductor | Unit Weight | Diameter |
| --- | --- | --- | --- |
| 30 A | Residential wiring | Light | A few mm |
| 350 A | 500 MCM cable | Several kg/m | Thumb-thick |
| 3,800 A | Massive copper busbar | Tens of kg/m | Brick-sized |

Problem 2: I²R losses scale with the square of current

Resistive heating = I² × R. Double the current and losses quadruple. At 3,800 A flowing through any busbar segment, the heat generated would be sufficient to cook the rack.

Problem 3: Voltage drop (IR drop) spirals out of control

Higher current over longer runs produces severe voltage sag. At 3,800 A flowing through 1 meter of busbar, the drop can reach several volts — on a 12 V bus, this means the chip receives unstable voltage, directly impairing computational reliability.

These three effects combine to form the physical basis for Schneider Electric's widely cited conclusion: "400 V three-phase AC and 48 VDC solutions become strained at 200 kW/rack and are entirely infeasible at 400 kW/rack." This is not a matter of engineering effort — it is a matter of physics.

Comparative Impact at 140 kW — Same Power, Different Voltages

| Voltage Architecture | Required Current | Cable Specification | Approx. Unit Cost | Copper Usage |
| --- | --- | --- | --- | --- |
| 48 VDC | ~2,900 A | Massive copper busbar | Extremely high | Baseline |
| ±400 VDC | ~350 A | 500 MCM cable | $14/ft | −45% |
| 800 VDC | ~175 A | 3/0 AWG cable | $5/ft | Halved again |

Each voltage doubling halves the current, halves the copper, and cuts losses to one-quarter. Going from 48 V to 800 V reduces current by 16× and theoretical losses by 256× — this is the core attraction of high-voltage DC.
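A small sketch of what P = U × I implies for the same 140 kW rack at each voltage tier (±400 V is counted at 400 V per pole, matching the table above). The busbar resistance and run length are illustrative assumptions used only to show how I²R loss and IR drop scale.

```python
# Current, resistive loss, and voltage drop at a fixed 140 kW load vs. bus voltage.
P_W = 140_000
R_PER_M = 20e-6      # ohms per metre of busbar (illustrative assumption)
RUN_M = 10           # assumed length of the distribution run, in metres

for label, volts in [("48 VDC", 48), ("+/-400 VDC (per pole)", 400), ("800 VDC", 800)]:
    amps = P_W / volts                     # P = U * I  ->  I = P / U
    loss_w = amps**2 * R_PER_M * RUN_M     # I^2 * R heating scales with the square of current
    drop_v = amps * R_PER_M * RUN_M        # IR drop scales linearly with current
    print(f"{label:22s} I = {amps:6.0f} A   I2R loss = {loss_w:7.1f} W   IR drop = {drop_v:.3f} V")
```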

In other words, the voltage-architecture upgrade is not "performance optimization" — it is a physical necessity. The evolutionary path: Legacy AC → 48 V → ±400 V → 800 V.


Generation 1: Legacy 480 V AC + 12 V DC (Power Density < 30 kW/rack)

MV AC → 480 V / 415 V AC → UPS → PDU → Server PSU (AC → 12 V DC) → Motherboard

Each server carries its own PSU performing AC-to-12 V DC conversion. This works fine below 30 kW per rack.

Generation 2: 48 V DC Rack-Level Architecture (30–100 kW/rack)

In 2016, Google contributed the specification to OCP (Open Compute Project). The key change: relocating AC-to-DC conversion from individual servers to the rack level.

480 V AC → Rack-level rectifier → 48 V DC bus → Per-server DC-DC step-down to 12 V

Benefits:

  • Eliminates redundant conversion losses from dozens of per-server PSUs
  • 48 V cuts end-to-end distribution and conversion losses by 30%+ relative to 12 V; in the distribution path itself, the same power at 4× the voltage means 1/4 the current and 1/16 the I²R losses
  • 48 V remains classified as Extra-Low Voltage (ELV) (< 60 V) — no specialized electrical licensing required; a low safety-compliance threshold

Open Rack v3 standardized this as the de facto OCP power architecture. But the GB200 immediately pushed this scheme to its limits — a 120 kW rack at 48 V requires busbars rated at 2,900 A minimum, which is already physically marginal.

Generation 3: ±400 V DC (100–400 kW/rack)

±400 V does not mean the voltage oscillates between +400 V and −400 V (that would be AC). Its actual meaning is a three-wire system:

+400 V ────────────┐
                   │
   0 V ────────────┤  ← Neutral / reference conductor (optional)
                   │
−400 V ────────────┘
  • One positive bus (+400 V) at 400 V above ground
  • One neutral conductor (0 V) serving as reference
  • One negative bus (−400 V) at 400 V below ground

Transitional Advantage 1: 800 V Working Voltage, but Only 400 V to Ground

Electrical safety standards are based on voltage-to-ground, not line-to-line voltage. A ±400 V system delivers:

  • Equipment working voltage: 800 V (line-to-line) → enjoys the full benefit of high-voltage / low-current operation (copper halved)
  • Voltage-to-ground: only 400 V (each conductor to ground) → insulation requirements, protection design, and safety thresholds equivalent to a 400 V system

Insulation costs and current-related safety expenses are substantially lower than a unipolar 800 V system.

Transitional Advantage 2: Multi-Voltage Supply from a Single Bus

  • High-power loads (GPU compute units): tap across +400 V and −400 V for the full 800 V
  • Medium-power loads (fans, control boards): tap +400 V to 0 V, or 0 V to −400 V, for 400 V half-bus

A single distribution system delivers two voltage tiers simultaneously — greatly increasing flexibility.

Transitional Advantage 3: Fault Tolerance

If the +400 V bus faults, the −400 V side can continue to operate independently (at reduced power). A unipolar 800 V system loses everything when a single conductor fails.

Transitional Advantage 4: Neutral-Current Cancellation

When the positive and negative loads are balanced, the currents on the two buses are equal and opposite, so the net neutral-wire current approaches zero. This allows the neutral conductor to be significantly thinner than the main buses — or, in some configurations, eliminated entirely — saving additional copper.

NVIDIA's published data: 45% reduction in copper usage relative to a 48 V architecture at equivalent power levels.

Challenges

  • Enters the Low Voltage (LV) regulatory domain (> 60 V), requiring licensed electricians and more stringent insulation, clearance, and isolation design.
  • DC arc extinguishment is inherently harder than AC (AC naturally crosses zero 100 times per second, extinguishing arcs automatically; DC arcs burn continuously) — dedicated DC circuit breakers are required.
  • Electrical-shock risk increases materially, necessitating redesigned grounding and protection schemes.

Generation 4: 800 V DC (400 kW – 1 MW/rack)

NVIDIA's Vera Rubin platform moves directly to 800 V DC — this is the architecture prepared for the 2026–2027 GPU generation. At 800 V, 140 kW requires only 175 A, served by a single 3/0 AWG cable (roughly the diameter of kitchen-appliance wiring).

Why the jump from ±400 V to 800 V? Because post-Rubin generations (Feynman?) may push per-rack power directly to 600 kW – 1 MW, at which point even ±400 V becomes insufficient.

Where does 800 V DC come from? The electric vehicle industry. The Porsche Taycan, Hyundai's E-GMP platform, and Tesla's Cybertruck all adopted 800 V-class architectures to reduce charging times and copper mass. AI data centers are essentially borrowing a decade of EV electrical-engineering maturity — the same physics (high power, constrained space, copper cost) yields the same solution.

A Key Architectural Shift — The Integrated Power Shelf

Legacy server racks: Each 1U server carries 1–2 PSUs; a single rack may contain 80 PSUs, each independently performing AC-to-DC conversion.

GB200 NVL72 architecture:

  • The entire rack is powered by a small number of centralized power shelves performing AC-to-DC conversion
  • Rack-level BBU (Battery Backup Unit): Lithium-ion modules installed directly within the rack or at the end of the rack row
  • Converted DC is distributed via a busbar — a thick copper bar running the length of the rack — to all compute and switch trays
  • Individual compute nodes no longer carry their own PSUs; they draw power directly from the busbar

Benefits:

  • Higher conversion efficiency (large power supplies outperform small ones)
  • Simplified maintenance (power shelves are hot-swappable; a single-shelf failure does not affect others)
  • Recovered rack space (PSUs are no longer distributed across every server)
  • Combined with DC distribution, whole-rack efficiency improves from ~85% to ~95%+ (a rough stage-by-stage sketch follows below)
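A minimal sketch of that efficiency comparison, multiplying assumed stage efficiencies for each chain. The individual stage figures are illustrative assumptions chosen to land near the ~85% and ~95% endpoints quoted above, not measured values.

```python
# Multiplying series conversion-stage efficiencies: legacy per-server PSUs vs. rack power shelf.
# Stage values are illustrative assumptions, not vendor figures.
legacy_chain = {
    "UPS (double conversion)": 0.93,
    "PDU / branch distribution": 0.99,
    "Per-server PSU (AC -> 12 V)": 0.93,
}
power_shelf_chain = {
    "Power shelf rectifier (AC -> DC bus)": 0.975,
    "Rack busbar": 0.995,
    "Tray DC-DC (bus -> 12 V)": 0.98,
}

def chain_efficiency(stages):
    eff = 1.0
    for stage_eff in stages.values():
        eff *= stage_eff    # series stages multiply; every extra conversion costs a few percent
    return eff

print(f"Legacy chain:      {chain_efficiency(legacy_chain):.1%}")
print(f"Power-shelf chain: {chain_efficiency(power_shelf_chain):.1%}")
```

The structural point: removing the double-conversion UPS and the per-server PSU removes two of the lossiest multiplications in the chain.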

Investment Map Across the Electrical Architecture Stack

L2 Facility-Level Electrical Systems → Traditional Electrical-Equipment Majors' Home Turf

  • Schneider, Vertiv, Eaton, ABB, Huawei Digital Power
  • Barriers to entry: engineering capability, safety certifications, global service networks
  • Business model: project-based + recurring service fees

L3 Rack-Level Energy Storage (L2 Function Pushed Down to L3) → Rack BBU

The core catalyst is NVIDIA's designation of rack BBU as standard equipment on GB200 / GB300 NVL72, and the power-density step-up creating a standalone market for in-rack energy-storage modules.

Li-ion cell → BBU module → BBU system integration → Rack OEM integration → End customer
  (Cell)       (Pack + BMS)  (incl. PCS, controls)   (ODM / server vendor)   (Hyperscaler)

The greatest beneficiaries in this chain are upstream cell manufacturers and power / BBU system integrators, because:

  • Per-rack BBU value is $20K–$50K — a 5–10× uplift over traditional UPS per-rack value
  • Penetration is starting from near-zero (GB200 has only just entered volume production); the next 3–5 years represent the penetration-rate inflection
  • The technology moat lies in "high density + high safety + liquid-cooling compatibility," not commodity cell competition

L1 Self-Build Power Generation → Energy Companies' Home Turf

The core catalyst is North American grid-interconnection queues of 5–7 years and AI training campuses' demand for 100–500 MW-class power, forcing the emergence of "build your own power plant" models.

Gas turbine / reciprocating engine → BOP auxiliaries → EPC → Gas supply → Data center operator
     (Core prime mover)               (Support systems)  (Constructor)  (Upstream energy)

Gas turbines (scarcest — order books are booked out through 2028+):

  • GE Vernova (GEV): Global gas-turbine leader; exceptional order visibility; primary supplier for xAI Memphis, Meta, and other AI campuses
  • Siemens Energy (ENR.DE): European counterpart; equally saturated order book
  • Mitsubishi Heavy Industries (7011.T): Japanese competitor; H-class large-frame turbines are credible contenders

The Convergence: Two Emerging Verticals Funneling into the Same Company Profiles

Both the "L2 function push-down to L3" and the "L1 self-build push-up" trends ultimately converge on the same class of companies: Vertiv (VRT), Schneider Electric (SU.PA), Eaton (ETN), ABB (ABBN.SW).

These four are the full-stack integrators of data center mechanical-electrical systems:

  • L2 legacy business (UPS, switchgear, cooling) → stable cash-flow engine
  • L3 push-down business (rack BBU, CDU, busbar) → growth engine
  • L1 push-up business (microgrid integration, MV distribution) → new revenue vector

Vertiv (VRT) is the purest embodiment of this thesis over the past three years — spanning liquid cooling to BBU to MV distribution — the quintessential "picks-and-shovels" play for AI data center M&E.


Investment Lens on the Power Density Evolution

Three Generations Side by Side

| Architecture | Era | Per-Rack Power | Role |
| --- | --- | --- | --- |
| 48 V DC | 2016–2024 | < 100 kW | Previous-generation mainstream |
| ±400 V DC | 2025–2027 | 100–400 kW | Transitional |
| 800 V DC | 2027+ | 400 kW – 1 MW | Next-generation target |

±400 V sits in the middle — it is neither the technological endpoint (unipolar 800 V is cleaner) nor a greenfield architecture (it leverages a large body of mature low-voltage AC engineering). This posture — "one step forward but not quite all the way across" — is the hallmark of a transitional technology.

If the supply chain were ready, theory would dictate going straight to unipolar 800 V — simpler, more copper-efficient, and more headroom for future scaling. In practice, that leap is not yet possible, for reasons distributed across four layers.

Layer 1: Generational Thresholds in Electrical Safety Standards

Voltage is not a continuous variable in engineering standards — it operates as a stepped classification:

| Category | Range | Regulatory Implications |
| --- | --- | --- |
| Extra-Low Voltage (ELV) | < 60 V DC | Virtually no safety constraints; accessible by untrained personnel |
| Low Voltage (LV) | 60 V – 1,500 V DC | Requires licensed electricians, insulation-class requirements, mandatory grounding protection |
| High Voltage (HV) | > 1,500 V DC | Entirely separate engineering codes and dedicated equipment |

48 V falls within the ELV regime; 800 V falls within the LV regime — between these two domains lies a full chasm of regulation, training, and construction codes.

A direct jump from 48 V to 800 V would require:

  • Industry-wide electrician retraining
  • A complete rewrite of insulation, clearance, and grounding standards
  • Updates to IEC, UL, GB, and other safety-certification frameworks
  • Reassessment of risk pricing by insurance underwriters

None of this can be accomplished in one or two years. ±400 V sits in the lower portion of the LV domain, allowing the industry to acclimate to the "LV DC" paradigm while safety standards mature incrementally.

Layer 2: DC Circuit-Breaker Maturity

This is the hardest engineering bottleneck.

AC arc extinguishment comes "free"; DC arc extinguishment is fundamentally difficult. AC current passes through zero 100 times per second (at 50 Hz), so a breaker merely needs to open at the zero crossing and the arc self-extinguishes. DC has no zero crossing — once an arc strikes, it burns continuously until its energy is dissipated. Higher voltage means higher arc energy and harder extinguishment.

DC circuit-breaker maturity stratified by voltage:

| Voltage Level | DC Breaker Maturity | Commercialization Status |
| --- | --- | --- |
| 48 V DC | Fully mature | A few dollars each; ubiquitous |
| 400 V DC | Mature (borrowed from solar PV and EV) | In commercial use 5+ years |
| 800 V DC | Recently commercialized | Sourced from latest EV platforms; expensive, limited selection |
| 1,500 V DC | Solar-PV domain only | No rack-level products exist |

The advantage of ±400 V: each bus conductor is 400 V to ground, so the system can use existing, proven 400 V DC breakers — devices that have been validated at scale in the solar-PV industry (string voltages typically 600–1,000 V, module-level voltages 400–500 V) and the EV industry (most EV platforms to date are 400 V-class systems). Pricing is already competitive.

A direct move to unipolar 800 V would require breakers rated for 800 V to ground — devices that are currently scarce, expensive, and unproven in large-scale data center environments.

Layer 3: Voltage Withstand of Server-Side Power Semiconductors

This layer is often overlooked in analysis. The power shelf in the rack and the DC-DC converters inside the server ultimately need to step the high-voltage DC bus down to < 1 V for the die. The critical devices in this step-down chain are power semiconductors (MOSFETs, SiC, GaN), each with a rated voltage ceiling.

| Device | Mainstream Voltage Rating | Data Center Application |
| --- | --- | --- |
| Silicon MOSFET | 100 V – 600 V | Workhorse for 48 V / 400 V systems |
| SiC MOSFET | 650 V / 1,200 V / 1,700 V | Core device for 400 V / 800 V systems |
| GaN HEMT | 100 V – 650 V | High-frequency applications; complements SiC |

1,200 V SiC power devices have only reached true volume production with acceptable pricing in the past two years. For a ±400 V power shelf, a 1,200 V SiC is sufficient (800 V line-to-line + safety margin ≈ 1,200 V).

A unipolar 800 V system, however, presents 800 V to ground; transient voltages at certain internal nodes can spike to 1,500 V+, necessitating 1,700 V or even 2,000 V SiC — devices that only began shipping in volume in 2025, with yields and costs still unfavorable.

Deploying ±400 V first to build market volume, then allowing 1,700 V SiC a 2–3 year maturation window — this is the industry's actual cadence.

Layer 4: Electrician Skill Sets & Construction Codes

This layer is "soft" but critical. The global data center construction workforce has spent the past 30 years on low-voltage AC distribution (380 V / 480 V); 48 V DC is manageable because of its low voltage. Suddenly asking these teams to install 800 V DC systems means:

  • AC conductor-sizing rules no longer apply
  • Grounding systems must be redesigned (IT / TT / TN grounding schemes have different implications under DC)
  • Maintenance safety procedures change fundamentally (DC cannot rely on "wait for zero crossing before opening")
  • Fault-diagnostic instruments and methodologies must be replaced entirely

±400 V, in many construction details, borrows from 480 V three-phase AC conventions (conductor gauge, grounding topology, and insulation class are nearly aligned), making the learning curve for construction crews significantly gentler. This is an advantage that unipolar 800 V does not share.

The Strategic Conclusion

±400 V is a near-perfect compromise — capable of supporting the current GB200 / GB300 generation (120–200 kW/rack) while also handling the next GPU generation (300–400 kW/rack), extending infrastructure useful life to 5–7 years. By the time per-rack power pushes past 600 kW in 2027–2028, 800 V standards, SiC devices, and construction codes will all have matured — making the 800 V transition a natural progression.

But if ±400 V Is Merely Transitional — Investment Implications

  • ±400 V-specific equipment, chips, and cabling will enjoy a 3–5 year sweet spot, but the ceiling is capped — capacity and pricing will be compressed by the post-2027 wave of 800 V adoption.
  • The true long-term beneficiaries are "voltage-agnostic" suppliers — companies whose power shelves, busbars, breakers, and SiC devices work across both 400 V and 800 V, harvesting the transitional dividend and seamlessly pivoting into the 800 V era.
  • Pure 48 V players will be rapidly displaced: If a company's core product for the past decade has been 48 V power shelves, it has roughly 2 years to pivot or face marginalization.

The Most Telling Signal Points

  • NVIDIA's own roadmap is the clearest tell: GB200 ships with ±400 V; Vera Rubin moves directly to 800 V — an explicit signal to the supply chain that the transitional window is roughly 2 years.
  • Schneider, Vertiv, and Delta Electronics are all developing dual product lines (±400 V and 800 V simultaneously), not betting on the transitional architecture itself.
  • SiC leaders (Wolfspeed, Infineon, ROHM) are concentrating capital investment on 1,700 V capacity rather than 1,200 V — they are positioning for the 800 V era.

Network Interconnects & Modular Architecture Trends

Spine-Leaf Architecture as the De Facto Standard

1. North-South vs. East-West Traffic

North-South traffic: Traffic flowing between external users and the data center. For example, a user opening a web application — the request travels from the phone into the data center, locates the server, and returns the page. This is north-south.

East-West traffic: Traffic flowing among servers within the data center. For example, a web application server querying a database, calling a recommendation service, and reading a Redis cache — all of these exchanges occur entirely inside the data center.

The evolution of traffic ratios:

| Era | North-South | East-West | Dominant Workload |
| --- | --- | --- | --- |
| 2000s | ~80% | ~20% | Monolithic applications, static web pages |
| 2010s | ~50% | ~50% | Virtualization, early microservices |
| Modern (2020s) | < 30% | > 70% | Microservices, distributed databases |
| AI training clusters | < 5% | > 95% | Inter-GPU gradient synchronization |

Why the dramatic shift? Because application architecture moved from monolithic to distributed. A single user request may traverse 50 microservices, query 10 databases, and hit a cache 100 times — all east-west traffic.

In AI training the ratio becomes even more extreme — training a model like GPT-4 generates 95%+ network traffic as inter-GPU gradient-synchronization communication, with near-zero north-south traffic.

This shift single-handedly killed the traditional three-tier architecture — a topology purpose-built to optimize north-south flows.

2. Why the Traditional Three-Tier Architecture Failed

The legacy data center network was a tree-structured Core → Aggregation → Access hierarchy:

           ┌──────────┐
           │   Core   │  ← Core layer (2–4 very large switches at the apex)
           └────┬─────┘
                │
        ┌───────┴────────┐
        │                │
   ┌────┴────┐      ┌────┴────┐
   │ Aggreg  │      │ Aggreg  │  ← Aggregation layer
   └────┬────┘      └────┬────┘
        │                │
    ┌───┴───┐        ┌───┴───┐
    │Access │        │Access │  ← Access layer (server-facing)
    └───────┘        └───────┘

This architecture has three fatal shortcomings:

Problem 1: East-west traffic is forced to "take the long way around"

Server A attached to the left access switch wants to reach Server B on the right — traffic must climb all the way to the core and back down to the opposite access switch, traversing 4–6 hops minimum. Each hop adds several microseconds of latency.

Problem 2: Severe bandwidth oversubscription

An access switch may have 48 × 10G = 480 Gbps of downlink capacity, but only 4 × 10G = 40 Gbps of uplink — a 12:1 oversubscription ratio. This was acceptable when north-south traffic dominated (user requests never saturated 480G simultaneously), but east-west bursts instantly congest the uplinks.

Problem 3: STP blocks half the links

To prevent loops, the legacy architecture ran STP (Spanning Tree Protocol), which proactively blocks redundant links, leaving only a single active path. The result: you paid for 8 uplinks, but only 4 are operational — the other 4 "sleep" until a primary link fails.

Bandwidth utilization is abysmal.

3. How Spine-Leaf Solves It: The Elegance of the Clos Network

Charles Clos published his multi-stage non-blocking topology for telephone switching networks in 1953 — rediscovered by data centers 70 years later. The core idea: use only two tiers, but connect every Leaf to every Spine.

Spine tier:   [S1]    [S2]    [S3]    [S4]
               │╲╲╲╲  ╱╱╱╱╲╲╲╲  ╱╱╱╱│
               │ ╲╲╲╲╱╱╱╱  ╲╲╲╲╱╱╱╱ │
               │  ╱╱╱╱╲╲╲╲  ╱╱╱╱╲╲╲╲│
               │ ╱╱╱╱╲╲╲╲  ╱╱╱╱╲╲╲╲ │
Leaf tier:    [L1]    [L2]    [L3]    [L4]
               │       │       │       │
             Servers  Servers Servers Servers

Key structural properties:

  • Every Leaf connects to every Spine — forming a complete bipartite graph
  • Servers connect only to Leaves; Spines do not interconnect
  • Leaves do not interconnect with each other

This structure produces three fundamental improvements:

Improvement 1: Any two servers are at most 2 switch hops apart

Server A → Leaf1 → Spine2 → Leaf3 → Server B. Always the same Leaf–Spine–Leaf path (three switches, two inter-switch hops), yielding predictable, stable latency.

Improvement 2: ECMP puts all links to work simultaneously

ECMP (Equal-Cost Multi-Path): Leaf1 has 4 uplinks to 4 Spines? The routing protocol (BGP / OSPF) hash-distributes traffic across all 4, with every link operating at capacity. No STP "sleeping links."

Bandwidth utilization jumps from ~50% to ~95%+.

Improvement 3: Horizontal scale-out

Need more bandwidth? Add Spines. Need more ports? Add Leaves. The two dimensions scale independently, unlike the three-tier model, where a core layer hitting its limit forced a full-architecture teardown and rebuild.

4. What Is the Scale Ceiling?

Two-tier Clos (standard Spine-Leaf) — Theoretical limit:

Assume 64-port 800G switches at both tiers, with each Leaf splitting its ports evenly: 32 toward servers and 32 toward Spines (non-blocking):

  • Spine count = 32 (every Leaf needs one uplink to each Spine)
  • Leaf count = 64 (every Spine dedicates one port to each Leaf)
  • Each Leaf serves 32 downstream servers
  • Total: 64 Leaves × 32 = 2,048 non-blocking server ports; practical designs stretch this to a few thousand servers by applying oversubscription and headroom (see the sizing sketch below)
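The port arithmetic behind that ceiling, as a minimal sketch; the 64-port radix and the even split come from the example above, and the closing remark about oversubscription is an assumption about common practice.

```python
# Two-tier Clos sizing from switch radix (the 64-port example above).
RADIX = 64                      # ports per switch, both tiers

leaf_up = RADIX // 2            # non-blocking: half the leaf ports go up to spines
leaf_down = RADIX - leaf_up     # the other half face servers

spines = leaf_up                # every leaf needs one uplink to each spine
leaves = RADIX                  # every spine dedicates one port to each leaf
server_ports = leaves * leaf_down

print(f"Spines: {spines}, Leaves: {leaves}, non-blocking server ports: {server_ports}")
# Practical designs relax this: oversubscribing the leaf uplinks (e.g. 3:1) and using
# higher-radix spines push the host count into the several-thousand range, at the cost
# of contended east-west bandwidth.
```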

Five-stage Clos (also called three-tier Clos / Super-Spine architecture):

When two tiers are insufficient, a third "Super-Spine" tier is added, interconnecting multiple Spine-Leaf modules (called Pods):

              Super-Spine (top tier)
             ╱      │      │      ╲
        Spine-A  Spine-B  Spine-C  Spine-D
           │       │       │       │
         Leaves  Leaves  Leaves  Leaves
           │       │       │       │
        Servers Servers Servers Servers
        (Pod 1)  (Pod 2)  (Pod 3)  (Pod 4)

Why "five stages"? Because from source to destination a packet traverses 5 switching elements: Leaf → Spine → Super-Spine → Spine → Leaf.

This architecture supports 100,000+ hosts — the scale at which Meta, Google, and xAI operate their largest clusters. In AI training clusters, each Pod typically corresponds to a "GPU training compartment" (e.g., 1,024 GPUs), with inter-Pod communication traversing the Super-Spine.

5. The Routing Protocol Stack: eBGP + EVPN + VXLAN

Why three protocols? Because a modern data center network must solve three distinct problems.

Problem A: How do switches discover each other? → eBGP (Underlay)

Underlay = the physical network layer. Leaves and Spines need a routing protocol to exchange "where I am and what I can reach" information.

Historically OSPF was common, but modern data centers overwhelmingly choose eBGP (External BGP), for these reasons:

  • BGP was designed as the Internet backbone protocol — it natively supports massive scale
  • Configuration assigns each switch its own AS (Autonomous System) number, delivering excellent fault isolation
  • Unifies with WAN routing protocols, producing a consistent operations model
  • Native, mature ECMP support

This is the basis for the statement that "eBGP is the dominant Underlay protocol."

Problem B: How do VMs / containers communicate across subnets? → VXLAN (Overlay)

Overlay = a logical network layered on top of the physical network.

A modern physical server runs dozens of VMs or containers, potentially belonging to different tenants and subnets. VXLAN (Virtual Extensible LAN) encapsulates Ethernet frames inside UDP packets, allowing VMs to behave as if they share a single VLAN while the underlying physical network can be any topology.

Analogy: VXLAN is like a shipping container — it packages diverse parcels (VM traffic) into standardized containers (VXLAN packets) for transport across the physical network, then unpacks them at the destination.
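To make the "container" concrete, a minimal sketch that byte-packs the 8-byte VXLAN header (per RFC 7348: an 8-bit flags field, a 24-bit VNI, and reserved bits) around a dummy inner Ethernet frame. The addresses and payload are made up, and no sockets or real underlay are involved.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned UDP port for VXLAN

def vxlan_encap(inner_frame: bytes, vni: int) -> bytes:
    """Prepend the 8-byte VXLAN header: flags (I bit set), reserved bits, 24-bit VNI."""
    flags = 0x08                                      # "I" flag: the VNI field is valid
    header = struct.pack("!B3xI", flags, vni << 8)    # VNI sits in the top 24 bits of the last word
    return header + inner_frame                       # this payload then rides inside UDP/IP on the underlay

# A dummy inner Ethernet frame (broadcast dst MAC, made-up src MAC, IPv4 ethertype, toy payload)
inner = bytes.fromhex("ffffffffffff" "0242ac110002" "0800") + b"hello-from-vm-a"
packet = vxlan_encap(inner, vni=5001)

print(f"VXLAN header: {packet[:8].hex()}  (VNI=5001), carried in UDP dst port {VXLAN_UDP_PORT}")
```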

Problem C: Which VM is on which server? → EVPN (Control Plane)

VXLAN solves "how to encapsulate" but not "where to send." EVPN (Ethernet VPN) is VXLAN's "address book" — each Leaf uses EVPN to announce to all other Leaves: "I have VM-A and VM-B attached; their MAC/IP addresses are xxx."

How the three work together:

Application:   VMs / Containers
                ↕ (transparent)
Overlay:       VXLAN (data encapsulation) + EVPN (control signaling)
                ↕ (runs on top of the physical network)
Underlay:      eBGP (physical routing)
                ↕
Physical:      Spine-Leaf switches + fiber optic cabling

This stack has been the "standard answer" for data center networking over the past decade — the underlying network architecture of AWS, Azure, and Alibaba Cloud all follows this paradigm.

6. The Data Center Switching ASIC Landscape: Broadcom Dominance and the Breakout Attempts

Broadcom Tomahawk 5: The Reigning Champion

Broadcom Tomahawk 5 (51.2 Tbps, 64 × 800 GbE) — released in 2023 — is the flagship switching ASIC and the de facto standard for 800G-class data center switches.

  • 51.2 Tbps: Single-chip aggregate switching capacity
  • 64 × 800 GbE: 64 ports of 800G Ethernet
  • Commands 60%+ share of the merchant data center switching-ASIC market

The next-generation Tomahawk 6 has been announced with bandwidth doubling to 102.4 Tbps — 2026 marks the inaugural year for 1.6T-port switching.

Three Vendor Strategies

Strategy A: Buy Broadcom silicon, build the box

  • Arista (ANET): Uses Tomahawk 5; differentiates through its EOS operating-system software. The "Arista 7060X6 series" exemplifies this path. Holds 18.9% of the data center switching market — second only to Cisco.
  • White-box switch ecosystem: ODMs such as Edgecore, Celestica, and Quanta sell bare Broadcom-powered hardware; customers install their own network OS (SONiC, Cumulus). Microsoft, Meta, and Google are major white-box buyers.

Strategy B: In-house ASIC development

  • Cisco Silicon One G300: Cisco's proprietary ASIC powering high-end Nexus 9000 models. Recognizing the strategic risk of Broadcom dependency, Cisco has invested billions over the past five years in custom silicon.
  • NVIDIA Spectrum-X: Born from the Mellanox acquisition — an AI-network-optimized ASIC pursuing both InfiniBand and Ethernet markets.
  • Marvell Teralynx 10: Broadcom's largest merchant-silicon competitor; adopted by AWS and other hyperscalers.

Strategy C: Hyperscaler captive ASICs

  • Google: Proprietary Aquila networking silicon
  • Amazon: Proprietary network ASIC (SiCortex lineage)
  • Meta: Internal white-box program

The high-end data center switching-ASIC market is dominated by a single player — Broadcom — and everyone else is attempting to break out.


400G / 800G Optical Interconnects Entering Large-Scale Deployment

AI Data Center Optical Interconnect Market (2024: $9B → sustained high growth)
 ├─ Product Generation:  100G → 400G → 800G → 1.6T
 └─ Technology Route:
       ├─ SiPh:  projected ~50% market share by 2027
       ├─ CPO:   hyperscalers' first choice, 2026+
       └─ LPO:   transitional solution, 2024+

Short-distance data transport has two paths: copper and fiber.

The physics of copper transmission:

  • Signals propagate as electromagnetic waves in copper; attenuation rises steeply with speed
  • At 100 Gbps on copper, the effective reach is only 3–5 meters
  • At 200 Gbps this shortens to 1–2 meters
  • At 400 Gbps copper is essentially unusable — unless DAC (Direct Attach Copper) is employed at extremely short distances

The physics of fiber:

  • Signals are light pulses propagating through glass fiber at near-light speed
  • Attenuation is minimal — single-mode fiber can run 10 km with very low loss
  • Bandwidth is virtually unlimited (theoretically reaching multi-Tbps)

Data center reality:

  • Intra-rack (< 2 m): copper / DAC (cheapest)
  • Inter-rack (within a row, meters to tens of meters): fiber is mandatory (multimode fiber + optical transceiver)
  • Inter-Pod / inter-hall (tens to hundreds of meters): fiber is mandatory (single-mode fiber)

In AI clusters, GPUs must communicate across racks — a single NVL72 holds only 72 GPUs, but training a GPT-4-class model requires 10,000+ GPUs. This means massive inter-rack communication, all carried over fiber.

What an Optical Transceiver Does

An optical transceiver (optical module) performs one conceptually simple but engineering-extreme task — converting electrical signals into optical signals for transmission, and converting incoming optical signals back to electrical signals.

Physical structure:

Switch ASIC ──electrical──→ [Optical Transceiver]
                             ├── Modulator (E→O) ── fiber ──→
                             └── Detector  (O→E) ←── fiber ←──
                           ──electrical──→ Switch ASIC

Each transceiver plugs into a port on a switch or GPU NIC, bridging racks via a fiber-optic cable. A 51.2 Tbps switch with 64 × 800G ports requires 64 × 800G optical transceivers.

Transceiver count in a GPU cluster:

For a 100,000-GPU training cluster, under typical architecture:

  • Each GPU requires 4–8 outbound connections (NVLink Switch / Scale-out)
  • All inter-rack segments use optics
  • Total transceiver count: hundreds of thousands to over one million units

At a unit price of $700–$1,500 per 800G transceiver, a single cluster's optics alone represent a several-hundred-million to multi-billion-dollar market. This is why optical transceivers are among the most certain high-growth segments in AI infrastructure.
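A back-of-envelope version of that sizing; the GPU count, per-GPU link range, and unit price come from the text, while the specific per-GPU link count and the two-transceivers-per-link simplification (ignoring additional spine-tier links) are assumptions.

```python
# Back-of-envelope optics bill for a large GPU training cluster.
gpus = 100_000
links_per_gpu = 6                 # within the 4-8 scale-out connections per GPU cited above
ends_per_link = 2                 # a transceiver at each end of every fiber link
price_800g = 1_000                # USD, mid-range of the $700-$1,500 figure above

transceivers = gpus * links_per_gpu * ends_per_link
market_value = transceivers * price_800g

print(f"Transceivers: ~{transceivers:,}")
print(f"Optics spend for one cluster: ~${market_value/1e9:.1f}B")
```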

Generational Progression

| Generation | Mainstream Era | Per-Port Data Rate | Module Power | Approx. Unit Price |
| --- | --- | --- | --- | --- |
| 100G | 2018–2022 | 100 Gbps | 3.5–5 W | $200–400 |
| 400G | 2023–2025 | 400 Gbps | 8–12 W | $500–1,000 |
| 800G | 2024–2026 | 800 Gbps | 14–20 W | $700–1,500 |
| 1.6T | 2025–2027 | 1,600 Gbps | ~25–30 W | $1,500–3,000 |
| 3.2T | 2027+ | 3,200 Gbps | TBD | TBD |

"400G delivers 4× the bandwidth at 2.5–3× the module cost of 100G" — this is the signature of a generational leap: each new generation typically delivers 4× bandwidth for only 2–3× the cost increase, so cost-per-Gbps declines 30–50%. This is why data centers upgrade rapidly once a new generation matures.

Per-Gbps power consumption falls, but absolute module power keeps rising (5 W → 15 W → 25 W). This creates a new engineering problem — the optical transceivers themselves become major heat sources. A 64-port 800G switch dissipates 64 × 18 W ≈ 1.15 kW in transceiver power alone — more than the switching ASIC itself.
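The per-Gbps economics and the faceplate-power arithmetic above, as a short sketch using representative values taken from the ranges in the table:

```python
# Cost- and power-per-Gbps across generations, plus faceplate optics power for one switch.
# Prices and powers are representative points inside the ranges quoted above.
generations = {          # data rate (Gbps), price (USD), module power (W)
    "100G": (100, 300, 4),
    "400G": (400, 700, 10),
    "800G": (800, 1_000, 17),
}

for name, (gbps, price, power_w) in generations.items():
    print(f"{name}: ${price / gbps:.2f}/Gbps, {power_w / gbps * 1000:.0f} mW/Gbps")
# 1.6T is omitted here because early-cycle pricing has not yet delivered its per-Gbps decline.

ports, watts_per_module = 64, 18
print(f"64 x 800G faceplate optics power: ~{ports * watts_per_module / 1000:.2f} kW")
```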

Packaging Formats: OSFP / QSFP-DD / OSFP-XD

An optical transceiver's "physical shell + electrical interface" standard is called its form factor. At the same 800G data rate, different form factors yield different thermal and density characteristics.

QSFP-DD (Quad Small Form-factor Pluggable Double Density)

  • The traditional mainstream data center form factor
  • Compact form factor, high port density (more modules per switch faceplate)
  • Drawback: Limited thermal envelope — at 800G, temperatures can exceed safe thresholds

OSFP (Octal Small Form-factor Pluggable)

  • Slightly larger than QSFP-DD, with an integrated metal heat-sink fin
  • 15 °C cooler than QSFP-DD — a margin sufficient for reliable 800G long-duration operation
  • Therefore the preferred form factor for AI high-density 800G deployments — NVIDIA Quantum / Spectrum platforms use OSFP

OSFP-XD (OSFP eXtended Density)

  • The 1.6T evolution of OSFP; each module can be configured as 2 × 800G interfaces or a single 1.6T interface
  • The statement that "92% of hyperscaler contracts" have aligned on OSFP-XD refers to the 1.6T-era form-factor consolidation among leading operators

The form-factor debate in essence: QSFP-DD is smaller but thermally constrained; OSFP is somewhat larger but can reliably sustain 800G / 1.6T. AI clusters prioritize reliability over density — a single transceiver failure breaks an entire GPU communication chain — so OSFP wins.

Technology Routes for Power and Cost Reduction

Route A: Silicon Photonics (SiPh)

Conventional transceivers use indium phosphide (InP) or gallium arsenide (GaAs) lasers — expensive processes with limited wafer-scale capacity.

Silicon Photonics integrates optical devices directly onto a silicon die — fabricating photonic elements using CMOS semiconductor processes.

Advantages:

  • Leverages existing large-scale semiconductor fabs (TSMC, Intel can both produce them) — costs decrease and capacity elasticity increases
  • High integration density (multi-channel optical paths on a single silicon die)
  • Projected to capture 50% of the optical transceiver market by 2027

Key players:

  • Intel: Earliest mover in SiPh, though commercialization has lagged expectations in recent years
  • Zhongji Innolight (300308.SZ): "Shipping SiPh-based modules at volume starting Q2" — SiPh becomes the core technology path for its 1.6T portfolio
  • Coherent / Lumentum: Legacy optical-component giants transitioning to SiPh
  • Ayar Labs: CPO-focused SiPh startup with a valuation exceeding $1B

Route B: CPO (Co-Packaged Optics)

The most radical route — soldering the optical engine directly adjacent to the switching ASIC.

Conventional topology (pluggable transceivers):

ASIC ── electrical signal (long distance, high power) ── port ── [Pluggable Transceiver] ── fiber

CPO topology:

ASIC ── optical engine (soldered directly beside the die) ── fiber

The electrical-signal path from ASIC to photonic device shrinks from centimeters to millimeters, drastically reducing electrical attenuation. Power drops from ~15 pJ/bit to ~5 pJ/bit (a 65–73% reduction).
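Converting the energy-per-bit figures into switch-level power makes the savings concrete (a sketch assuming the full 51.2 Tbps is active; the pJ/bit values are the ones quoted above):

  # Electrical I/O power of a 51.2 Tbps switch at the quoted energy-per-bit levels.
  bits_per_second = 51.2e12
  for label, pj_per_bit in [("pluggable, ~15 pJ/bit", 15e-12),
                            ("CPO,        ~5 pJ/bit",  5e-12)]:
      print(f"{label}: {bits_per_second * pj_per_bit:.0f} W")
  # ~768 W vs ~256 W: roughly 500 W saved per 51.2T switch, consistent with
  # the "several hundred watts" figure below.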

Advantages:

  • Major power reduction (saves several hundred watts per 51.2T switch)
  • Higher density (eliminating the pluggable cage frees PCB real estate)
  • Better signal integrity (short-distance electrical transmission)

Disadvantages:

  • Not field-replaceable — if the optical engine fails, the entire switch must be swapped
  • Maintenance paradigm changes completely (data center operations teams must be retrained)
  • Manufacturing process is extremely complex (heterogeneous opto-electronic integration)

If CPO scales, 30–50% of traditional pluggable transceiver demand could be displaced — transceiver vendors must pivot to CPO optical-engine manufacturing or face marginalization. Both Zhongji Innolight and Eoptolink are racing to secure this segment.

Route C: LPO (Linear Pluggable Optics)

The "dark horse" route that emerged suddenly in 2023–2024.

Conventional transceivers include a DSP (Digital Signal Processor) chip responsible for signal compensation, error correction, and equalization — the DSP accounts for 50% of the transceiver's power consumption.

LPO's approach: Remove the DSP entirely and let the switch ASIC's electrical output run "bare" to the photonic device through linear analog drivers, relying on the host-side SerDes for equalization.

Result: 800G transceiver power drops from 13 W to < 4 W (a 70% reduction).

Advantages:

  • A brute-force power-savings approach — no CPO-style manufacturing revolution required
  • Retains pluggability (compatible with existing data center architectures)
  • Lower cost (eliminates a DSP die)

Disadvantages:

  • Effective only at short distances (< 100 m) — insufficient signal compensation for longer reaches
  • Imposes higher signal-quality requirements on the ASIC (DSP's former workload shifts to the ASIC side)

LPO vs. CPO:

Dimension LPO CPO
Magnitude of change Small (DSP removal) Large (packaging overhaul)
Power reduction ~70% ~65–73%
Pluggability Preserved Sacrificed
Commercialization timeline 2024–2025, already deployed 2026+
Best-fit scenario Intra-data-center Hyperscale deployments

LPO is the "transition within the transition" — a low-risk solution filling the 2–3 year window before CPO fully matures.


DCI Optics: Pluggable Coherent Technology Reshaping the Interconnect Landscape

The optical transceivers discussed above operate inside a data center — switch-to-switch, GPU-to-GPU, rack-to-rack (meters to hundreds of meters). DCI (Data Center Interconnect) transceivers operate between data centers — linking facilities separated by kilometers to thousands of kilometers (80–2,000 km).

Dimension Intra-DC Transceivers DCI Transceivers
Growth driver AI intra-cluster bandwidth demand Cross-campus training + cloud interconnect
Market size $9B (2024) $10.7B (2024)
Growth rate ~25–30% ~13% overall; high-speed segment 145%
Concentration Moderate (top 5 hold ~60%) High (top 3 hold 70%+)
Chinese market influence Dominant Relatively weaker
Margin profile High (gross margin 30–40%) Moderate (gross margin 25–30%)
Primary beneficiaries Zhongji Innolight, Eoptolink, O-Net Ciena, Marvell, Cisco

Why DCI Is Fundamentally Harder

Optical signals attenuate in fiber — much more slowly than in copper, but cumulatively over distance:

Distance Attenuation Required Handling
100 m < 0.5 dB Direct detection
10 km ~3 dB Simple detection
80 km ~16 dB Optical amplification required
1,000 km ~200 dB (direct transmission impossible) Multi-stage amplification + coherent detection

DCI's core technical challenge: how to faithfully reconstruct a signal after it has traveled hundreds to thousands of kilometers.
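To see why, convert the dB figures above into linear power ratios (a quick sketch; the table's loss values roughly correspond to standard single-mode fiber at ~0.2 dB/km):

  # Linear power ratio implied by each attenuation figure in the table.
  for km, db in [(0.1, 0.5), (10, 3), (80, 16), (1_000, 200)]:
      ratio = 10 ** (db / 10)
      print(f"{km:>7g} km: {db:>5g} dB -> received power = launch power / {ratio:.3g}")
  # At 200 dB the signal is down by a factor of 1e20: impossible to bridge in one
  # span, hence multi-stage amplification and coherent detection with heavy DSP.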

Intra-DC vs. DCI Transceiver Comparison

Dimension Intra-DC Transceivers DCI Transceivers
Typical distance 100 m – 2 km 80 km – 2,000 km
Typical data rate 100G / 400G / 800G / 1.6T 100G / 400G / 800G / 1.2T
Key technology NRZ / PAM4 direct modulation Coherent modulation + DSP
Modulation scheme Simple (intensity modulation) Complex (phase + amplitude + polarization)
Price $500 – 1,500 $5,000 – 50,000
Power 5 – 25 W 15 – 25 W
DSP complexity Simple or none (LPO) Extremely complex (core value proposition)
Typical customer Data center operator Cloud provider + telecom carrier
Market size 2024 ~$9B ~$10.7B
Core players Zhongji Innolight, Eoptolink, Coherent Ciena, Cisco, Huawei, Acacia

Key differentiators:

  • DCI modules cost 10–50× more — because DSP complexity is on a different order of magnitude
  • DCI is telecom-grade technology migrating into the data center — legacy players are Ciena, Nokia, and Huawei — telecom equipment vendors
  • The core value in a DCI module resides in the DSP die — the optical-component portion is relatively standardized

IP-over-DWDM: The Architectural Simplification Revolution

Legacy DCI Architecture

Data Center A
  ┌───────────┐         ┌────────────────┐         ┌────────────┐
  │  Router / │ ─gray─→ │  Transponder   │ ─color─→│    DWDM    │ ──→ Fiber ──→
  │  Switch   │  optic   │  Chassis       │  optic  │  equipment │
  └───────────┘         │  (proprietary  │         └────────────┘
                        │   coherent)    │
                        └────────────────┘
   IP / Ethernet layer    Coherent optical layer     Optical transport layer

Three independent equipment layers — each from a different vendor, each with its own management plane, each consuming separate floor space.

IP-over-DWDM Architecture

Data Center A
  ┌─────────────────┐         ┌────────────┐
  │  Router / Switch│ ─color─→│    DWDM    │ ──→ Fiber ──→
  │  (ZR module     │  optic  │  equipment │
  │   plugged in)   │         └────────────┘
  └─────────────────┘
   IP + coherent unified       Optical transport layer

A ZR coherent module plugs directly into the router port — the entire transponder-chassis layer disappears.

ZR is the "pluggable coherent module standard" family defined by OIF (Optical Internetworking Forum). Previously, DCI required purchasing Ciena's (or equivalent) proprietary transponder chassis — a rack-sized appliance housing proprietary coherent technology, physically separate from the switch.

The ZR standard's revolutionary impact:

  • Multi-vendor interoperability: The OIF standard allows ZR modules from different vendors to be mixed
  • Direct insertion into switch / router ports: No standalone transponder chassis required

The resulting benefits:

  • CapEx savings of 28–38%: Eliminating an entire equipment tier
  • Space savings of 50%+: The transponder room is no longer needed
  • Power reduction of 30–40%: One fewer OEO (optical-electrical-optical) conversion stage
  • Unified operations: Router and optical layers managed within a single control system

OLS (Open Line System) — The Next Wave

OLS opens up the DWDM optical layer itself through standard, disaggregated interfaces, enabling hyperscalers to mix and match equipment from different vendors:

  • Amplifiers from Vendor A
  • ROADMs (Reconfigurable Optical Add-Drop Multiplexer) from Vendor B
  • ZR modules from Vendor C
  • SDN control system from Vendor D

The entire DCI industry is undergoing the same "white-boxing" process that data center switching experienced:

Phase Switching Industry DCI Industry
1.0 Black Box Cisco winner-take-all Ciena / Huawei turnkey solutions
2.0 Merchant Silicon Broadcom + multiple ODMs Multi-vendor ZR modules
3.0 White Box + Open Source SONiC + white-box switches OLS + open management

The moats of legacy DCI giants like Ciena are being rapidly eroded by OIF standardization and the bargaining power of hyperscale buyers — this is the most consequential structural shift in the DCI industry.


Evaluation Framework: Data Center Acquisition Due Diligence

Due diligence follows the L1 → L2 → L3 → L4 sequence:

  1. L1: Confirm the supply ceiling and scalability of power, land, water, and connectivity
  2. L2: Assess the age and retrofit headroom of M&E systems (liquid-cooling readiness)
  3. L3: Evaluate rack / Pod power-density ceiling and network topology
  4. L4: Assess the generational currency of compute hardware (AI-readiness)

0. Executive Summary

Project Name:
Asset Type: (Wholesale / Retail Colo / Enterprise Conversion / AI Training / Edge)
Location:
Seller:
Deal Structure: (Equity / Asset Acquisition / JV)
Asking Price:
Recommendation: Proceed / Proceed with Conditions / Reprice / Pass

Investment Thesis

  • One-line investment rationale:
  • One-line kill risk:
  • Value-creation pathway: (Capacity expansion / Liquid-cooling upgrade / Tenant optimization / Pricing uplift / Operational efficiency)
  • Top 3 questions for the Investment Committee:
    1.
    2.
    3.

Evaluation Methodology

A. Fatal-Flaw Gate

  • If any single Fatal Flaw item fails, the default recommendation is Pass.

B. Four Core Scoring Dimensions

  • As-Is Quality: 35%
  • To-Be Upgrade Value: 30%
  • Execution Certainty: 20%
  • Demand Fit: 15%

C. Valuation — Standalone Module
Valuation is decoupled from the technical scorecard and assessed independently:

  • Does the current bid already price in expansion / liquid-cooling / AI premium?
  • How much incremental CapEx is required to unlock future value?
  • Does the risk-adjusted IRR hold?
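A minimal sketch of how the Fatal-Flaw gate and the four weighted dimensions above can be wired together (the weights come from section B; the function name and the example sub-scores are hypothetical):

  # Fatal-Flaw gate first, then the weighted composite of the four dimensions.
  WEIGHTS = {"as_is": 0.35, "to_be": 0.30, "execution": 0.20, "demand_fit": 0.15}

  def composite_score(fatal_flaw_pass, scores):
      """Weighted 0-100 composite, or None when any Fatal-Flaw item fails."""
      if not fatal_flaw_pass:
          return None        # default recommendation: Pass (or restructure / reprice)
      return sum(WEIGHTS[k] * scores[k] for k in WEIGHTS)

  score = composite_score(True, {"as_is": 72, "to_be": 80, "execution": 65, "demand_fit": 70})
  print(round(score, 1))     # 0.35*72 + 0.30*80 + 0.20*65 + 0.15*70 = 72.7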

1. Deal Snapshot

1.1 Asset Profile

  • Asset name:
  • City / Campus:
  • Year built / Phasing dates:
  • Gross building area:
  • White-space area:
  • Land area:
  • Tenure: Freehold / Leasehold / Hybrid
  • Current use: Wholesale / Retail / Enterprise captive / Hybrid
  • Target use: Maintain current profile / AI upgrade / Conversion

1.2 Capacity & Delivery

  • Contracted utility power (MW):
  • Actual deliverable IT capacity (MW):
  • Currently utilized IT capacity (MW):
  • Remaining available capacity (MW):
  • Expansion capacity (MW):
  • Current prevailing rack density:
  • Maximum supportable rack density:
  • Current cooling modality:
  • Liquid-cooling readiness (L0–L5):

1.3 Commercial Overview

  • Occupancy rate:
  • Top-5 tenant revenue concentration:
  • Largest single-tenant revenue share:
  • Weighted average lease term (WAL):
  • Average unit price ($/kW/month or RMB/kW/month):
  • Contractual escalator:
  • Current EBITDA:
  • Current EBITDA margin:

2. Fatal-Flaw Screen

If any single item fails, the asset does not advance to the weighted scoring phase — unless the deal is restructured with a price adjustment or binding remediation conditions.

Gate Item Core Question Result Notes
Power accessibility Is there verifiable, deliverable current / incremental MW — not merely paper MW? Pass / Fail
Cooling upgrade feasibility Can the target density be achieved within reasonable CapEx and construction timeline? Pass / Fail
Structural capacity Do floor loading, clear height, routing, and live loads support the target AI deployment? Pass / Fail
Regulatory & permit closure Are there material obstacles in land-use, EIA, energy review, fire code, grid interconnection, or data compliance? Pass / Fail
Baseline reliability Is there a history of major outages, maintenance backlog, or single points of failure? Pass / Fail
Network minimum threshold Does the asset meet the carrier / fiber / cloud on-ramp floor required by target tenants? Pass / Fail

Fatal-Flaw Conclusion

  • Pass: Proceed to full due diligence
  • Conditional Pass: Enumerate pre-conditions
  • Fail: Recommend Pass / Material reprice

3. As-Is Quality Score (Today Value)

This section answers one question: Is this asset worth acquiring in its current state?

Total: 100 points — used solely to characterize current-state quality; does not substitute for the investment recommendation.

3.1 Power Systems (20 pts)

  • Actual deliverable IT MW
  • Redundancy architecture (N / N+1 / 2N)
  • UPS / genset / switchgear age
  • Distribution topology and bottlenecks
  • Power quality / harmonics / load balancing
  • Electricity tariff terms / PPA / pass-through mechanism

3.2 Cooling & Energy Efficiency (15 pts)

  • Current PUE / WUE
  • Current cooling architecture
  • Current supportable density
  • Liquid-cooling readiness
  • Redundancy and maintainability
  • Verified historical efficiency curves

3.3 Network & Interconnection (15 pts)

  • Number of carriers
  • Carrier neutrality
  • Fiber-path diversity
  • MMR capacity
  • IX / cloud on-ramp presence
  • Latency and cross-network quality

3.4 Physical & Structural (10 pts)

  • Construction vintage and major-refresh history
  • Floor-loading capacity
  • Clear height / column grid / routing pathways
  • Flood / seismic / fire / physical security resilience
  • Brownfield conversion friendliness

3.5 Operations & Reliability (15 pts)

  • Tier / ISO / SOC certifications
  • DCIM / BMS / SCADA maturity
  • AIOps / predictive-maintenance capabilities
  • Historical outage / near-miss record
  • Maintenance backlog
  • MTBF / MTTR
  • Staff-per-MW ratio

3.6 Commercial Quality (15 pts)

  • Tenant concentration
  • WAL
  • Rental levels
  • Occupancy rate
  • Churn risk
  • Renewal probability
  • Contract-terms quality

3.7 Compliance & ESG Baseline (10 pts)

  • Land / construction / EIA / fire / energy-review permit completeness
  • Local PUE / energy-use / data-compliance status
  • Renewable-energy share
  • ESG reporting maturity
  • Carbon / water-usage disclosure capability

4. To-Be Upgrade Value Score (Future Value)

This section answers: After acquisition, can this asset be transformed into a materially more valuable property?

4.1 Power Expansion Runway

  • Locked but unactivated MW
  • Substation expansion pathway
  • Utility-queue risk
  • Continuous large-block delivery capability
  • Contiguous-capacity scarcity value

4.2 AI / Liquid-Cooling Upgrade Path

  • Target supportable rack density
  • CDU / manifold / piping installation feasibility
  • Air-to-liquid conversion CapEx
  • Ability to form AI-ready Pods
  • Target workload alignment (training / inference)

4.3 Land & Phase-2 Development

  • Land reserves
  • Developable FAR / floor-area ratio
  • Phased expansion conditions
  • Permitting pathway and expected timeline

4.4 Network Ecosystem Extensibility

  • Difficulty of adding carriers
  • Cloud-direct connectivity expansion potential
  • Attractiveness uplift for ecosystem-oriented / retail tenants

4.5 Operational Upgrade Headroom

  • DCIM integration
  • Automation uplift potential
  • PUE improvement headroom
  • Staff-efficiency improvement potential

5. Execution Risk Score (Can We Actually Deliver?)

This section answers: The upgrade is theoretically possible — but can it actually be executed?

5.1 CapEx Risk

  • Required CapEx
  • Discretionary CapEx
  • Hidden CapEx (latent remediation, code-compliance gaps)
  • Cost sensitivity

5.2 Engineering Execution Risk

  • Is a shutdown required for construction?
  • Will existing tenants be impacted?
  • Critical equipment lead times
  • Construction windows
  • Structural retrofit complexity

5.3 Permitting & Policy Risk

  • Power-supply approvals
  • Land / construction permits
  • Energy-efficiency / PUE regulatory requirements
  • Data sovereignty / cybersecurity constraints

5.4 Operational Integration Risk

  • DCIM / BMS data migration
  • Organizational consolidation
  • Vendor switchover
  • O&M team retention and stability
  • SLA / penalty-payment history

5.5 Execution Conclusion

  • Low Risk / Medium Risk / High Risk
  • Top 3 execution pitfalls most likely to materialize:
    1.
    2.
    3.

6. Demand Fit

This section answers: "Who will lease, how fast will stabilization occur, and why this asset instead of a competitor?"

6.1 Target Tenant Segments

  • Hyperscale
  • Neo-cloud / GPU cloud
  • Enterprise AI private cluster
  • Retail colocation
  • Edge / inference

6.2 Demand–Supply Match

  • What capacity type is the market most undersupplied with today?
  • Which workload does this asset best serve?
  • Training / inference / cloud / enterprise — which is the optimal fit?
  • Does the asset offer contiguous large-block scarcity?

6.3 Commercialization Pathway

  • Current pipeline / LOI / pre-lease status
  • Expected lease-up timeline
  • Pricing power
  • Differentiation vs. competitive set

6.4 Demand Fit Conclusion

  • Strong / Moderate / Weak
  • Key rationale:

7. Financial Bridge & Valuation (Underwriting Bridge)

This section answers: Translate the technical conclusions into financial conclusions.

7.1 As-Is Valuation

  • Current EBITDA:
  • Applied multiple:
  • As-Is EV:
  • Implied $/MW:
  • Implied $/kW:
  • Comparable-transaction benchmarking:
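The line items above reduce to a short calculation (hypothetical figures; the EBITDA, multiple, and deliverable MW are placeholders, not deal data):

  # As-Is enterprise value and implied per-capacity metrics (placeholder inputs).
  current_ebitda_musd = 40        # $M
  applied_multiple    = 18        # EV / EBITDA
  deliverable_it_mw   = 36        # actual deliverable IT capacity

  as_is_ev = current_ebitda_musd * applied_multiple
  print(f"As-Is EV: ${as_is_ev:,.0f}M")                                   # $720M
  print(f"Implied $/MW: ${as_is_ev / deliverable_it_mw:,.1f}M per MW")    # $20.0M/MW
  print(f"Implied $/kW: ${as_is_ev * 1e6 / (deliverable_it_mw * 1e3):,.0f} per kW")  # $20,000/kW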

7.2 Value-Creation Bridge

Line Item Impact on NOI / EBITDA Notes
Incremental deliverable MW
Pricing uplift from liquid-cooling upgrade
OpEx savings from PUE improvement
Occupancy uplift
Tenant-mix optimization
Operational automation savings

7.3 Upgrade Case

  • Upgrade CapEx:
  • Stabilized EBITDA:
  • Stabilized valuation multiple:
  • Stabilized EV:
  • Value-creation delta:
  • IRR / MOIC:
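A simple hold-period view ties these items together (hypothetical inputs; the approximation ignores interim cash flows and leverage):

  # Upgrade-case value creation, MOIC and a rough unlevered IRR (placeholder inputs).
  entry_ev      = 720    # $M, as-is purchase price
  upgrade_capex = 180    # $M
  stab_ebitda   = 75     # $M at stabilization
  exit_multiple = 17     # stabilized valuation multiple
  hold_years    = 5

  exit_ev    = stab_ebitda * exit_multiple          # $1,275M
  invested   = entry_ev + upgrade_capex             # $900M
  moic       = exit_ev / invested                   # 1.42x
  approx_irr = moic ** (1 / hold_years) - 1         # ~7.2%, interim cash flows ignored
  print(f"value-creation delta: ${exit_ev - invested:,.0f}M, "
        f"MOIC {moic:.2f}x, approx IRR {approx_irr:.1%}")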

7.4 Downside Case

  • Expansion delay
  • Utility delivery shortfall
  • Liquid-cooling deployment lag
  • Lease-up slower than underwritten
  • CapEx overrun
  • Exit-multiple compression

8. Investment Committee Decision Page

8.1 Recommendation

Proceed / Proceed with Conditions / Reprice / Pass

8.2 Rationale (Top 3)

8.3 Key Risks (Top 3)

8.4 Pre-Close Conditions Precedent (CPs)

8.5 First-100-Days Post-Close Plan

  • Power & cooling validation
  • DCIM / O&M takeover
  • CapEx budget lock
  • Tenant engagement & pipeline verification
  • Phase-2 / expansion milestone roadmap

Glossary

Abbreviation  Full Name                             Layer           Function
──────────────────────────────────────────────────────────────────────────────────
AC            Alternating Current                   Physics         Alternating-direction electric current
DC            Direct Current                        Physics         Constant-direction electric current
LVMS          Low Voltage Main Switchboard          Building (L2)   Low-voltage main distribution panel
UPS           Uninterruptible Power Supply          Building (L2)   Instantaneous backup power
ATS/STS       Automatic / Static Transfer Switch    Building (L2)   Power-source changeover switch
PDU           Power Distribution Unit               Room / Rack     Power dispatching and metering
BESS          Battery Energy Storage System          Campus / Bldg   Large-scale energy storage
BBU           Battery Backup Unit                   Rack (L3)       In-rack backup power
BMS           Battery Management System             Rack (L3)       Cell-level monitoring & protection
PSU           Power Supply Unit                     Server (L4)     AC → DC (or 48 V → 12 V) conversion
VRM           Voltage Regulator Module              Motherboard(L4) 12 V → sub-1 V die voltage
OCP           Open Compute Project                  Standards body  Open hardware specifications