CDU Deep Comparison

Every aspect — issues, controls, support & TCO

The CDU Selection Guide compares capacity and links. This companion goes deeper across the aspects that actually decide a deployment: the field failure modes that bite worldwide, how the control system regulates the loop and talks to the BMS/DCIM, the after-sales reality per vendor, and the TCO & maintenance burden by type. Every figure is tagged by source so a vendor claim is never mistaken for a published standard.

Source tags: STANDARD (STD) = ASHRAE / OCP / DMTF Redfish  ·  VENDOR (VEN) = vendor datasheet/claim  ·  REPORTED (REP) = widely reported, not quantified to a named standard

01 Worldwide common field issues & failure modes

The three universal enemies of any cooling loop are corrosion, mineral scale and microbiological fouling (ASHRAE TC 9.9, 5th ed.). The CDU's core job is to hydraulically decouple the chip-side TCS (technology cooling system) loop from the facility FWS (facility water system) loop. Below are the failure modes operators actually report, with the prevention that addresses each.

Issue | Symptom | Root cause | Prevention
Leaks [REP] | Slow seepage / droplets at QDs, gaskets, plate-HX joints; most start small. | QD not fully seated, bypassed interlocks, debris in coupling, rushed swaps, pump-seal failure; unclear who owns wet connections. | Dry-break / dripless blind-mate couplers; drip trays + secondary containment; leak sensors at QDs/hoses/manifolds/low points; clear IT-vs-facilities boundary. (ASHRAE 5th ed. formalises dry-break QDs + segmented shutoff valves.)
Biofilm / biological fouling [REP] | Biofilm restricts microchannel flow, raises hydraulic resistance, cuts efficiency. | Microbial growth, worst with plain DI water (no biocidal protection). | PG-25 (25% propylene glycol) suppresses microbes far better than DI (OCP rationale); coolant-quality + filtration monitoring; TCS filtration 50→25 µm cuts the particulate substrate.
Galvanic / chloride corrosion [REP] | Localised pitting through the oxide layer; particles shed into coolant; eventual cold-plate/HX failure. | Dissimilar metals in a conductive fluid (Al anode vs Cu cathode); chloride drives pitting; falling inhibitor reserve = active attack. | Chloride < 25 ppm (consensus, not one named standard); mandatory azole inhibitor if mixed metals; prefer single-metal (copper) wetted paths; monthly inhibitor-reserve test.
Glycol / PG degradation [VEN] | pH drifts down; fluid turns acidic and corrodes Cu/Al/steel. | PG oxidises into organic acids (glycolic/lactic/formic), consuming alkalinity; makeup water dilutes inhibitor. | Hold pH ~8.0–10.5 (below 7.0 nonferrous corrodes fast); test pH, reserve alkalinity, glycol %, conductivity: baseline at commissioning, full panel at 3 & 6 mo then annually, quarterly pH spot-checks. Inhibited OAT glycol can extend life ~2–3→8–10 yr (vendor claim).
Particulate / microchannel clog [VEN] | Clogged filters/fins, insulating scale, rising ΔP, abrupt shutdowns. | Poor water quality + weak filtration; corrosion fines abrade narrow channels at high velocity. | CDU supply filtration 50→25 µm as chip microchannel pitch narrows; side-stream/sub-micron polishing; trend filter ΔP; commissioning flush before operation.
Flow maldistribution [REP] | Uneven flow → hot racks/sleds; in two-phase, cold-plate dry-out (quality → >100%). | Nonuniform heating raises ΔP with vapour, so hotter sleds get less flow (self-reinforcing); unbalanced hose routing. | Manifolds with balancing valves; dedicated inlet/outlet paths; flow restrictors for even two-phase split; CDU VFD tied to ΔT + dP setpoints.
Air entrainment / cavitation [VEN] | Vapour in low-pressure zones; impeller erosion; pumps short of rated life. | Low NPSH available, poor inlet geometry, trapped air, incomplete fill/bleed after service. | Cavitation-resistant inlet; purge/bleed modes; ≥10–20% NPSH margin; size pump near best-efficiency point; air-ingress alarms.
Condensation / dew-point sweat [VEN] | Sweating on cold plates/pipes/IT; catastrophic on electronics. | TCS supply driven below room dew point. | Hold TCS supply ≥2–3 °C above room dew point (dew-point-aware reset); coolant preheat; humidity sensors in controls. (ASHRAE W-classes cap facility supply temps.)
Pump wear / failed N+1 [VEN] | Premature pump failure; loss of flow if failover doesn't engage. | Oversizing (off-BEP), inadequate NPSH, seal wear, vibration; unproven failover. | N+1 with auto-failover + isolation valves (PLC <100 ms); seal-less / mag-coupled pumps for online service; size near BEP; predictive analytics.
Controls / integration gaps [REP] | False leak alarms; sensor drift/EMI; out-of-range flow/pressure; unclear alarm response. | Rope sensors tripped by residue; aging/uncalibrated sensors; CDU↔BMS↔DCIM integration gaps; blurred monitoring ownership. | Integrate via Modbus/BACnet/dry-contact into BMS/DCIM; calibrate at commissioning; standard false-alarm procedure (dry, clean, function-test); continuous trending + automated alarms.
Commissioning defects [REP] | Early-life fouling, contamination, leaks, weak heat transfer, "blame-the-chiller" misdiagnosis. | Pre-charge fluid not flushed; wrong glycol % (high → viscosity penalty, low → lost freeze/biocide); incompatible refill; cross-mated QDs. | Manufacturer fill/flush with a flushing skid; staged filtration + adequate flush velocity; QC on pH/turbidity/inhibitor/glycol %. (OCP pre-commissioning prep for TCS row manifolds.)
Fluid / standardization gaps [STD] | Proprietary coolants/connectors with poor cross-vendor interoperability across refresh cycles. | Historically vendor-specific connectors, hoses, manifolds and fluids. | OCP Cooling Environments standardises interfaces/operating params; OCP PG-25 guideline specifies wetted-material compatibility, tubing, temp/pressure, filtration + safety for multi-vendor supply.
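The pH band and chloride ceiling quoted in the table above lend themselves to an automated check on each lab panel. The following is a minimal sketch, not a vendor procedure: the `CoolantSample` type, the function name and the wording of the findings are hypothetical, and the limits are the consensus figures from the table rather than values from a single named standard.

```python
from dataclasses import dataclass

# Limits quoted in the table above (industry consensus, not one named standard).
PH_MIN, PH_MAX = 8.0, 10.5
PH_CRITICAL = 7.0          # below this the table warns of fast non-ferrous corrosion
CHLORIDE_MAX_PPM = 25.0    # the table asks for chloride strictly below 25 ppm


@dataclass
class CoolantSample:
    """One lab result for the TCS loop (hypothetical structure)."""
    ph: float
    chloride_ppm: float


def check_sample(sample: CoolantSample) -> list[str]:
    """Return human-readable findings; an empty list means the sample is in band."""
    findings = []
    if sample.ph < PH_CRITICAL:
        findings.append("pH below 7.0: acidic, non-ferrous metals corrode fast")
    elif sample.ph < PH_MIN:
        findings.append("pH below the 8.0-10.5 band")
    elif sample.ph > PH_MAX:
        findings.append("pH above the 8.0-10.5 band")
    if sample.chloride_ppm >= CHLORIDE_MAX_PPM:
        findings.append("chloride at or above 25 ppm: pitting risk")
    return findings
```

A check like this would typically run on the commissioning baseline and on each scheduled panel, so that a downward pH drift shows up as a trend rather than as a surprise.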
Reported context (attribute carefully). Uptime Institute 2024: ~1 in 3 outages cost >$250k and cooling accounts for ≈19% of outages, rising as halls shift from air to liquid [REPORTED]. Integrated leak detection cuts response to ~8–12 min vs 2–4 hr for visual inspection [REPORTED]. ASHRAE TC 9.9 "Liquid Cooling: Resiliency Guidance for Cold-Plate Deployments" (Sept 2024) governs ride-through within the server's minimum time-to-throttle [STANDARD] (paywalled; verify before quoting numbers).

02 Control systems & BMS / DCIM integration

A CDU runs two loops at once: a VFD pump loop for hydraulic delivery and a temperature loop that modulates the primary control valve to hold secondary supply temp. The fullest public description of the actual algorithms is the Lenovo Neptune RM100 O&M guide (built by Cooltera); DMTF Redfish + OCP anchor the standards layer.
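To make the two-loop split concrete, here is a minimal sketch of a pump loop that nudges VFD speed toward a ΔP setpoint once per scan, alongside a PI temperature loop that drives the primary valve. It is an illustration under stated assumptions, not the RM100 algorithm: the class names, step size, deadband, starting speed and gains are all hypothetical.

```python
from dataclasses import dataclass


def clamp(value: float, low: float, high: float) -> float:
    """Limit value to the closed range [low, high]."""
    return max(low, min(high, value))


@dataclass
class PumpRamp:
    """Pump loop: step VFD speed toward a dP (or flow) setpoint once per scan."""
    setpoint: float          # e.g. supply-return dP in kPa
    step_pct: float = 2.0    # hypothetical speed change per scan
    deadband: float = 1.0    # hypothetical tolerance around the setpoint
    speed_pct: float = 30.0  # hypothetical starting speed

    def scan(self, measured: float) -> float:
        if measured < self.setpoint - self.deadband:
            self.speed_pct += self.step_pct
        elif measured > self.setpoint + self.deadband:
            self.speed_pct -= self.step_pct
        self.speed_pct = clamp(self.speed_pct, 0.0, 100.0)
        return self.speed_pct


@dataclass
class ValvePI:
    """Temperature loop: PI controller on the primary valve, output 0-100 %."""
    setpoint_c: float
    kp: float = 8.0          # hypothetical gains
    ki: float = 0.5
    integral: float = 0.0

    def update(self, supply_temp_c: float, dt_s: float) -> float:
        error = supply_temp_c - self.setpoint_c   # too warm -> open the valve
        candidate = self.integral + error * dt_s
        output = self.kp * error + self.ki * candidate
        # Anti-windup: keep the new integral only while the valve is unsaturated.
        if 0.0 <= output <= 100.0:
            self.integral = candidate
        return clamp(output, 0.0, 100.0)
```

Note that the pump loop never sees supply temperature and the valve loop never sees ΔP. That separation is the point of running two loops.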

How the loops are regulated [VENDOR: Lenovo RM100]

Loop | Controls | Strategy / detail
Pump (VFD) | Flow or ΔP control | Ramp pump speed until measured flow = setpoint, or until supply-return ΔP = setpoint (10 s scan to avoid oscillation). ΔP control is the common default for direct-to-chip with many parallel server branches (holds head as node valves open/close); flow control where a fixed total delivery is the target. Supply temp is not a pump-loop variable.
Temperature (PID + valve) | PID on primary 2-way / 3-way valve | Modulates 0–100% flow (or 0% bypass → 100% HX). Demand-vs-feedback checked every 15 min; >10% deviation → valve fault. Loss of valve signal fails to bypass/closed = no cooling. Retune via Ziegler-Nichols (PI for slow loads, PID for fast).
Setpoint / reset | Fixed SP / SP + dew-point offset | RM100 default secondary setpoint 18 °C. Vertiv XDU1350 secondary range 10–52 °C, dew-point control standard. Framed by ASHRAE W-classes (W17/W27/W32/W40 (new)/W45/W+) [STANDARD].
Dew-point reset | Room temp + RH monitor | If dew point rises within 3 °C of setpoint, the CDU re-adjusts to stay ≥3 °C above dew point; RH-sensor failure → safe fallback to Fixed SP 18 °C. (ASHRAE concurs: coolant must stay above dew point.)
N+1 pump changeover | Lead/standby, 7-day duty | Changeover ~0.25 s; on restart picks lowest-runtime pump; if a pump can't reach 90% of demand within 100 s it stops, standby starts, alarms raise; both failing → latching shutdown.
Leak → action | Level sensor + pressure interlock | Latching SHUTDOWN if level-open AND flow/dP <50% setpoint (1 s delay); external rope/spot leak-tape connector. Boyd alternative = bypass-loop standby. A dedicated motorised leak-isolation-valve algorithm is not publicly disclosed; documented responses are pump-shutdown or bypass-standby.
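Three of the rules in the table above reduce to short predicates. The sketch below restates the dew-point reset, the leak interlock condition and the pump changeover condition. The thresholds (18 °C, 3 °C, 50%, 90%, 100 s) come from the table, while the function names are hypothetical and the Magnus dew-point formula is an assumption, since the source does not say how the CDU computes dew point. Latching and the 1 s leak delay are deliberately not modelled.

```python
import math
from typing import Optional

FIXED_SP_C = 18.0     # default secondary setpoint quoted for the RM100
DEW_MARGIN_C = 3.0    # minimum margin to hold above room dew point


def dew_point_c(temp_c: float, rh_pct: float) -> float:
    """Magnus approximation of dew point (assumed formula, not from the O&M guide)."""
    a, b = 17.62, 243.12
    gamma = math.log(rh_pct / 100.0) + a * temp_c / (b + temp_c)
    return b * gamma / (a - gamma)


def effective_setpoint_c(room_temp_c: float, rh_pct: Optional[float]) -> float:
    """Raise the setpoint so it stays at least DEW_MARGIN_C above dew point."""
    if rh_pct is None:
        # RH sensor failed: the table describes a fallback to the fixed setpoint.
        return FIXED_SP_C
    return max(FIXED_SP_C, dew_point_c(room_temp_c, rh_pct) + DEW_MARGIN_C)


def leak_shutdown(level_switch_open: bool, flow_or_dp: float, setpoint: float) -> bool:
    """Interlock condition: level switch open AND flow/dP below 50 % of setpoint."""
    return level_switch_open and flow_or_dp < 0.5 * setpoint


def needs_failover(pump_output: float, demand: float, elapsed_s: float) -> bool:
    """Changeover condition: pump has not reached 90 % of demand within 100 s."""
    return elapsed_s >= 100.0 and pump_output < 0.9 * demand
```

For example, a 30 °C room at 70% RH has a dew point near 24 °C, so `effective_setpoint_c(30.0, 70.0)` lifts the setpoint well above the 18 °C default. A 25 °C room at 50% RH leaves the default untouched.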

BMS / DCIM protocols actually supported (per vendor)

Vendor / productModbusBACnetSNMPRedfishNotes
Vertiv XDU1350RTU+TCPYesCLI, web server, 7″ HMI
CoolIT CHx/AHx (CHx2000)YesYesYesYes"+ many others"; group control 20 units
Motivair / by SchneiderYes (+LON)MS/TP + IPYesEcoStruxure integration
Lenovo Neptune (XCC)RS485/CANv3RedfishDual Ethernet; XClarity Call-Home
ZutaCoreRESTfulHyperCool Cloud fleet ops
OCP requirementlegacylegacylegacyREQUIREDPrometheus telemetry; pump-RPM + valve-aperture commands
Standards layer [STANDARD]. DMTF Redfish "Liquid Cooling Equipment" (DSP2064 v1.1.0) defines CoolingUnit / CoolingLoop / CoolantConnector / Pump / LeakDetection / Filters / Reservoirs; CDU controls landed in Redfish 2024.4 (Dec 2024). Settable on CoolantConnector: FlowControlLitersPerMinute, SupplyTemperatureControlCelsius, ReturnTemperatureControlCelsius, DeltaPressureControlkPa, DeltaTemperatureControlCelsius. These properties are the standardised encoding of the "which control variable" choice.
OCP design points. OCP design flow is 1.25–2.0 L/min/kW (1.5 typical) for ≤10 °C rise; Project Deschutes (Google→OCP, 2 MW) uses N+1 seal-less pumps, low-harmonic VFD, 3 °C approach, built-in leak/quality monitoring.
Telemetry exposed (RM100, fullest public set): supply/return temps (secondary triple-redundant), ambient + RH + dew point, primary/secondary flow, static/supply pressures + unit ΔP + filter ΔP, per-pump speed/V/A/run-hours, valve demand/feedback, primary & secondary duty (kW), level/leak/filter-dirty.
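The OCP flow rule and the Redfish flow-control property can be tied together in a few lines: size the flow for a heat load, sanity-check the resulting coolant temperature rise, and encode the flow as a setpoint. This is a hedged sketch. The PG-25 density and specific heat are assumed round figures, the function names are hypothetical, and the `SetPoint` member inside `FlowControlLitersPerMinute` is an assumption about the payload shape that should be checked against the CoolantConnector schema before use.

```python
import json

# Assumed PG-25 properties at typical loop temperature; check the fluid datasheet.
PG25_DENSITY_KG_PER_L = 1.02
PG25_CP_KJ_PER_KG_K = 3.9


def design_flow_lpm(heat_kw: float, lpm_per_kw: float = 1.5) -> float:
    """Design flow from the OCP rule of thumb (1.25-2.0 L/min/kW, 1.5 typical)."""
    if not 1.25 <= lpm_per_kw <= 2.0:
        raise ValueError("lpm_per_kw outside the quoted 1.25-2.0 range")
    return heat_kw * lpm_per_kw


def coolant_rise_c(heat_kw: float, flow_lpm: float) -> float:
    """Coolant temperature rise from Q = m_dot * cp * dT."""
    mass_flow_kg_s = flow_lpm * PG25_DENSITY_KG_PER_L / 60.0
    return heat_kw / (mass_flow_kg_s * PG25_CP_KJ_PER_KG_K)


def flow_setpoint_patch(flow_lpm: float) -> str:
    """JSON body for a CoolantConnector PATCH; the "SetPoint" member is assumed."""
    body = {"FlowControlLitersPerMinute": {"SetPoint": round(flow_lpm, 1)}}
    return json.dumps(body)
```

With these assumed fluid properties, a 100 kW load at 1.5 L/min/kW gives 150 L/min and a rise of roughly 10 °C, which is in line with the quoted design point.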

03 After-sales, support & warranty

Serviceability and support are where deployments live or die over a 5–10 year life. Two recent ownership moves change the after-sales entity: Boyd Thermal → Eaton (CDU service is now Eaton's) and Motivair → Schneider Electric (75% stake). Published warranty lengths are rare: most vendors don't disclose them, so the table marks "n/d" (not disclosed) rather than guessing.

Vendor | Warranty (published only) | SLA / response | Serviceability | Remote monitoring
Vertiv | n/d (optional Prime Labor Warranty; length n/d) | Guaranteed Emergency Response tiers (hours n/d); agreements 1→5 yr | Redundant pumps, dual feeds, 50 µm filter, 7″ HMI | Proactive remote monitoring + leak algorithms
CoolIT | Implied 2–5 yr via SLA alignment | SLA 2–5 yr; PM every 6 mo | Hot-swap pumps/filters/sensors, front+back, N+N (CHx2000); 25 µm | Redfish/SNMP/Modbus; 80+ countries direct, 157+ via ASPs
Motivair / Schneider | 12-mo parts; 4-yr compressor; 2–5 yr extended (needs PM) | Platinum 24×7 + 6-hr onsite; 2 PM visits/yr | Redundant pumps each with own VFD; corridor service access | Centurion cellular cloud (read-only) + EcoStruxure; 600+ field techs
Boyd / Eaton | n/d | "quick response" (hours n/d) | Hot-swap cold plates + pumps; FRU + depots (Taiwan/USA/Poland) | Predictive PM (loss-of-flow, pump-life, hours-based)
nVent | Up to 5 yr via annual PM, then lifetime PM (base n/d) | PM yearly (hours n/d) | Hot-swap pump-filter-drive cartridge, 1 tech <30 min; N+1 seal-less pumps; N+1 isolatable filters; live maintenance | Liquid-quality + leak + telemetry in pump modules
Delta | n/d | n/d | GoCool L2L: hot-swap filtration + dual power (N+1 count unconfirmed) | SNMP/Modbus TCP/BACnet; InfraSuite
Stulz | n/d | 8-hr response; 12–36 mo contracts | Optional redundant pumps; front+rear serviceable; quick-release sanitary couplings; 50 µm | Filter status + level via Modbus/BACnet/SNMP
Accelsius (2-phase) | "Multi-year" custom, CNA-underwritten; up to $100k/rack leak cover (NeuGuard) | Standard + white-glove (numbers n/d) | Hot-swap pumps/PSUs/control boards/sensors; retrofit-ready; non-conductive dielectric leak | n/d
ZutaCore (2-phase) | References obligations (length n/d) | "Time-response objectives" (n/d) | Waterless closed loop; minimal scheduled maintenance (no glycol/strainer) | Integrated CDU health monitoring; strongest training tier (certification + LMS)
Lenovo | Premier Support tiers (base n/d; record-retention condition) | Scalable 24×7; quarterly health checks | RM100 4U ~100 kW; drip-free blind-mate couplers; Commissioning Kit included | XClarity suite + auto Call-Home + Energy Manager (most mature)
Honesty flags. Only Motivair (12-mo parts / 4-yr compressor / 2–5 yr extended) and nVent (up to 5 yr via PM) publish clean warranty figures; CoolIT implies 2–5 yr; Accelsius is "multi-year" custom. Published SLA response hours exist only for Motivair (6-hr onsite) and Stulz (8-hr). Everything marked n/d is genuinely not publicly disclosed; confirm with the vendor.

Per-vendor published-spec & capability matrix

What each vendor actually publishes and supports — distilled from datasheet research. VENDOR figures; ✓ = published/supported, — = not published, ~ = partial/implied. Differential pressure and ASHRAE water class are the most-suppressed specs industry-wide.

Vendor | Redfish | dP / head published | ASHRAE class | Hot-swap service | Secondary filtration | Notable
Vertiv |  | ✓ (XDU1350: 2.44 bar) | ✓ W3 / W45 | ~ (XDU070 pumps) | 25–50 µm | Modbus/SNMP/CLI; 7″ HMI
CoolIT |  | ✓ (35–44 psi) | ✓ W17–W+ | ✓ N+N pumps/filters | 25 µm | 80+ countries service
Motivair / Schneider |  | ✓ (32 psi head) |  | ~ redundant pumps | n/p | Platinum 6-hr onsite SLA
Boyd / Eaton |  | ✓ (up to 80 psi) |  | ✓ plates + pumps | 0.2 µm side-stream | Deschutes 2 MW; predictive PM
nVent | ~ (Deschutes) | ✓ (2.7 bar) | ✓ W4 | ✓ cartridge <30 min | 50 µm (25 opt) | seal-less N+N; lifetime PM
Delta |  |  |  | ~ filtration | 50 µm | SNMP/Modbus/BACnet
Stulz |  |  | ✓ W32–W+ | ~ front+rear | 50 µm | 8-hr response; sanitary QR
Accelsius (2-phase) |  |  | ~ W27/W45 | ✓ pumps/PSU/boards | 20 µm | NeuGuard up to $100k/rack leak
ZutaCore (2-phase) | ~ (REST) | ✓ (3 / 4.5 bar) | ✓ W3 | ~ minimal maint | n/p | waterless; strong training tier
Lenovo | ✓ (XCC) | ~ (relief 3.5 bar) |  | ~ blind-mate | 50 µm | XClarity + Call-Home (most mature)
Reading the matrix. Redfish (the OCP-mandated management interface) is published only by CoolIT, Accelsius and Lenovo — everyone else tops out at Modbus/BACnet/SNMP. Published ASHRAE water class is rare for facility-scale L2L (most publish only a rated approach temperature). Serviceability leaders for online maintenance: nVent (cartridge swap), CoolIT (N+N hot-swap), Boyd (hot-swap plates+pumps). Corporate notes: Boyd→Eaton, Motivair→Schneider (75%).

04 TCO & maintenance burden by type

No neutral $/kW CDU price is publicly sourceable — so this is a relative trade-off read, not a quote. The biggest single capex fork is whether the type needs a facility-water plant (L2L) or not (L2A / 2-phase). The biggest maintenance fork is many small units vs few large ones. High-temp water (W40/W45) is what turns direct-to-chip into near-free cooling — the facility loop can reject through a dry cooler / tower most or all hours, eliminating the mechanical chiller.

Type | CapEx | OpEx / efficiency | Density | Water (WUE) | Retrofit | Maint. burden | Redundancy / SPOF
In-rack | ⚠ many units | ⚠ good | ✅ 100–300 kW | depends | ✅ easy | ❌ many small, high PM | ✅ small domain / ❌ N+1 per rack costly
In-row | ⚠ plumbing | ✅ good | ✅ 0.6–2.3 MW | depends | ⚠ row piping | ✅ few large, easy service | ❌ central SPOF → N+1
Sidecar | ⚠ moderate | ⚠ good | ✅ ~200 kW | ✅ if L2A | ✅ strong | ⚠ per-rack + airside | ✅ contained
L2L | ❌ water plant | ✅ best (PUE<1.1, free-cool) | ✅ multi-MW | ❌ high if evaporative | ❌ needs water infra | ✅ pump/filter/HX focus | ❌ large central SPOF
L2A | ✅ no water plant | ❌ PUE<1.2, summer-sensitive | ⚠ 16–200 kW | ✅ ~zero | ✅ best | ⚠ fan/filter/finned-tube | ⚠ many small units
2-phase D2C | ⚠ costly fluid, no water plant | ✅ ~35% OpEx*, low pump power | ✅ >100 kW/rack | ✅ waterless | ✅ good | ✅ no fluid replacement, ITE-safe | ✅ low-flow/low-leak; ⚠ newer

The read. Lowest capex / best retrofit / zero water = L2A (and waterless two-phase). Best efficiency at scale = L2L with warm-water free cooling — at the cost of a water plant, WUE, and a large central single-point-of-failure that makes N+1 mandatory. Maintenance burden is the many-small-vs-few-large axis: in-rack and L2A multiply units and leak surface area; in-row and L2L concentrate into fewer serviceable units but raise SPOF. Two-phase is the emerging density play (≈1/10 the flow → smaller pumps, waterless), tempered by expensive dielectric fluid (two-phase fluids can exceed $50/L) and lower maturity.

Publishing caveats. *The "~35% lower OpEx" and "8–17% lower 5-yr TCO" (Accelsius/Jacobs) and "six-nines" (CoolIT) figures are vendor/sponsored; attribute them by name [VENDOR]. The 25 ppm chloride, pH band, and 2–3→8–10 yr fluid-life figures are industry consensus, not a single named standard [REPORTED]. The most consistently corroborated numeric practices are 50 µm/25 µm filtration and the 2–3 °C-above-dew-point rule.
Disclaimer. Independent educational reference built from publicly available vendor pages, ASHRAE TC 9.9, OCP and DMTF Redfish material. Source tags distinguish published standards from vendor claims and widely-reported-but-unquantified practice. Several primary PDFs (ASHRAE Sept-2024 resiliency bulletin, DMTF DSP2064, some vendor datasheets) are paywalled/bot-blocked — verify load-bearing numbers against the live source before using them in a design package. Not procurement, safety, or engineering advice.