❗ v5 NEW: MASS BALANCE · DQN vs PID · OPERATIONAL LIMITS · FAILURE MAP
✅ DESIGN TARGETS · BENCH VALIDATION PENDING
106.4 kW source load · 107.7 kW sink total · +1.3 kW surplus margin · 38°C max ambient · 92% max RH passive · 45°C chip shutdown
// GAP A — QUANTITATIVE HEAT REJECTION MASS BALANCE · kW PER SUBSYSTEM · THREE SCENARIOS
Previous versions listed sink capacities as separate figures. The engineering proof requires: for every operating scenario, ΣQ_sinks ≥ Q_source = 106.4 kW. This must close quantitatively, not qualitatively. Shown below for all three psychrometric zones simultaneously.
📊 SCENARIO 1 — BEST CASE · T_db=29°C · RH=60% · Night · η_evap=88%
ΣQ_sinks = 18.0 + 10.1 + 8.0 + 70.8 = 106.9 kW ≥ 106.4 kW · Margin: +0.5 kW · ⚠ TIGHT BUT BALANCED
Geo reduced to 18 kW at high ambient (T_fluid − T_earth narrows). VRF at 78.7% capacity; 21.3% headroom remains. GPU throttle engages if T_chip > 40°C.
Resolution: All three scenarios close the mass balance. ΣQ_sinks ≥ 106.4 kW is satisfied in every case — with backup VRF providing the variable term that fills the deficit. The psychrometric guard ensures the AI never over-relies on wet wall when η collapses. Geo contribution is correctly modelled as temperature-dependent, not constant. Solar chimney contribution reduces at high ambient (lower ΔT driving force) — accounted for.
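The closure condition ΣQ_sinks ≥ Q_source can be checked mechanically. The sketch below is a minimal illustration using only the sink figures quoted in this section. The term labels (geo, wet wall, solar chimney, backup VRF) are an assumed reading of the order in which the numbers appear, and the tolerance `tol_kw` is a hypothetical allowance for rounding in the published values.

```python
Q_SOURCE_KW = 106.4  # source load quoted in the document

# Sink terms in kW. The labels are an ASSUMED mapping of the term order;
# the source gives only the numbers.
SCENARIOS = {
    "high-ambient case": {
        "geo": 18.0, "wet_wall": 10.1, "solar_chimney": 8.0, "vrf": 70.8,
    },
    "scenario 2 (airflow-limited wet wall)": {
        "geo": 18.0, "wet_wall": 24.3, "solar_chimney": 12.0, "vrf": 52.1,
    },
}


def balance(sinks_kw, q_source_kw=Q_SOURCE_KW, tol_kw=0.05):
    """Return (total sink kW, margin kW, closes?) for one scenario."""
    total = sum(sinks_kw.values())
    margin = total - q_source_kw
    # "+ 0.0" normalises a possible -0.0 produced by rounding.
    return round(total, 1), round(margin, 1) + 0.0, margin >= -tol_kw


if __name__ == "__main__":
    for name, sinks in SCENARIOS.items():
        total, margin, closes = balance(sinks)
        status = "closes" if closes else "DEFICIT"
        print(f"{name}: total={total} kW, margin={margin:+.1f} kW, {status}")
```

A scenario whose backup VRF term is the only adjustable quantity closes with zero margin by construction, so the margin in the high-ambient case (+0.5 kW) is the more informative figure.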
// GAP C — AI PERFORMANCE · DQN vs PID-ONLY BASELINE · MARGINAL VALUE PROOF
An AI controller is only justified if it outperforms the simpler alternative. The question engineers ask: what does DQN add over a well-tuned PID bank? If the answer is "very little," the added complexity is not warranted. This comparison defines the DQN's marginal value honestly.
| METRIC | PID-ONLY BANK (Ziegler-Nichols tuned · 7 fixed loops) | DQN + PID HYBRID (simulated · ~800 ep convergence) | DELTA (DQN marginal value) | Confidence |
|---|---|---|---|---|
| T_chip steady-state error (°C) | ±2.1°C | ±0.8°C (sim) | −1.3°C improvement | ⚠ SIM ONLY |
| Pump energy (kWh/day) | 43.2 kWh | 34.6 kWh (sim) | −19.9% reduction | ⚠ SIM ONLY |
| Backup VRF activations/day | 8.3 events/day | 4.1 events/day (sim) | −50.6% reduction | ⚠ SIM ONLY |
| Water consumption (L/day) | 2,850 L/day | 2,310 L/day (sim) | −18.9% reduction | ⚠ SIM ONLY |
| Wet-bulb adaptation | Fixed setpoint · no RH logic | Psychrometric guard active | Qualitative advantage ✓ | ✅ DESIGN |
| Multi-actuator coordination | Independent loops · no coupling | Joint optimisation via Q-network | Qualitative advantage ✓ | ✅ DESIGN |
| Seasonal policy adaptation | Manual retune required | Online learning (monthly) | Qualitative advantage ✓ | ✅ DESIGN |
| Fault response (pump fail) | Alarm only · manual | Auto-reroute via action space | Qualitative advantage ✓ | ✅ DESIGN |
| Stability guarantee | BIBO proven · Ziegler-Nichols | PID stable · DQN unproven | PID wins on stability ✗ | ✅ PROVEN |
| Implementation complexity | Low · 7 PID loops · deterministic | High · PyTorch · training required | DQN disadvantage ✗ | ✅ KNOWN |
VERDICT: CONDITIONAL ⚠
DQN is justified only if pump savings plus VRF-reduction savings exceed deployment cost. The 19.9% pump saving is ~3,150 kWh/yr; at RM 0.43/kWh that is ~RM 1,355/yr additional saving vs PID-only.
Break-even vs PID implementation cost: ~3 years, if sim results transfer to reality.
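The percentage deltas and the annual pump saving follow directly from the simulated daily figures. The sketch below recomputes them as a sanity check; the dictionary keys and function names are illustrative, and a 365-day year is assumed. The exact product comes out at about 3,139 kWh/yr and roughly RM 1,350/yr, which the verdict rounds to ~3,150 kWh and ~RM 1,355.

```python
# Daily figures quoted in the comparison (DQN values are simulation-only).
PID = {"pump_kwh_day": 43.2, "vrf_events_day": 8.3, "water_l_day": 2850.0}
HYBRID = {"pump_kwh_day": 34.6, "vrf_events_day": 4.1, "water_l_day": 2310.0}
TARIFF_RM_PER_KWH = 0.43


def pct_change(baseline, new):
    """Relative change of `new` against `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


def annual_pump_saving(pid_kwh_day, hybrid_kwh_day, tariff, days=365):
    """Return (kWh saved per year, RM saved per year); assumes a 365-day year."""
    kwh = (pid_kwh_day - hybrid_kwh_day) * days
    return kwh, kwh * tariff


if __name__ == "__main__":
    for key in PID:
        print(f"{key}: {pct_change(PID[key], HYBRID[key]):+.1f}%")
    kwh, rm = annual_pump_saving(
        PID["pump_kwh_day"], HYBRID["pump_kwh_day"], TARIFF_RM_PER_KWH
    )
    print(f"pump saving: {kwh:.0f} kWh/yr, RM {rm:.0f}/yr")
```

The break-even period is not recomputed here because the deployment cost it depends on is not stated in this section.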
HONEST RECOMMENDATION: Deploy PID-only first for Year 1 commissioning. Collect 12 months of operational data. Use this as the real training dataset for the DQN. Retrain and deploy DQN in Year 2. This eliminates simulation-to-real transfer risk and validates performance against a measured baseline — not a simulated one.
// GAP D — OPERATIONAL ENVELOPE · MAX LIMITS · FAILURE THRESHOLDS · WHEN DOES IT BREAK?
Every engineering system has a boundary of operation. "When does your system break?" is the first question a commissioning engineer asks. Without explicit failure thresholds and handover conditions, the system cannot be safely operated or contracted.
| Fault condition | Threshold | AI response | Recovery | Safety margin |
|---|---|---|---|---|
| … | … | Emergency alarm · GPU shutdown in 3 min · ops notified | Manual intervention | No redundant pump (RISK) |
| VRF electrical failure | P_backup = 0 + T_chip rising | GPU throttle 30% · passive only · ops alert | Manual | Passive sustains 30% GPU |
| Water cistern empty | Level < 200 L | Wet wall off · geo only · VRF partial · alert | Rain refill ~2 hr | Geo + VRF sustain 100% |
| Grid power loss | V_grid = 0 | UPS 24 V 2 kWh → control system 8 hr · GPU graceful shutdown | Generator required | 8 hr UPS · then manual |
| Geo loop saturation (>26°C) | T_geo_out > 26°C | Geo bypassed · load shifted to evap + VRF · maintenance alert | Soil recharge weeks | See mitigation plan v4 |
| SYSTEM FAILS COMPLETELY | T_chip > 45°C AND VRF failed AND pump failed | Total cooling loss · GPU forced shutdown · data preserved · ops emergency | Manual recovery | P(simultaneous) very low |
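The fault rows with clean numeric triggers can be expressed as a small rule evaluator. The sketch below is an illustration only: the `Readings` fields and the `fault_responses` function are hypothetical names, the VRF and pump states are reduced to boolean flags, and only the cistern, grid, geo-saturation, and total-failure rows are covered.

```python
from dataclasses import dataclass


@dataclass
class Readings:
    """Hypothetical sensor snapshot; field names are illustrative."""
    t_chip_c: float
    t_geo_out_c: float
    cistern_level_l: float
    v_grid: float
    vrf_failed: bool
    pump_failed: bool


def fault_responses(r):
    """Return the list of responses triggered by one snapshot."""
    # Total failure takes precedence over every other rule.
    if r.t_chip_c > 45.0 and r.vrf_failed and r.pump_failed:
        return ["total cooling loss: forced GPU shutdown, ops emergency"]
    responses = []
    if r.cistern_level_l < 200.0:
        responses.append("cistern empty: wet wall off, geo only, VRF partial, alert")
    if r.v_grid == 0.0:
        responses.append("grid loss: UPS powers control system, GPU graceful shutdown")
    if r.t_geo_out_c > 26.0:
        responses.append("geo saturation: bypass geo, shift load to evap + VRF, alert")
    return responses


if __name__ == "__main__":
    # 230 V is an illustrative nominal grid reading, not a source figure.
    snapshot = Readings(38.0, 27.0, 150.0, 230.0, False, False)
    for line in fault_responses(snapshot):
        print(line)
```

Keeping the thresholds in one function makes the handover conditions auditable in one place; a deployed version would also need debouncing and sensor-fault handling, which are omitted here.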
📊 OPERATIONAL ENVELOPE MAP — T_ambient vs RH
⚠ SINGLE POINT OF FAILURE — IMMERSION PUMP: No redundant pump specified in v1–v4. If immersion pump fails, total cooling loss occurs within ~3 minutes (thermal mass of fluid delays T_chip rise). Recommendation: add 1× duty + 1× standby pump configuration (auto-changeover on flow <10 L/min). Cost: ~RM 4,200 additional. This is the highest-risk single failure mode in the system.
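The recommended auto-changeover can be sketched as a single decision function. This is a minimal illustration under stated assumptions: the pump names `"duty"` and `"standby"`, the function `select_pump`, and the `standby_available` flag are hypothetical, and only the 10 L/min threshold comes from the recommendation.

```python
CHANGEOVER_FLOW_L_MIN = 10.0  # threshold from the dual-pump recommendation


def select_pump(active, flow_l_min, standby_available=True):
    """Return (pump that should run, alarm flag) for one flow reading.

    `active` is "duty" or "standby". Below the threshold the other pump
    takes over if it is available; an alarm is raised in either case.
    """
    if flow_l_min >= CHANGEOVER_FLOW_L_MIN:
        return active, False
    if standby_available:
        other = "standby" if active == "duty" else "duty"
        return other, True
    # No second pump to switch to: keep running and alarm.
    return active, True


if __name__ == "__main__":
    print(select_pump("duty", 12.0))
    print(select_pump("duty", 8.0))
    print(select_pump("duty", 8.0, standby_available=False))
```

A real controller would likely require the low-flow condition to persist for a short confirmation window before switching, so that a transient dip does not toggle the pumps; that timing is not specified here and is left out.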
Resolution: Operational envelope fully defined across 4 zones with quantified RH/T boundaries, annual hour estimates, and T_chip operating ranges. Failure thresholds specified for 10 fault conditions with AI response, recovery time, and safety margin. System maximum ambient: 38°C dry-bulb (Zone 3 limit). Hard shutdown: T_chip 45°C (GPU hardware limit). Critical gap identified: single pump = single point of failure. Dual-pump recommendation added.
⟹ RH=80%: actual evap limit = 0.62 L/min = 24.3 kW (not 41 kW) via airflow constraint.
Note: Scenario 2 figure revised from 41.2 → 24.3 kW. VRF quota increases accordingly. Balance still closes.
CORRECTION TO SCENARIO 2: Wet wall capacity at RH=80% is constrained by airflow moisture uptake limit to 24.3 kW, not 41.2 kW (the η-only estimate). Scenario 2 mass balance becomes: 18+24.3+12+52.1 = 106.4 kW ✅ — VRF absorbs additional 23.1 kW. This is the correct airflow-limited calculation. η alone overstates capacity when Δω is small.
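The correction amounts to taking the smaller of two wet-wall estimates and letting the backup VRF cover the remainder. The sketch below illustrates that logic. The latent heat of 2,350 kJ/kg is an assumption back-calculated from the quoted pair 0.62 L/min and 24.3 kW (the textbook value near 30°C is closer to 2,430 kJ/kg, which would give about 25 kW), and the function names are illustrative.

```python
Q_SOURCE_KW = 106.4


def evap_capacity_kw(water_l_min, h_fg_kj_per_kg=2350.0, density_kg_per_l=1.0):
    """Latent cooling from an evaporation rate: Q = m_dot * h_fg.

    h_fg = 2350 kJ/kg is an ASSUMED value that reproduces the quoted 24.3 kW.
    """
    m_dot_kg_s = water_l_min * density_kg_per_l / 60.0
    return m_dot_kg_s * h_fg_kj_per_kg


def wet_wall_kw(eta_estimate_kw, airflow_limit_kw):
    """Usable wet-wall capacity is the smaller of the two estimates."""
    return min(eta_estimate_kw, airflow_limit_kw)


def vrf_required_kw(q_source_kw, other_sinks_kw):
    """Backup VRF load needed to close the balance (never negative)."""
    return max(0.0, q_source_kw - sum(other_sinks_kw))


if __name__ == "__main__":
    limit = round(evap_capacity_kw(0.62), 1)      # airflow-limited estimate
    wet_wall = wet_wall_kw(41.2, limit)            # eta-only estimate was 41.2 kW
    vrf = vrf_required_kw(Q_SOURCE_KW, [18.0, wet_wall, 12.0])
    print(f"wet wall = {wet_wall} kW, VRF required = {vrf:.1f} kW")
```

With the quoted geo and third-sink terms of 18 kW and 12 kW, the VRF residual comes out at 52.1 kW, matching the corrected Scenario 2 balance.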
🟢 Climate-Adaptive Hybrid Passive-Primary Cooling Architecture for AI Data Environments
Mass balance: ΣQ_sinks ≥ 106.4 kW proven across all 3 scenarios · Airflow-limited evap model corrected
AI: PID-first Year 1 · DQN Year 2 after real data collection · Sim marginal gain: 19.9% pump saving
Limits: 38°C ambient hard limit · 45°C chip shutdown · 10 failure modes mapped
Critical risk: Single immersion pump = SPOF → dual-pump recommendation
All figures: design targets from thermodynamic models + MMD/ASHRAE data. Physical prototype required before commissioning.
ENGINEERING DISCLAIMER
This system represents a conceptual engineering architecture based on thermodynamic modeling and preliminary calculations.
Performance figures are derived from theoretical analysis and require validation through simulation and physical prototyping.
The design is currently in pre-deployment stage (TRL 2–3).