Blog: AI Does Not Have a Compute Problem. It Has an Infrastructure Problem
By Dr Vincent Zeng
For the past few years, the AI conversation has been dominated by chips, models and compute performance. Faster GPUs, larger models, bigger numbers.
But that focus is starting to obscure the real constraint on AI growth. Compute is no longer the limiting factor. Infrastructure is.
Where the real disconnect sits
The demand for AI computing is undeniably enormous, and there are genuine capacity constraints today. Wafer fabrication capacity, yield rates and material availability all play a role. Advanced packaging – particularly CoWoS (chip-on-wafer-on-substrate) – is widely recognised as the key bottleneck limiting near-term AI chip supply.
However, the AI chip industry behaves much like the wider semiconductor and fast-moving consumer goods sectors: it is a rapid, capital-intensive ecosystem in which deep-pocketed players are investing aggressively to remove these constraints, expand capacity and accelerate production.
In other words, the chip supply problem is being addressed at speed.
The same cannot be said for the infrastructure required to power and cool this compute once it exists.
Why compute is no longer the limiting factor
To understand the scale of the mismatch, consider deployable computing capacity rather than individual chips.
Using Nvidia’s Blackwell GPUs as an example, manufacturing output has reached the point where deployable AI compute capacity is scaling at a rate of several megawatts of power demand per day. That equates to roughly the power footprint of a new 100 MW data centre every month.
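To make that mismatch concrete, the figures can be checked with simple arithmetic. This is an illustrative sketch only: the daily rate below is simply the one implied by the 100 MW-per-month claim, not an official deployment number.

```python
# Illustrative arithmetic only. The daily rate is the one implied by the
# "100 MW data centre every month" claim, not an official figure.
deploy_rate_mw_per_day = 100 / 30           # MW of AI compute per day (~3.3)
monthly_mw = deploy_rate_mw_per_day * 30    # back to a monthly total

# Hinkley Point C Unit 1 is rated at 1.6 GW and has taken over a decade
# from first ground works to (expected) grid connection.
hinkley_mw = 1600
months_to_match = hinkley_mw / monthly_mw

print(f"{deploy_rate_mw_per_day:.1f} MW/day is about {monthly_mw:.0f} MW/month")
print(f"One 1.6 GW reactor of demand is added roughly every {months_to_match:.0f} months")
```

At that pace, AI deployments add the equivalent of one large nuclear unit of power demand roughly every 16 months, while a single such unit takes over a decade to build.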
This rate of growth creates a fundamental challenge for a very different ecosystem: the power network.
Unlike semiconductor manufacturing, power generation and grid infrastructure are slow, highly regulated and largely inherited from designs that trace back to Edison’s era. They are not built to scale at anything like the pace AI now demands.
Power delivery and cooling are now the constraint
AI growth is no longer constrained by how much compute we can build, but by where and how that compute can be powered and cooled.
High-density AI systems place extraordinary demands on electricity supply. Power delivery becomes complex. Cooling loads rise sharply. Grid connection timelines stretch from years into decades.
A single example illustrates the mismatch clearly. The UK’s Hinkley Point C nuclear plant Unit 1, rated at 1.6 GW, began construction in 2016 and is currently expected to connect to the grid sometime between 2029 and 2031. From first shovel to usable power, the timeline exceeds a decade.
AI compute capacity, by contrast, is being manufactured and deployed on a monthly cadence.
The result is unavoidable: AI growth is significantly outpacing power generation growth.
When infrastructure cannot keep up
When infrastructure lags behind AI performance gains, the consequences are structural. They cannot be solved with marginal efficiency improvements or incremental upgrades; they are baked into the architecture of how power is generated, distributed and delivered.
If organisations continue to treat infrastructure as a secondary consideration – something to be addressed after models and hardware – they face real risks:
Inability to deploy compute where it is needed
Long grid-connection queues delaying projects
Escalating operational costs driven by inefficient power conversion and cooling
Physical limits on achievable power density within data centres
At scale, these constraints become blockers, not inconveniences.
Why the power system structure needs to change
To see an alternative path forward, it is worth revisiting an idea that predates the modern AC grid: Edison’s original DC network concept.
Applied in a modern context, this approach looks very different from today’s centralised, AC-heavy power architecture:
Localised distributed generation – such as solar, natural gas, wind or fuel cells – directly feeds DC loads, supported by local energy storage to manage power swings and minimise grid impact.
Distributed generation integrates naturally with DC distribution.
Power conversion stages are reduced, cutting energy losses caused by repeated, unnecessary conversions.
Smart power converters manage power flow and enable independent “islands” that can operate with minimal grid interaction, reducing grid-connection queue times.
Rack structures and data-centre power connections are simplified.
Higher power density becomes achievable through high-frequency, high-efficiency power conversion.
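The conversion-loss point lends itself to a rough sketch. The stage counts and per-stage efficiencies below are assumed round numbers chosen for illustration, not measurements from any real facility; the point is only that efficiencies multiply, so fewer stages compound fewer losses.

```python
# Sketch of chained power-conversion efficiency. All stage efficiencies
# here are assumed illustrative values, not measured figures.
def chain_efficiency(stage_effs):
    """Overall efficiency of conversion stages applied in series."""
    eff = 1.0
    for e in stage_effs:
        eff *= e  # losses compound multiplicatively
    return eff

# Hypothetical conventional AC path:
# grid AC -> UPS -> PDU -> rack AC/DC PSU -> board VRM
ac_path = [0.96, 0.97, 0.98, 0.94, 0.92]

# Hypothetical simplified DC path:
# local DC source -> rack DC/DC -> board VRM
dc_path = [0.98, 0.96, 0.92]

print(f"AC path, 5 stages: {chain_efficiency(ac_path):.1%}")
print(f"DC path, 3 stages: {chain_efficiency(dc_path):.1%}")
```

Even with optimistic per-stage numbers, the five-stage chain lands near 79% end-to-end while the three-stage chain stays near 87%; at data-centre scale, that gap is megawatts of waste heat that must also be cooled.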
This is not a theoretical exercise. It is a direct response to the physical realities of scaling AI infrastructure.
The direction of travel is clear
These dynamics help explain why Nvidia is now proposing an 800 V high-voltage DC (HVDC) architecture to power its AI factories.
The shift is not about novelty or preference. It is about aligning power infrastructure with the realities of modern compute.
AI does not have a compute problem. The industry has proven it can build chips and systems at extraordinary speed.
What AI has is an infrastructure problem – and until power delivery and cooling architectures evolve, that problem will increasingly define the limits of AI growth.