Power and cooling (the inconvenient physics)

Modern AI infrastructure has discovered what the Alchemists’ Guild learned centuries ago, which is that performing impressive feats requires consuming alarming quantities of energy and produces heat in proportions that make everyone nearby uncomfortably aware of the Second Law of Thermodynamics. A state-of-the-art data centre training large language models consumes electricity at rates comparable to small towns while generating enough waste heat to keep said town warm through winter, assuming you could efficiently distribute the heat rather than desperately trying to remove it before your expensive hardware melts.

The physics is straightforward and unhelpful. Every computation requires energy. That energy ultimately converts to heat. The more computation you perform, the more heat you generate. AI training involves performing truly staggering amounts of computation, which means truly staggering amounts of heat that must be removed continuously or your multi-million euro GPU cluster becomes a very expensive sculpture of partially melted silicon and regret.

This wouldn’t be particularly concerning if AI workloads were modest, occasional, or conducted in locations where abundant renewable energy and convenient heat disposal were available. Unfortunately, AI workloads are massive, continuous, and concentrated in data centres that were already pushing local power infrastructure limits before anyone decided to fill them with thousands of GPUs each consuming 700 watts while training models around the clock. The result is an infrastructure challenge that combines electrical engineering, thermodynamics, environmental policy, and substantial amounts of money, none of which have simple solutions.

Power consumption at scale

A single Nvidia H100 GPU draws approximately 700 watts at full load. This seems manageable until you remember that training frontier AI models requires thousands of these GPUs running simultaneously for weeks or months. A training cluster of 10,000 H100s consumes 7 megawatts just for the GPUs, before accounting for CPUs, networking, storage, cooling, and facility overhead. Total facility power typically runs at somewhere between 1.2 and 2 times the IT equipment load depending on cooling efficiency (the ratio is the facility's power usage effectiveness, or PUE); taking the pessimistic end of that range, the cluster draws around 14 megawatts.
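
A minimal sketch of the arithmetic, using the figures above (700 W per GPU, 10,000 GPUs, and a deliberately pessimistic PUE of 2.0); the numbers are illustrative, not measurements from any particular facility:

```python
# Back-of-the-envelope cluster power, using the figures quoted above.

GPU_POWER_W = 700          # H100 draw at full load, watts
GPU_COUNT = 10_000         # GPUs in the training cluster
PUE = 2.0                  # facility power / IT power (pessimistic assumption)

it_load_mw = GPU_POWER_W * GPU_COUNT / 1e6     # GPUs only, in megawatts
facility_mw = it_load_mw * PUE                 # including cooling and overhead

print(f"GPU load:      {it_load_mw:.1f} MW")   # 7.0 MW
print(f"Facility load: {facility_mw:.1f} MW")  # 14.0 MW
```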

For context, 14 megawatts is enough to power approximately 10,000 to 15,000 homes continuously. A single AI training run consumes electricity that would otherwise serve a small town. Multiple training runs, inference workloads, and other data centre operations scale this up substantially. Large AI-focused data centres consume 50 to 100 megawatts or more, which approaches the output of small power stations and definitely exceeds what local electrical grids were designed to provide.

The economics are significant. At industrial electricity rates of approximately €0.08 to €0.15 per kilowatt-hour, a 14 megawatt facility consumes roughly €1,100 to €2,100 worth of electricity per hour, €27,000 to €50,000 per day, or €10 million to €18 million annually. Power costs become a substantial and unavoidable component of total operating expenses, recurring for as long as the hardware keeps running.
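
The sketch below works through that electricity bill for the 14 megawatt facility, at the assumed €0.08 to €0.15 per kilowatt-hour rates; actual supply contracts vary widely, so treat the output as order-of-magnitude only:

```python
# Electricity cost for the 14 MW facility above, at assumed industrial rates.

FACILITY_MW = 14.0
RATES_EUR_PER_KWH = (0.08, 0.15)   # assumed range of industrial rates
HOURS_PER_YEAR = 8760

for rate in RATES_EUR_PER_KWH:
    per_hour = FACILITY_MW * 1000 * rate   # 14,000 kWh consumed each hour
    per_year = per_hour * HOURS_PER_YEAR
    print(f"€{rate:.2f}/kWh: €{per_hour:,.0f} per hour, "
          f"€{per_year / 1e6:.1f} million per year")
```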

Securing reliable power supply for these facilities requires negotiating with utilities, sometimes building dedicated substations, and occasionally explaining to local authorities why you need enough electricity to power a small town for activities that produce no tangible goods and aren’t obviously essential to human welfare. These conversations apparently don’t always go smoothly, which is why major tech companies increasingly build data centres in locations with abundant cheap power rather than near major population centres where the electricity is needed for less exotic purposes.

The growth trajectory is concerning. AI workloads are increasing as models grow larger and as more organisations deploy AI services. Each generation of AI accelerators draws more power per device than the last; better efficiency per operation offsets some of this, but absolute power consumption still grows substantially. Projections suggest AI data centres could consume several percent of global electricity production within a decade if current growth continues, which raises questions about whether this is sustainable, sensible, or compatible with climate commitments.

The thermodynamics problem

Heat removal is not optional. Electronics have maximum operating temperatures beyond which they fail, degrade, or catch fire. GPUs running at 700 watts generate 700 watts of heat that must be continuously removed to maintain operating temperatures. This is not a detail that can be optimised away or solved with clever software. It’s fundamental physics expressing its traditional indifference to human preferences.

Air cooling is the traditional approach for data centres. Cold air is blown over hot components, absorbs heat, becomes hot air, and is expelled from the facility. This works adequately for moderate power densities but struggles with modern AI infrastructure where equipment racks can consume 40 to 80 kilowatts. The volume of air required to cool these racks exceeds what conventional air handling systems can provide, and the resulting hot air exits at temperatures that make managing it architecturally challenging.
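
To see why the airflow becomes unmanageable, a rough sizing calculation helps. The sketch below estimates the air needed to carry away 80 kilowatts with a 15 °C inlet-to-outlet temperature rise; the rack power comes from the text, while the temperature rise and air properties are assumptions chosen for illustration:

```python
# Approximate airflow needed to remove heat from a rack by air cooling.
# Q = m_dot * c_p * delta_T  =>  m_dot = Q / (c_p * delta_T)

RACK_POWER_W = 80_000   # high-density AI rack (from the text)
DELTA_T_K = 15.0        # assumed inlet-to-outlet air temperature rise
CP_AIR = 1005.0         # specific heat of air, J/(kg*K)
RHO_AIR = 1.2           # density of air, kg/m^3, near sea level

mass_flow = RACK_POWER_W / (CP_AIR * DELTA_T_K)   # kg of air per second
volume_flow = mass_flow / RHO_AIR                 # m^3 per second
cfm = volume_flow * 2118.88                       # cubic feet per minute

print(f"{mass_flow:.1f} kg/s ≈ {volume_flow:.1f} m³/s "
      f"≈ {cfm:,.0f} CFM per rack")   # roughly 9,000+ CFM
```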

Liquid cooling is increasingly necessary for high-density AI infrastructure. Water or other cooling fluids circulate through cold plates attached directly to processors, absorbing heat more efficiently than air cooling. This enables higher power densities and more efficient heat removal but requires plumbing infrastructure throughout the data centre, introduces leak risks, and complicates maintenance. It’s engineering that works but adds complexity and cost compared to simpler air-cooled designs.

The heat doesn’t disappear after being removed from the equipment. It’s transferred to cooling water, which must be cooled by evaporation, chillers, or heat exchangers that ultimately reject the heat to the environment. A data centre consuming 50 megawatts of electricity produces 50 megawatts of waste heat that must be continuously dissipated to the atmosphere. This requires cooling towers, chillers, or other heat rejection infrastructure that itself consumes significant power and water.

Water consumption for cooling is substantial, particularly for evaporative cooling which is common in warm climates. A large data centre can consume millions of litres of water daily for cooling, which becomes problematic in regions with water scarcity or during droughts. Some facilities use treated wastewater or seawater for cooling, which helps but introduces corrosion and maintenance challenges. The water consumption has led to conflicts with local communities who question whether training AI models justifies consuming water resources that might otherwise serve residential or agricultural needs.
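
As a rough check on those volumes, the sketch below estimates how much water a purely evaporative system would boil off to reject 50 megawatts of heat, using the latent heat of vaporisation of water. Real facilities evaporate only part of their heat load this way, so treat it as an illustrative bound rather than a measured figure:

```python
# Water evaporated to reject waste heat, assuming all heat leaves
# by evaporation (an idealisation; real cooling systems are mixed).

WASTE_HEAT_W = 50e6            # 50 MW of waste heat (from the text)
LATENT_HEAT_J_PER_KG = 2.26e6  # latent heat of vaporisation of water
SECONDS_PER_DAY = 86_400

kg_per_second = WASTE_HEAT_W / LATENT_HEAT_J_PER_KG
litres_per_day = kg_per_second * SECONDS_PER_DAY  # 1 kg of water ≈ 1 litre

print(f"{kg_per_second:.1f} kg/s "
      f"≈ {litres_per_day / 1e6:.1f} million litres per day")
```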

Sustainability contradictions

The AI industry’s environmental claims frequently conflict with its infrastructure reality. Companies announce carbon neutrality commitments while building data centres that consume electricity at rates previously unimaginable. They purchase renewable energy credits while drawing power from grids that are substantially fossil-fuel based. They discuss efficiency improvements while deploying equipment that consumes more total power than ever before.

Renewable energy procurement is the primary mechanism by which AI companies claim environmental responsibility. They sign long-term contracts with wind and solar projects, purchase renewable energy certificates, or build on-site renewable generation. These arrangements are legitimate accounting mechanisms that support renewable energy development. They don’t change the fact that data centres draw instantaneous power from electrical grids that supply whatever generation is available, which is often fossil fuels during peak demand or when renewables aren’t generating.

The temporal mismatch is fundamental. AI training runs continuously, requiring steady power supply regardless of weather or time of day. Solar generates during daylight, wind generates when it’s windy, and neither aligns perfectly with constant data centre demand. Battery storage could bridge these gaps but adding gigawatt-hours of storage to enable renewable-powered AI infrastructure is expensive and introduces its own environmental concerns about battery production and disposal.
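
To put rough numbers on that gap, the sketch below sizes the battery needed to carry a 50 megawatt facility (the lower end of the large-facility range mentioned earlier) through a 12-hour stretch with no renewable generation. The outage window and the installed storage cost per kilowatt-hour are assumptions for illustration, not quoted figures:

```python
# Battery storage needed to bridge a renewable generation gap.

FACILITY_MW = 50.0                 # large AI data centre (from the text)
GAP_HOURS = 12.0                   # assumed windless, sunless stretch
STORAGE_COST_EUR_PER_KWH = 300.0   # assumed installed cost of grid storage

energy_mwh = FACILITY_MW * GAP_HOURS                    # 600 MWh
capital_eur = energy_mwh * 1000 * STORAGE_COST_EUR_PER_KWH

print(f"Storage needed: {energy_mwh:.0f} MWh "
      f"(≈ €{capital_eur / 1e6:.0f} million at the assumed cost)")
```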

Geographic arbitrage is increasingly common. Data centres locate in regions with abundant hydroelectric power, geothermal energy, or other reliable renewable sources. Iceland, Norway, and parts of Canada have become attractive for data centres because renewable energy is abundant and cold climates reduce cooling requirements. This works for organisations that can tolerate latency from data centres being distant from major population centres, which AI training can but many other applications cannot.

Efficiency improvements are real but insufficient. Each generation of AI accelerators performs more computation per watt than its predecessor. Software optimisations reduce the computation required for equivalent results. These improvements matter and should continue, but they’re overwhelmed by the growth in total AI workload. Training GPT-4 was more efficient per parameter than training GPT-3, but GPT-4 was substantially larger and trained more extensively, so total energy consumption increased dramatically.

The fundamental tension is that AI requires massive computation, computation requires energy, and energy production has environmental consequences. Claiming AI development is environmentally sustainable requires either arguing that the computation is necessary and valuable enough to justify the environmental impact or demonstrating that the energy comes from truly renewable sources that could not be better used displacing fossil fuel consumption elsewhere. Neither argument is obviously true, and both are debated extensively by people with varying degrees of financial interest in AI development continuing unimpeded.

Infrastructure constraints and bottlenecks

Building data centres capable of supporting large-scale AI workloads faces physical limitations that money alone cannot quickly overcome. Electrical grid capacity, cooling water availability, and suitable locations with both are limited resources that constrain how quickly AI infrastructure can expand.

Electrical grid connections capable of delivering tens to hundreds of megawatts require substantial lead time. Utilities must assess capacity, potentially upgrade distribution infrastructure, and ensure that adding a major new load won’t destabilise the grid. In regions where electricity demand already approaches supply, utilities may refuse new large loads or require years of infrastructure development before supply can be increased. This is why data centre projects often announce locations years before facilities become operational.

Cooling water availability limits where high-power data centres can be built. Facilities requiring millions of litres daily for cooling can only be sited near reliable water sources whose suppliers are willing to provide the required volumes. Regions with water scarcity, regulatory restrictions on water consumption, or environmental protections on water bodies may not permit new large data centres regardless of their willingness to pay for water access.

Physical space suitable for data centre construction is more abundant than power or water but still involves constraints. Facilities need relatively flat, stable terrain with good connectivity to fibre optic networks. They should be in locations with low natural disaster risk, stable political environments, and supportive regulatory frameworks. These requirements eliminate many potential locations even before considering power and water availability.

The concentration of AI infrastructure in specific regions reflects these physical constraints. Certain areas of Virginia, Oregon, and Iowa in the United States have become data centre clusters because they have adequate power, cooling, and connectivity. Similar clusters exist in Ireland, Singapore, and other locations with favourable conditions. This geographic concentration creates competition for limited resources and drives up costs as multiple data centres vie for the same power and water supplies.

Expansion timelines are measured in years. Planning, permitting, construction, and commissioning of a major data centre typically take three to five years from initial proposal to operational status. This means organisations must predict their infrastructure needs years in advance and hope their predictions are accurate. In a field developing as rapidly as AI, where capabilities, business models, and competitive dynamics change quickly, predicting requirements years ahead is speculation dressed as planning.

Cost implications and trade-offs

The power and cooling requirements of AI infrastructure create substantial ongoing costs that affect the economics of AI development and deployment. These costs influence which organisations can afford to develop frontier AI models, what business models are economically viable, and whether AI applications that seem technically feasible are commercially practical.

Operating costs for power and cooling accumulate into substantial sums over the life of long-lived AI infrastructure. Using the figures above, a 10,000-GPU cluster drawing 14 megawatts consumes roughly €50 million to €90 million worth of electricity over a five-year lifespan, on top of a purchase price measured in hundreds of millions of euros. The operational costs are recurring and unavoidable as long as the equipment runs. This makes the economics of AI infrastructure both capital-intensive and operationally expensive, which is the least attractive quadrant of infrastructure economics.
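
A minimal sketch of that comparison, reusing the 14 megawatt facility and the €0.08 to €0.15 per kilowatt-hour rates from earlier; the per-GPU hardware price is an assumption for illustration, not a quoted figure:

```python
# Five-year electricity spend versus hardware purchase price for the
# 14 MW, 10,000-GPU cluster used throughout this section.

FACILITY_MW = 14.0
RATES_EUR_PER_KWH = (0.08, 0.15)
YEARS = 5
GPU_COUNT = 10_000
ASSUMED_PRICE_PER_GPU_EUR = 30_000   # assumption, not a quoted figure

hardware_eur = GPU_COUNT * ASSUMED_PRICE_PER_GPU_EUR
kwh_total = FACILITY_MW * 1000 * 8760 * YEARS

for rate in RATES_EUR_PER_KWH:
    electricity_eur = kwh_total * rate
    share = electricity_eur / (electricity_eur + hardware_eur)
    print(f"€{rate:.2f}/kWh: electricity €{electricity_eur / 1e6:.0f}M "
          f"vs hardware €{hardware_eur / 1e6:.0f}M "
          f"({share:.0%} of the combined total)")
```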

Location decisions are increasingly driven by power costs rather than proximity to customers or talent. Building data centres in remote locations with cheap electricity saves operational costs but increases latency for users and complicates recruitment of engineering staff. This trade-off makes sense for AI training which tolerates latency but is less suitable for inference workloads serving latency-sensitive applications.

Inference efficiency becomes critical at scale. Training costs are paid once per model regardless of how many users eventually use the model. Inference costs scale with usage, so inefficient inference becomes prohibitively expensive at scale. This drives substantial engineering effort into inference optimisation, model compression, and techniques that reduce computational requirements per query. The physics of power consumption makes efficiency engineering economically essential rather than just nice to have.

Utilisation rates matter enormously when power is a major cost. An idle GPU still draws power, and the cooling and facility overhead continue regardless of whether any useful computation is happening, so hardware that sits waiting for work burns money while producing nothing. Maximising utilisation by keeping hardware busy continuously is economically necessary but operationally challenging. This drives multi-tenancy, batch scheduling, and workload orchestration that ensure expensive hardware is constantly productive rather than waiting for work.

Energy arbitrage opportunities exist for organisations that can tolerate scheduling flexibility. Training models when renewable energy is abundant and cheap rather than during peak demand when electricity is expensive could reduce costs substantially. This requires sophisticated scheduling systems and tolerance for variable training timelines but could provide significant cost advantages. Few organisations currently exploit this opportunity systematically, but economic pressure may drive adoption as power costs increase.
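
A highly simplified sketch of the idea: decide hour by hour whether to run or pause a training job based on an electricity price threshold. The price series, the threshold, and the facility figure are invented for illustration, and a real scheduler would also have to account for checkpointing overhead, deadlines, and idle power:

```python
# Toy price-aware scheduler: train only in hours where the electricity
# price is below a threshold. Prices and threshold are invented.

HOURLY_PRICES_EUR_PER_KWH = [
    0.05, 0.04, 0.04, 0.05, 0.07, 0.11,   # cheap overnight hours
    0.14, 0.16, 0.15, 0.12, 0.09, 0.06,   # expensive daytime peak
]
PRICE_THRESHOLD = 0.10   # pause training when power costs more than this
FACILITY_MW = 14.0

hours_run, spend = 0, 0.0
for price in HOURLY_PRICES_EUR_PER_KWH:
    if price <= PRICE_THRESHOLD:           # cheap hour: train
        hours_run += 1
        spend += FACILITY_MW * 1000 * price
    # expensive hour: checkpoint and wait (idle costs ignored in this toy)

print(f"Trained {hours_run}/{len(HOURLY_PRICES_EUR_PER_KWH)} hours "
      f"for €{spend:,.0f} in electricity")
```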

The path forward (such as it is)

The power and cooling challenges facing AI infrastructure have no simple solutions. Physics constrains what’s possible, economics constrains what’s practical, and environmental concerns constrain what’s acceptable. The path forward involves incremental improvements across multiple dimensions rather than breakthrough solutions that eliminate the fundamental tensions.

Hardware efficiency will continue improving. Each generation of AI accelerators performs more computation per watt, which helps but doesn’t eliminate the power consumption problem because total computation keeps increasing. Expecting efficiency gains alone to solve power consumption issues requires believing that efficiency improvements will outpace growth in AI workloads, which hasn’t happened historically and seems unlikely given current AI development trajectories.

Renewable energy adoption will increase as companies respond to environmental pressure and as renewable energy becomes cost-competitive with fossil fuels. This addresses carbon emissions but doesn’t reduce power consumption or eliminate cooling requirements. It makes AI infrastructure more environmentally defensible but doesn’t change the fundamental physics of heat generation and removal.

Cooling technology will advance. Liquid cooling is becoming standard for high-density AI infrastructure. More exotic approaches like immersion cooling, where entire servers are submerged in dielectric fluid, are being deployed in some facilities. These technologies enable higher power densities and more efficient heat removal but add complexity and cost. They’re improvements over air cooling but don’t eliminate the need to dissipate enormous amounts of heat to the environment.

Waste heat utilisation is occasionally discussed but rarely implemented. Using data centre waste heat for district heating, industrial processes, or other applications could improve overall efficiency and provide some economic return from heat that's currently just disposed of. The practical challenges include distance from heat consumers, temperature requirements, and economic viability of heat distribution infrastructure. A few facilities have implemented waste heat utilisation successfully, but it remains rare rather than standard practice.

Regulatory pressure may eventually constrain data centre growth in power- or water-constrained regions. Governments facing electricity or water scarcity might limit new data centre construction or impose efficiency requirements. Environmental regulations might restrict cooling water consumption or mandate renewable energy usage. These constraints would force the industry to prioritise efficiency and locate facilities in regions with abundant resources rather than optimal network positioning.

The realistic outlook is that AI infrastructure will continue consuming enormous amounts of power, generating corresponding heat, and creating environmental concerns that are managed through renewable energy procurement, efficiency improvements, and geographic arbitrage rather than actually being solved. The power and cooling requirements of AI are fundamental consequences of the physics of computation and cannot be eliminated through software optimisation or algorithmic cleverness. They can only be managed through engineering that accepts the physics and builds infrastructure accordingly, which is expensive, environmentally impactful, and ultimately limited by the availability of power, cooling capacity, and locations where both are abundant.

The AI industry’s environmental footprint will remain a source of controversy as long as AI development continues at current trajectories. Companies will claim sustainability through renewable energy procurement while consuming power at scales that strain electrical grids. Environmental advocates will question whether training increasingly large AI models justifies the environmental impact. Engineers will continue designing infrastructure that pushes the boundaries of power and cooling technology while physics maintains its traditional position of being completely indifferent to everyone’s preferences, budget constraints, or environmental commitments.