The Carbon Cost of Intelligence: AI's Energy Problem
Training a single frontier AI model can generate as much CO₂ as five cars over their entire operational lifetimes. Inference compounds that cost daily at scale. Here is the honest accounting.
In 2019, Emma Strubell and colleagues at the University of Massachusetts Amherst published a paper that sent a jolt through the AI research community.[1] They had calculated the carbon footprint of training a single large NLP model — a transformer with neural architecture search — and found it produced approximately 626,000 pounds of CO₂ equivalent. That is roughly five times the lifetime emissions of an average American car, including its manufacture.
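Later figures in this piece are quoted in tonnes, so it helps to put the Strubell number on the same scale. The sketch below is illustrative only. The 126,000 lb car-lifetime value is the comparison figure reported in the Strubell paper rather than something stated above, and the function and constant names are made up for this example.

```python
LB_TO_KG = 0.45359237  # exact definition of the international pound


def pounds_to_tonnes(pounds: float) -> float:
    """Convert pounds to metric tonnes."""
    return pounds * LB_TO_KG / 1000


NAS_TRAINING_LB = 626_000  # CO2e figure quoted in the text
CAR_LIFETIME_LB = 126_000  # assumption: car-lifetime value from Strubell et al.

nas_tonnes = pounds_to_tonnes(NAS_TRAINING_LB)
car_ratio = NAS_TRAINING_LB / CAR_LIFETIME_LB

print(f"{nas_tonnes:.0f} tonnes CO2e, about {car_ratio:.1f} car lifetimes")
```

On these inputs the training run comes to roughly 284 tonnes of CO₂e, which is the number to keep in mind when tonne-denominated estimates for later models appear.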
The paper was published before GPT-3, before the current era of trillion-parameter models trained on clusters of thousands of accelerators running for weeks or months. The numbers since then have not improved in absolute terms. They have grown.
The two-part carbon problem
AI's energy consumption splits into two distinct categories with different magnitudes and different trajectories: training and inference.
Training is expensive and infrequent. A frontier model is trained once, or periodically updated, consuming vast compute over weeks. Patterson et al. (2021) at Google estimated the carbon footprint of training GPT-3 at approximately 552 tonnes of CO₂e, a figure that depends on the carbon intensity of the electricity used.[2] More recent frontier models almost certainly exceed this figure, though the major labs have largely stopped publishing training costs in ways that allow independent verification.
Inference is cheap per query and enormous in aggregate. Every time a user sends a message to a deployed language model, a data centre processes it. ChatGPT, at its 2023 peak, was handling an estimated 10 million queries per day. At typical inference energy costs, that aggregates to significant continuous power draw. The International Energy Agency estimated in 2024 that AI data centres could account for 1–1.5% of global electricity consumption by 2026, potentially doubling to 3% by 2030.[3]
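To see how a small per-query cost turns into a continuous load, here is a back-of-envelope sketch. The query volume is the estimate quoted above. The per-query energy values are hypothetical placeholders chosen to span a plausible range, not measurements, and all names are invented for illustration.

```python
QUERIES_PER_DAY = 10_000_000  # estimate quoted in the text

# Hypothetical per-query energy in watt-hours: placeholders, not measured values.
WH_PER_QUERY_SCENARIOS = {"low": 0.3, "mid": 1.0, "high": 3.0}


def daily_energy_mwh(queries_per_day: int, wh_per_query: float) -> float:
    """Total inference energy per day, in megawatt-hours."""
    return queries_per_day * wh_per_query / 1e6


def average_power_mw(daily_mwh: float) -> float:
    """Continuous power draw that would deliver the daily energy over 24 hours."""
    return daily_mwh / 24


for name, wh in WH_PER_QUERY_SCENARIOS.items():
    mwh = daily_energy_mwh(QUERIES_PER_DAY, wh)
    print(f"{name}: {mwh:.1f} MWh/day, {average_power_mw(mwh):.2f} MW average")
```

The result scales linearly with both inputs, so a tenfold rise in query volume or in per-query cost moves the total by the same factor.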
Where the energy actually goes
The energy intensity of large model training and inference is dominated by the compute hardware: primarily GPUs and, increasingly, specialised AI accelerators such as Google's TPUs and custom chips from Amazon, Microsoft, and Meta. Modern data centre GPUs (NVIDIA H100, H200) draw between 300 and 700 W each under load. A single training run for a frontier model may use 10,000–30,000 such GPUs simultaneously.
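Multiplying those ranges out gives a sense of scale. The sketch below uses only the GPU counts and per-GPU wattages from the paragraph above. The 30-day run length is an assumed value for illustration, and the calculation ignores CPUs, networking, storage, and cooling.

```python
def cluster_power_mw(num_gpus: int, watts_per_gpu: float) -> float:
    """Power drawn by the accelerators alone, in megawatts."""
    return num_gpus * watts_per_gpu / 1e6


def cluster_energy_mwh(num_gpus: int, watts_per_gpu: float, days: float) -> float:
    """Accelerator energy over a run of the given length, in megawatt-hours."""
    return cluster_power_mw(num_gpus, watts_per_gpu) * 24 * days


low_mw = cluster_power_mw(10_000, 300)   # smallest cluster, lowest draw
high_mw = cluster_power_mw(30_000, 700)  # largest cluster, highest draw

ASSUMED_RUN_DAYS = 30  # hypothetical run length, not from the text
low_mwh = cluster_energy_mwh(10_000, 300, ASSUMED_RUN_DAYS)
high_mwh = cluster_energy_mwh(30_000, 700, ASSUMED_RUN_DAYS)

print(f"Power: {low_mw:.0f}-{high_mw:.0f} MW")
print(f"Energy over {ASSUMED_RUN_DAYS} days: {low_mwh:.0f}-{high_mwh:.0f} MWh")
```

The accelerators alone land somewhere between 3 and 21 MW on these figures, before any facility overhead is counted.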
In conventional facilities, cooling can account for a further 30–40% of total data centre energy consumption. The Power Usage Effectiveness (PUE) metric, defined as total facility power divided by IT equipment power, has improved substantially over the past decade, with hyperscale data centres now achieving PUE values below 1.2. At that level, cooling and all other overhead together make up less than about 17% of facility energy. But improvements in efficiency are being outpaced by growth in scale.
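The PUE definition translates directly into a few lines of code. This is a minimal sketch with invented function names, and the example power values are hypothetical. It also shows how a PUE value maps to the share of facility energy that never reaches the IT equipment.

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def overhead_share(pue_value: float) -> float:
    """Fraction of total facility energy spent on cooling and other overhead."""
    return (pue_value - 1.0) / pue_value


# Hypothetical facility: 12 MW total draw, 10 MW of it reaching IT equipment.
example_pue = pue(12_000, 10_000)

for value in (2.0, 1.5, 1.2, 1.1):
    print(f"PUE {value}: {overhead_share(value):.0%} of facility energy is overhead")
```

A PUE of 2.0 means half of all energy is overhead. At 1.2 the overhead share falls to about one sixth, which is why further PUE gains yield diminishing returns compared with the growth in IT load itself.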
Water consumption is a less-discussed but equally significant concern. Data centres use water for cooling — either directly in cooling towers or indirectly through electricity generation. A 2023 study estimated that training GPT-3 consumed approximately 700,000 litres of freshwater.[4] In water-stressed regions, where data centre construction has accelerated due to land availability and lower electricity costs, this is a material environmental impact.
The efficiency counter-argument
The straightforward framing of AI's energy problem invites a straightforward counter-argument: AI also enables energy savings. Climate modelling, grid optimisation, materials science for battery development, smart building management — the applications of AI to energy efficiency and climate science are genuinely significant.
Google has reported using DeepMind's AI to reduce cooling energy in its data centres by 40%.[5] AI-assisted weather forecasting models have achieved performance matching traditional numerical weather prediction at a fraction of the compute cost. The IPCC and other climate bodies have identified AI as a potentially significant tool in climate mitigation, though they also note the emissions from AI infrastructure as a countervailing factor.[6]
The honest assessment is that both things are true: AI consumes significant and growing energy, and AI enables significant and growing energy efficiency. The net sign of the impact depends on the specific application, the energy source, and the counterfactual — what would have happened without AI. That calculation varies by case and is often not done rigorously.
"The machine learning community has prioritised accuracy over efficiency. Reporting accuracy without reporting energy cost is incomplete science."
— Strubell et al., Energy and Policy Considerations for Deep Learning in NLP, ACL 2019[1]
Efficiency research: doing more with less
Within the AI research community, a growing body of work is focused on reducing the compute and energy requirements of capable models. Key directions include:
- Model distillation — training smaller "student" models to replicate the behaviour of larger "teacher" models, achieving comparable performance at significantly lower inference cost.[7]
- Quantisation — representing model weights in lower-precision formats (4-bit, 8-bit integers rather than 16- or 32-bit floats), reducing memory bandwidth and enabling inference on less powerful hardware.
- Sparse activation — Mixture-of-Experts (MoE) architectures activate only a subset of model parameters for each token, dramatically reducing the compute required per forward pass while maintaining high effective parameter counts.
- Efficient architectures — State Space Models (SSMs) like Mamba offer linear rather than quadratic scaling with sequence length, potentially replacing attention mechanisms for certain use cases at lower energy cost.
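Of the directions listed above, quantisation is the easiest to show in a few lines. The sketch below is a simplified symmetric per-tensor int8 scheme written in plain Python for readability. It is not the method of any particular library, and production systems typically use per-channel or per-group scales and more careful calibration. The example weights are made up.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.82, -1.0, 0.031, 0.0, 0.4]  # hypothetical weights
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", q)
print(f"scale: {scale:.6f}, worst-case error: {max_error:.6f}")
```

Each weight now needs one byte instead of the four used by a 32-bit float, at the cost of a rounding error bounded by half the scale factor. Less data moved per weight is where the energy saving comes from.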
The Chinchilla scaling paper from DeepMind (2022) demonstrated that most prior large models were significantly undertrained relative to their parameter count: the compute-optimal training regime requires far more data than was commonly used.[8] Somewhat counter-intuitively, this implies an efficiency gain. For a fixed training compute budget, a smaller model trained on more tokens can match or outperform a larger model trained on fewer, and the smaller model is then cheaper to run every time it is queried.
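A rough calculation makes the trade-off concrete. The sketch below relies on two widely used rules of thumb that do not appear in the text above: training compute of about 6 × parameters × tokens, and inference compute of about 2 × parameters per generated token. The model sizes and token counts are those reported for Gopher and Chinchilla in the Hoffmann et al. paper. Treat the output as order-of-magnitude only.

```python
def training_flops(params: int, tokens: int) -> int:
    """Approximate training compute using the common 6 * N * D rule of thumb."""
    return 6 * params * tokens


def inference_flops_per_token(params: int) -> int:
    """Approximate forward-pass compute per token using the 2 * N rule of thumb."""
    return 2 * params


# Sizes as reported in Hoffmann et al. (2022).
gopher = {"params": 280_000_000_000, "tokens": 300_000_000_000}
chinchilla = {"params": 70_000_000_000, "tokens": 1_400_000_000_000}

train_ratio = training_flops(**chinchilla) / training_flops(**gopher)
inference_ratio = (
    inference_flops_per_token(gopher["params"])
    / inference_flops_per_token(chinchilla["params"])
)

print(f"Chinchilla training compute relative to Gopher: {train_ratio:.2f}x")
print(f"Gopher inference cost relative to Chinchilla: {inference_ratio:.0f}x")
```

Under these approximations the two training runs cost roughly the same, while the smaller model needs about a quarter of the compute per generated token. Since inference is the cost that recurs with every query, that difference accumulates over the model's deployed lifetime.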
The renewable energy question
The carbon impact of AI compute is a direct function of the carbon intensity of the electricity grid it runs on. A data centre powered entirely by renewable energy has a fundamentally different carbon footprint from one drawing on a coal-heavy grid — even if the raw energy consumption is identical.
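The relationship is a single multiplication: emissions equal energy consumed times the carbon intensity of the supply. In the sketch below, the grid intensity values and the 1,000 MWh workload are illustrative assumptions, not figures from any cited source, and the names are invented for this example.

```python
# Illustrative carbon intensities in grams of CO2e per kWh (assumed values).
GRID_INTENSITY_G_PER_KWH = {
    "coal-heavy grid": 800,
    "mixed grid": 400,
    "mostly low-carbon grid": 50,
}


def emissions_tonnes(energy_mwh: float, intensity_g_per_kwh: float) -> float:
    """CO2e in tonnes for a given energy use and grid carbon intensity."""
    kwh = energy_mwh * 1000
    return kwh * intensity_g_per_kwh / 1e6  # grams to tonnes


WORKLOAD_MWH = 1_000  # hypothetical workload, identical across all three grids

for grid, intensity in GRID_INTENSITY_G_PER_KWH.items():
    tonnes = emissions_tonnes(WORKLOAD_MWH, intensity)
    print(f"{grid}: {tonnes:.0f} tonnes CO2e")
```

With these placeholder intensities the same workload differs sixteenfold in emissions between the cleanest and dirtiest grid, which is why siting and procurement decisions can matter as much as hardware efficiency.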
The major hyperscalers have made substantial commitments to renewable energy purchasing. Microsoft, Google, and Amazon all report 100% renewable energy matching for their data centre operations, though the accounting methodology — time-matching, geographic matching, additionality — matters considerably and is not standardised.[9]
The growth in AI-driven data centre demand is creating new pressures. Several major AI operators have announced or renewed interest in nuclear power, including small modular reactors (SMRs), as a source of reliable, low-carbon baseload power. In September 2024, Microsoft signed a 20-year power purchase agreement with Constellation Energy tied to restarting Unit 1 of the Three Mile Island nuclear plant, a conventional reactor rather than an SMR. Whether this represents genuine decarbonisation or carbon accounting that obscures continued dependence on grid power of varying quality is an active debate.
What accountability looks like
The minimum standard of accountability the research community has converged on — and which the Strubell paper originally advocated — is mandatory reporting of energy consumption and carbon emissions alongside accuracy metrics in published research. This norm has been partially adopted but is not universal.
At the deployment level, transparency about the energy and water cost of AI inference is essentially absent from consumer-facing products. A user interacting with a language model has no indication of the resource cost of their query. Whether this constitutes a genuine accountability gap or is simply an engineering detail below the level of user-facing relevance is contested — but it is notable that the industry that spent a decade emphasising the environmental credentials of cloud computing relative to on-premises hardware has been considerably less forthcoming about the environmental footprint of its AI workloads.
References
[1] Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. ACL 2019. arXiv:1906.02629. arxiv.org/abs/1906.02629
[2] Patterson, D. et al. (2021). Carbon Emissions and Large Neural Network Training. arXiv:2104.10350. arxiv.org/abs/2104.10350
[3] International Energy Agency. (2024). Electricity 2024: Analysis and Forecast to 2026. IEA. iea.org/reports/electricity-2024
[4] Li, P. et al. (2023). Making AI Less "Thirsty": Uncovering and Addressing the Secret Water Footprint of AI Models. arXiv:2304.03271. arxiv.org/abs/2304.03271
[5] DeepMind. (2016). DeepMind AI Reduces Google Data Centre Cooling Bill by 40%. DeepMind Blog. deepmind.google
[6] IPCC. (2022). Climate Change 2022: Mitigation of Climate Change. Contribution of Working Group III to the Sixth Assessment Report. Cambridge University Press. ipcc.ch/report/ar6/wg3/
[7] Hinton, G., Vinyals, O., & Dean, J. (2015). Distilling the Knowledge in a Neural Network. arXiv:1503.02531. arxiv.org/abs/1503.02531
[8] Hoffmann, J. et al. (2022). Training Compute-Optimal Large Language Models. arXiv:2203.15556. arxiv.org/abs/2203.15556
[9] Lannquist, Y. (2023). AI and Climate: Measuring the Environmental Impact of AI. Partnership on AI. partnershiponai.org