Patrick Smith at Pure Storage explains why AI requires extensive energy supplies, and describes the role of data storage in keeping this demand in check.
AI is disrupting nearly every industry, even its own. In early 2025, a Chinese large language model (LLM), DeepSeek, briefly displaced ChatGPT in the public discourse, sparking speculation over a shifting AI power balance and contributing to volatility in tech markets. Around the globe, nations have declared their intentions to become AI superpowers, while hyperscalers are projected to spend $1 trillion on AI-optimised infrastructure by 2028.
Enterprises, too, are investing heavily. In Asia, IDC found that the region’s top 100 companies plan to allocate 50% of their IT budgets to AI. Yet not all projects succeed - Gartner reports that nearly a third of AI initiatives fail to deliver business value.
It’s clear that the AI gold rush cannot be ignored - but participating takes significant investment. So how do organisations maximise the chances of success for their AI projects, and what do they need to consider for the underlying infrastructure?
AI’s demands on compute and storage
AI workloads fall into two broad categories: training, when a model learns from a dataset, and inference, when it applies what it has learned to new data. However, critical steps come even before training, including data collection, preparation and curation. The nature of this data varies widely, from archive data to structured transactional databases, often with inconsistent data governance.
What is consistent is that AI is resource-intensive. The voracious energy and compute appetite of GPU processing during training is well known, and frequent checkpointing only adds to the demands on infrastructure. These checkpoints ensure model recoverability, rollback capability and compliance, further increasing data storage capacity needs and the associated energy consumption.
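As a minimal sketch of why checkpointing drives storage demand, the PyTorch-style loop below saves full model and optimiser state at regular intervals. The toy model, checkpoint interval and file names are illustrative assumptions; checkpoints for genuinely large models can run to hundreds of gigabytes each.

```python
import torch
from torch import nn, optim

# A toy model stands in for an LLM; the checkpointing pattern is the same.
model = nn.Linear(512, 512)
optimizer = optim.AdamW(model.parameters())

def save_checkpoint(step: int, path: str) -> None:
    # Persist model and optimiser state so training can resume or roll
    # back after a failure - this is what drives the extra storage
    # capacity and write bandwidth described above.
    torch.save({
        "step": step,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
    }, path)

for step in range(5_000):
    batch = torch.randn(32, 512)
    loss = model(batch).pow(2).mean()   # dummy training objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if step % 1_000 == 0:               # frequent checkpointing
        save_checkpoint(step, f"checkpoint_{step:06d}.pt")
```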
Retrieval-augmented generation (RAG), which integrates internal datasets into LLMs, introduces additional storage complexity, relying on vectorised data - datasets translated into high-dimensional vectors to enable similarity comparisons. This transformation can inflate the dataset size significantly, sometimes by a factor of 10.
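To make that inflation concrete, the back-of-the-envelope sketch below pairs a text chunk with a dense embedding and runs the similarity comparison that retrieval relies on. The chunk size and embedding dimension are assumptions chosen for illustration, not properties of any particular model.

```python
import numpy as np

# Assumed sizes: ~1 KB of raw text per chunk and a 3,072-dimension
# float32 embedding per chunk (both illustrative).
CHUNK_BYTES = 1_000
EMBEDDING_DIM = 3_072
BYTES_PER_FLOAT32 = 4

vector_bytes = EMBEDDING_DIM * BYTES_PER_FLOAT32      # 12,288 bytes
inflation = (CHUNK_BYTES + vector_bytes) / CHUNK_BYTES
print(f"Each 1 KB chunk grows roughly {inflation:.0f}x once embedded")

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # The similarity comparison at the heart of RAG retrieval.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

query = np.random.rand(EMBEDDING_DIM).astype(np.float32)
document = np.random.rand(EMBEDDING_DIM).astype(np.float32)
print(f"query/document similarity: {cosine_similarity(query, document):.3f}")
```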
Post-training, inference generally requires less compute power, but it still involves ongoing data storage - both for logging results and for the data being analysed.
Power, scale, and trade-offs
AI’s growing energy footprint is another critical factor. Some estimates put AI processing at more than 30 times the energy of traditional task-oriented software, and data centre energy requirements are set to more than double by 2030. At the rack level, power draw has jumped from under 10kW to 100kW - or even more in some AI clusters - driven largely by the demands of high-performance GPUs.
This introduces a trade-off: every watt used by data storage is a watt not available to GPUs. Efficient, high-performance storage is essential to feed data to GPUs at pace while minimising the strain on already constrained power budgets. Data storage can also deliver additional performance gains - for example, through key-value caches, which hold on to frequently accessed data, prompts and conversations to reduce repetitive GPU processing. Cached information can improve responsiveness, even for high-frequency workloads such as RAG, trading and chatbots. Overall, caching can accelerate inference by up to 20 times, maximising GPU efficiency, cutting costs and energy consumption, and enabling scalable, responsive enterprise AI applications.
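As a rough illustration of the idea, the sketch below implements a small least-recently-used prompt cache in Python. Production systems typically cache transformer key/value tensors on fast storage rather than whole responses, and the model call here is a hypothetical stand-in for GPU inference.

```python
import hashlib
from collections import OrderedDict

class PromptCache:
    """LRU cache: repeated prompts are served without touching the GPU."""

    def __init__(self, max_entries: int = 10_000):
        self._cache: OrderedDict[str, str] = OrderedDict()
        self._max_entries = max_entries

    def _key(self, prompt: str) -> str:
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get_or_compute(self, prompt: str, compute) -> str:
        key = self._key(prompt)
        if key in self._cache:
            self._cache.move_to_end(key)       # cache hit: no GPU work
            return self._cache[key]
        result = compute(prompt)               # cache miss: pay the GPU cost
        self._cache[key] = result
        if len(self._cache) > self._max_entries:
            self._cache.popitem(last=False)    # evict least recently used
        return result

# Usage with a hypothetical stand-in for a GPU-backed model call.
cache = PromptCache()
answer = cache.get_or_compute("What is QLC flash?",
                              lambda p: f"[model response to: {p}]")
```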
Storage must keep up
The role of data storage in AI infrastructure is to provide high-throughput, low-latency access to large datasets. Poor storage performance can create GPU bottlenecks, undermining the value of expensive compute hardware.
AI workloads typically require hundreds of terabytes, if not petabytes, of capacity, along with the ability to retrieve data rapidly - whether for training new models, running inference or integrating fresh data sources. This applies not only to real-time needs but also to archival data that may be reused or reprocessed. High-density QLC flash has emerged as an ideal fit for high-performance AI storage, thanks to its combination of speed, capacity, reliability and energy efficiency when used in the right modern storage platform. With QLC, customers can store data on flash at costs approaching those of spinning disk, yet access it at the speeds AI workloads demand.
Integrated AI-ready infrastructure
Some vendors now offer storage systems tailored for AI workloads, including solutions certified to work with Nvidia compute stacks. These may come bundled with optimised RAG pipelines and integrated with Nvidia microservices - simplifying deployment and improving performance consistency.
Strategic infrastructure for AI success
Delivering AI at scale requires more than powerful GPUs. It depends on a foundation of robust, efficient, and responsive infrastructure.
Data storage plays a pivotal role in that foundation. From the earliest stages of data preparation, through training, to customer-facing inference, AI workloads depend on fast, scalable and increasingly energy-conscious storage. Without it, even the best-funded projects risk faltering under the weight of their own complexity.
Patrick Smith is Field CTO EMEA at Pure Storage