Data Center Power Constraints Drive Shift to On-Device GenAI Processing

Menglin Cao, Director Analyst at Gartner

Menglin Cao, Director Analyst at Gartner, highlights that the increasing power demands of generative AI require a transition from data centers to on-device processing. This shift improves energy efficiency, user experience, and data privacy while easing operational constraints on IT infrastructure, enabling more efficient and secure AI processing on endpoint devices.

The rapid expansion of generative AI (GenAI) and large language models (LLMs) has led to an unprecedented demand for computing power. As these AI models grow in complexity, the energy required to support them has placed a significant strain on data centers. Gartner estimates that the power required for data centers to run incremental AI-optimized servers will reach 500 terawatt-hours (TWh) per year in 2027, more than double the levels seen in 2023. This escalating GenAI power demand poses substantial challenges for data center operations, affecting cost, performance, and sustainability.

Gartner forecasts that by 2027, 40% of existing AI data centers will face operational constraints due to power availability. This situation not only impacts the data centers themselves but also has downstream effects on their customers and end users, who may experience increased costs and reduced performance.

The increasing power requirements for GenAI are becoming a critical constraint for IT organizations as well, limiting their ability to deploy GenAI-related products and applications.

Shift to On-Device GenAI Processing
The operational risks associated with data centers’ increasing power consumption will force product leaders to consider offloading more AI inference workloads from data centers to endpoint devices in the future. There are already two strong motivators for on-device GenAI processing: increased responsiveness and data privacy. Now, with the added pressure of data center power limitations, on-device GenAI processing is becoming an even more attractive solution.

Gartner anticipates that by 2026, more GenAI queries will be processed on-device than in the cloud, signalling a significant shift in AI strategy.

On-Device GenAI Processing Necessitates Redesigning Key Technologies
As the landscape evolves, product leaders must reassess their AI strategies to accommodate this shift. Evaluating the best inference approach for distributing GenAI processing workloads on-device is crucial. By embracing on-device GenAI processing, organizations can mitigate the risks associated with data center power constraints while enhancing the overall user experience. This strategic pivot not only addresses current power challenges but also positions organizations to better meet future demands amid a rapidly advancing AI landscape.

The trend toward GenAI processing on endpoint devices, including smartphones, PCs, tablets, XR headsets, wearables, vehicles, robotics, and IoT devices, is gaining momentum, driven by the need for improved user experiences such as enhanced data privacy, lower latency, and faster response times. On-device GenAI processing demands very high energy efficiency because of the limited form factor of endpoint devices, and the operation time and battery life of those devices should not be compromised by additional GenAI features. Meeting these requirements will demand combined, significant improvements in semiconductor, battery, and AI model development.

  • Semiconductors: Energy-efficient chips are essential for real-time processing and lower latency. Specialized AI processors and lower-power memory chips, as well as neural processing unit (NPU)-integrated application processors and microcontroller units (MCUs), are preferred for on-device GenAI. Wide-bandgap semiconductors, such as gallium nitride (GaN), play a crucial role in power conversion for fast chargers. Because local GenAI processing can rapidly deplete battery life, fast charging is a key part of the user experience for GenAI on smartphones, PCs, and other battery-powered personal devices.
  • Batteries: Most endpoint devices such as smartphones, PCs, XR headsets or even wearables will be battery-powered, and on-device GenAI processing will consume more energy from these devices. Enhanced energy density batteries, such as solid-state lithium-ion, will be critical for supporting longer operation times.
  • AI Models: Tailored AI models with smaller parameter sets are needed for local processing on endpoint devices. Light LLMs, with fewer parameters, are suitable for specific tasks and sectors, reducing computational requirements and making them appropriate for endpoint devices where a standard LLM (which could be considered “heavy”) is infeasible.
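To make the "light versus heavy" distinction concrete, the sketch below estimates the memory needed just to hold model weights at different parameter counts and precisions. The specific model sizes (70B and 3B) and precisions (FP16 and INT4) are illustrative assumptions, not figures from the article, but they show why quantized small-parameter models are the ones that fit on endpoint devices.

```python
# Rough memory-footprint estimates for LLM weights at different precisions.
# Model sizes and precisions below are illustrative assumptions, not
# figures from the article. Weight storage alone is estimated; activation
# memory and KV cache would add to these totals.

def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed to hold model weights, in gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

# A "heavy" cloud-scale model vs. a "light" on-device model (assumed sizes).
heavy = weight_memory_gb(70e9, 16)  # 70B parameters at FP16
light = weight_memory_gb(3e9, 4)    # 3B parameters quantized to INT4

print(f"70B @ FP16: {heavy:.1f} GB")  # 140.0 GB: far beyond endpoint devices
print(f"3B  @ INT4: {light:.1f} GB")  # 1.5 GB: plausible on a modern phone
```

The gap of roughly two orders of magnitude is why on-device GenAI pairs smaller parameter sets with aggressive quantization rather than simply shrinking cloud models.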
