For most of the past three years, the story of AI infrastructure has been written almost entirely around the GPU. NVIDIA’s H100, its successor the B200, and the Blackwell Ultra — these are the chips that trained GPT-4, that run the large language models powering everything from customer service chatbots to scientific research tools, and that have made NVIDIA the world’s most valuable semiconductor company and, briefly in 2025, the most valuable company of any kind on earth. The GPU is the engine of the AI era. That is the received wisdom. And for the specific task of training large neural networks, it is broadly correct.
What is becoming increasingly apparent — and what a landmark deal announced on Friday, April 24 between Meta Platforms and Amazon Web Services makes concrete — is that the next phase of AI deployment does not primarily require GPUs. It requires something different, something cheaper, something that handles not the act of building intelligence but the act of deploying it at the scale of billions of interactions. It requires CPUs. And the deal Meta has just struck for tens of millions of AWS Graviton5 CPU cores, valued at billions of dollars over multiple years, is the clearest single statement yet that the AI industry’s centre of gravity is shifting.
The Deal: What Was Agreed and at What Scale
The announcement is specific in its commercial parameters. Meta will use AWS Graviton5 central processing units — Amazon’s fifth-generation custom ARM-based server chip, built on a 3-nanometer manufacturing process and containing 192 Neoverse V3 cores per chip — to power CPU-intensive workloads across its agentic AI services. The deployment begins with tens of millions of Graviton5 cores. CNBC reported the contract covers hundreds of thousands of individual chips over at least three years. An AWS executive told Reuters the deal would be worth billions of dollars. With the option to expand as Meta’s AI capabilities grow, the agreement represents one of the largest CPU procurement commitments in the history of cloud computing.
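The disclosed parameters can be cross-checked against each other. A quick sanity calculation (illustrative figures, not disclosed numbers) shows that "tens of millions of cores" at 192 cores per chip is consistent with CNBC's "hundreds of thousands of chips":

```python
# Sanity check on the reported deal parameters: at 192 Neoverse V3 cores
# per Graviton5 chip, tens of millions of cores implies chip counts in the
# hundreds of thousands. The core totals below are assumptions for
# illustration, not figures from the announcement.
cores_per_chip = 192

for cores in (20_000_000, 50_000_000, 100_000_000):
    chips = cores // cores_per_chip
    print(f"{cores:>11,} cores -> ~{chips:,} chips")
```

At 20 million cores the deployment already implies roughly 100,000 chips; at 50 million, roughly a quarter of a million.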
Santosh Janardhan, Meta’s head of infrastructure, said: “AWS has been a trusted cloud partner for years, and expanding to Graviton allows us to run the CPU-intensive workloads behind agentic AI with the performance and efficiency we need at our scale.”
Nafea Bshara, AWS vice president and distinguished engineer, told reporters that Meta chose Graviton5 “for price performance” despite having “access to so many options from the supply side.”
That last phrase — “access to so many options from the supply side” — is the most revealing sentence in the announcement. Meta is one of the largest buyers of compute infrastructure on earth. It has active relationships with NVIDIA, AMD, Arm, CoreWeave, Nebius, and Broadcom. It is building its own internal training and inference accelerator chip. It has been among the first to deploy NVIDIA’s standalone Grace CPUs at scale and has announced plans to adopt NVIDIA’s upcoming 88-core Vera CPUs. The fact that Meta, with all of those options available, chose to commit billions of dollars to Graviton5 is not a passive procurement decision. It is a deliberate architectural statement about what it believes agentic AI compute actually requires.
Why CPUs and Why This Specific CPU
The Graviton5 is architecturally well-suited to the workloads Meta is describing, and understanding why requires understanding what agentic AI actually demands of compute hardware.
Training a large language model — the task that made GPU infrastructure famous — is fundamentally a parallel mathematics problem. It requires thousands of processors performing the same floating-point calculations simultaneously, at maximum throughput, for weeks. GPUs excel at this because they are designed for exactly this kind of massively parallel numerical computation.
Deploying an AI agent — the task that Meta is building toward — is structurally different. An AI agent does not perform one calculation billions of times. It performs many different calculations in sequence: it reads context, decides on an action, calls an external tool or API, processes the result, decides on a next step, manages memory across a long interaction, and coordinates with other agents working on parallel sub-tasks. This is closer to the work of a general-purpose processor than to the work of a GPU. Amazon’s public pitch says Graviton5 is built for CPU-heavy agentic workloads: real-time reasoning, code generation, search, and multi-step orchestration.
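The sequential, branchy character of that workload can be sketched in a few lines. The following is a hypothetical toy loop, not Meta's or AWS's actual code: the tool names, functions, and step logic are all invented for illustration. The point is structural: each step depends on the previous one, touches shared state, and calls out to external services, which is general-purpose CPU work rather than parallel matrix arithmetic.

```python
# Hypothetical agentic control loop (illustrative only). Every name here is
# invented for the sketch: sequential decisions, tool calls, and memory
# management, i.e. the kind of work that suits general-purpose cores.

def call_tool(name, arg):
    """Stand-in for an external tool/API call (search, summarisation, etc.)."""
    tools = {
        "search": lambda q: f"results for '{q}'",
        "summarise": lambda text: text[:20] + "...",
    }
    return tools[name](arg)

def run_agent(task, max_steps=4):
    memory = []  # state carried across the interaction
    for _ in range(max_steps):
        # 1. Read context and decide the next action (branchy, sequential logic)
        if not memory:
            action, arg = "search", task
        else:
            action, arg = "summarise", memory[-1]
        # 2. Call the tool and record the result
        result = call_tool(action, arg)
        memory.append(result)
        # 3. Decide whether the task is complete
        if action == "summarise":
            return memory
    return memory

trace = run_agent("graviton5 specs")
print(len(trace))  # one search step, then one summarise step
```

Nothing in this loop is a candidate for GPU acceleration; it is control flow, state, and I/O, which is exactly the profile the article describes.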
The Graviton5’s specific technical characteristics reinforce its suitability for these workloads. Built on a 3nm process, the chip features 192 Neoverse V3 cores, a cache five times larger than the previous generation’s, and up to 25% better performance than its predecessor. The redesigned cache architecture cuts core-to-core communication latency by up to 33%, and the chip adds DDR5 memory and bfloat16 support, enabling better efficiency for AI inference and vector workloads.
The latency reduction is particularly significant for agentic AI. When an AI agent is orchestrating a multi-step task — coordinating sub-agents, managing tool calls, maintaining state across a long session — the speed at which different processing cores can communicate with each other directly affects the responsiveness of the overall system. A 33% reduction in inter-core latency at 192 cores per chip, deployed across tens of millions of cores, compounds into a material throughput advantage for exactly the kinds of workloads Meta is describing.
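The compounding effect can be made concrete with a back-of-envelope calculation. The per-step timings below are assumptions chosen for illustration, not published benchmarks; only the 33% latency figure comes from the announcement:

```python
# Back-of-envelope sketch: if each step of a multi-step agent task spends
# part of its time on inter-core communication, a 33% cut in that latency
# compounds across steps into an end-to-end speedup. All timings are
# illustrative assumptions.
steps = 10           # tool calls / hand-offs in one agent task (assumed)
compute_ms = 5.0     # per-step compute time (assumed)
comms_ms = 3.0       # per-step inter-core communication time (assumed)

baseline = steps * (compute_ms + comms_ms)
improved = steps * (compute_ms + comms_ms * (1 - 0.33))

print(f"baseline: {baseline:.1f} ms, improved: {improved:.1f} ms")
print(f"end-to-end speedup: {baseline / improved:.2f}x")
```

Under these assumed timings the chip-level latency cut translates into a speedup of roughly 1.14x per task; multiplied across tens of millions of cores and billions of interactions, that margin is material.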
What This Means for AWS and for NVIDIA
For Amazon, the deal is a public proof point of existential importance for its custom silicon strategy. It lets Amazon showcase a huge AI customer as validation for its homegrown CPUs, which compete with NVIDIA’s new Vera CPU: likewise ARM-based, and likewise designed for agentic AI workloads.
AWS has been developing its Graviton line since 2018, iterating through five generations of increasingly capable custom silicon. The programme has always been strategically rational — building its own chips allows AWS to optimise price-performance for its specific cloud workloads without depending on Intel or AMD’s product roadmaps. But the Meta deal transforms Graviton from an internal efficiency programme into something more commercially significant: the chip that the world’s most aggressive AI company chose, from a menu that included nearly every major semiconductor provider, for its most important next-generation workload.
That competitive framing matters. NVIDIA has been extending its reach beyond GPUs precisely because it recognises that the agentic AI era requires a different compute stack. The Meta-AWS deal confirms that AWS is competing for that same territory — and winning a reference customer of unambiguous scale in the process.
For Intel and AMD, the deal carries a different and more concerning message. Both companies have staked significant parts of their data centre roadmaps on the assumption that the AI era would drive demand for their x86-architecture server processors. The Graviton5 is ARM-based. Meta’s own internal chip programme is ARM-based. NVIDIA’s Vera CPU is ARM-based. The architectural shift from x86 to ARM in the data centre — which has been a thesis among semiconductor analysts for several years — is hardening into documented procurement reality.
Meta’s $200 Billion Compute Procurement Campaign
The AWS deal is one component of a procurement strategy that, assembled in full, is staggering in scale: a campaign exceeding $200 billion that spans NVIDIA ($50 billion), AMD ($60 billion), CoreWeave ($35 billion), Nebius ($27 billion), Broadcom (custom MTIA silicon through 2029), and now Amazon — reflecting Meta’s conclusion that its AI compute demand exceeds what any single supply chain can deliver.
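The itemised figures also show why the total is stated as "exceeding" $200 billion rather than as a precise sum. The disclosed commitments add up as follows (the Broadcom and Amazon values were not itemised in the reporting):

```python
# Arithmetic on the disclosed figures (in $ billions). The itemised
# commitments sum to $172B; the balance of the $200B+ total comes from the
# Broadcom/MTIA programme and the new AWS deal, whose dollar values were
# not itemised in the reporting.
disclosed = {"NVIDIA": 50, "AMD": 60, "CoreWeave": 35, "Nebius": 27}

itemised_total = sum(disclosed.values())
print(itemised_total)        # 172
print(200 - itemised_total)  # at least 28 across Broadcom and AWS
```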
That conclusion — that no single supplier can meet the scale of what Meta needs — is itself an extraordinary statement about the dimensions of the AI infrastructure buildout underway. Meta has some 3.4 billion users across its family of apps. Its AI systems — the recommendation algorithms serving Facebook and Instagram, the generative AI features in WhatsApp and Messenger, the Llama family of open-source models, and the agentic AI systems now being built on top of them — operate at a scale of interaction that has no direct comparable in the history of consumer technology.
In March, Arm disclosed that it worked closely with Meta to design its first branded data centre chip, the “AGI CPU,” which packs 136 Neoverse V3 cores into a 300-watt package. That chip won’t reach Meta’s data centres until later this year, but analysts note that the architectural similarities between Arm’s AGI CPU and Amazon’s Graviton5 mean Meta can run workloads on AWS in the interim and then bring them in-house once Arm’s silicon is ready.
That sequencing — Graviton5 as a bridge to Meta’s own ARM-based custom silicon — may be the most strategically interesting dimension of the deal. Meta is not simply buying compute capacity from AWS. It is potentially running its agentic AI workloads on architecturally compatible chips while its own custom silicon matures, building operational experience with ARM-based agentic infrastructure that will transfer directly when the in-house chip is ready.
In that reading, the Meta-AWS deal is as much a product development strategy as it is a procurement decision. And what it is developing is the compute stack for the next phase of AI — the agentic phase, where intelligence is not simply generated but persistently deployed, reasoning, acting, and coordinating across billions of simultaneous user interactions.
The GPU built the models. The CPU will run the agents. That is what Friday’s deal is telling the market.
Written by Shalin Soni, CMA specializing in financial analysis, global markets, and corporate strategy, with hands-on experience in financial planning and analytical decision-making.
Source: Based on Reuters and publicly available financial information.