I didn't fully understand the technical details in this analysis, but Matt Dratch makes the case that despite the massive AI build-up, we're still massively short on compute:
The core finding: even under generous assumptions about hardware efficiency and utilization, we are likely 8–50x short on compute for consumer inference alone in a mature agentic and multimodal world. This is before accounting for enterprise, sovereign, robotics, or training demand.
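To make the shape of that gap concrete, here is a minimal back-of-envelope sketch of how such a demand-versus-supply ratio could be computed. Every input below is an illustrative placeholder of my own choosing rather than a figure from Dratch's analysis, except the ~13 GW installed base quoted in the claims that follow.

```python
# Back-of-envelope sketch of a consumer-inference compute gap.
# All inputs are illustrative placeholders, NOT figures from the analysis,
# except installed_gw (~13 GW), which is quoted in the claims below.

users = 1e9                      # hypothetical: daily consumer users of agentic/multimodal AI
tokens_per_user_per_day = 1e7    # hypothetical: agentic + video/audio workloads run many tokens
joules_per_token = 2.0           # hypothetical: delivered energy per token at the cluster level

demand_watts = users * tokens_per_user_per_day * joules_per_token / 86_400  # average draw
demand_gw = demand_watts / 1e9

installed_gw = 13                # quoted installed base of frontier AI compute

print(f"Implied consumer-inference demand: {demand_gw:.0f} GW")
print(f"Shortfall vs. installed base: {demand_gw / installed_gw:.1f}x")
```

With these particular placeholders the ratio lands in the article's 8–50x range, but the point is the structure of the calculation, not the specific multiple.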
Three claims underpin this framework:

1. Tokens are the kWh of knowledge work. As price falls, users don't ask the same questions more cheaply; they ask orders-of-magnitude richer questions involving tools, memory, video, audio, and sensors.
2. The installed base of frontier AI compute is ~13 GW globally (low 20s by some estimates). The entire 125 GW of existing data center capacity will eventually convert to accelerated compute, and then we'll build beyond that.
3. Cluster-level efficiency is 5–10% of chip specs once you account for MFU, power distribution, and fleet mix (see the sketch after this list).
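As an illustration of claim 3, the sketch below multiplies out the factors that separate a chip's headline spec from delivered cluster throughput. The individual factor values are my own assumptions for illustration; the article only asserts the 5–10% end result.

```python
# Illustrative decomposition of cluster-level efficiency (claim 3).
# Each factor value is an assumed placeholder; only the ~5-10% conclusion
# comes from the analysis being summarized.

peak_spec = 1.0        # normalized headline chip throughput (100%)

mfu = 0.30             # assumed model FLOPs utilization during inference
power_overhead = 0.75  # assumed fraction of facility power reaching accelerators
fleet_mix = 0.40       # assumed share of the fleet on current-generation parts

delivered = peak_spec * mfu * power_overhead * fleet_mix
print(f"Delivered fraction of headline spec: {delivered:.1%}")  # ~9%
```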