I didn’t fully understand the technical details in this analysis, but Matt Dratch makes the case that despite the massive AI buildup, we’re still massively short on compute:

The core finding: even under generous assumptions about hardware efficiency and utilization, we are likely 8–50x short on compute for consumer inference alone in a mature agentic and multimodal world. This is before accounting for enterprise, sovereign, robotics, or training demand.

Three claims underpin this framework:

1. Tokens are the kWh of knowledge work. As price falls, users don’t ask the same questions more cheaply; they ask orders-of-magnitude richer questions involving tools, memory, video, audio, and sensors.

2. The installed base of frontier AI compute is ~13 GW globally (low 20s by some estimates). The entire 125 GW of existing data center capacity will eventually convert to accelerated compute, and then we’ll build beyond that.

3. Cluster-level efficiency is 5–10% of chip specs once you account for MFU, power distribution, and fleet mix (a rough sketch of that arithmetic follows below).
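
To make claim 3 concrete, here’s a back-of-the-envelope sketch. The individual factor values are my own illustrative assumptions, not numbers from Dratch’s analysis; the point is only that a few plausible discounts multiply down into the 5–10% range.

```python
# Back-of-the-envelope sketch of claim 3: stack a few plausible discounts on
# top of headline chip specs and see where effective efficiency lands.
# All factor values are illustrative assumptions, not figures from the analysis.

def cluster_efficiency(mfu: float, power_delivery: float, fleet_mix: float) -> float:
    """Effective throughput as a fraction of headline chip specs."""
    return mfu * power_delivery * fleet_mix

# Pessimistic-ish case (assumed values): ~25% model FLOPs utilization,
# ~72% of facility power reaching the accelerators, and a fleet averaging
# ~30% of current-generation efficiency.
low = cluster_efficiency(mfu=0.25, power_delivery=0.72, fleet_mix=0.30)

# Optimistic-ish case (assumed values).
high = cluster_efficiency(mfu=0.40, power_delivery=0.75, fleet_mix=0.33)

print(f"Effective cluster-level efficiency: {low:.1%} to {high:.1%}")
# -> roughly 5% to 10% of chip-spec throughput, i.e. the range in claim 3
```

None of the individual discounts looks extreme on its own; the punchline is that they multiply, which is how a fleet of impressive-sounding chips ends up delivering a small fraction of its headline specs.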