Source: The Register
AI superintelligence is a Silicon Valley fantasy, Ai2 researcher says
You want artificial general intelligence (AGI)? Current-day processors aren’t powerful enough to make it happen, and our ability to scale them up may soon come to an end, argues well-known researcher Tim Dettmers.
“The thinking around AGI and superintelligence is not just optimistic, but fundamentally flawed,” the Allen Institute research scientist and Carnegie Mellon University assistant professor writes in a recent blog post. Dettmers defines AGI as an intelligence that can do all things humans can do, including economically meaningful physical tasks.
The problem, he explains, is that most of the discussion around AGI is philosophical. But, at the end of the day, it has to run on something. And while many would like to believe that GPUs are still getting faster and more capable, Dettmers predicts that we’re rapidly approaching a wall.
“We have maybe one, maybe two more years of scaling left [before] further improvements become physically infeasible,” he wrote.
This is because AI infrastructure is no longer advancing quickly enough to keep pace with the exponentially growing resources needed to deliver linear improvements in models.
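To put rough numbers on that trade-off: under a power-law scaling relationship, each equal step of quality improvement requires multiplicatively more compute. Here's a minimal Python sketch of that arithmetic, using an illustrative exponent rather than any figure from Dettmers' post:

```python
# A minimal sketch of the scaling arithmetic Dettmers is pointing at: if model
# quality follows a power law in compute, every equal step of improvement costs
# multiplicatively more compute. The exponent is a hypothetical placeholder,
# not a number from his post.

ALPHA = 0.1  # assumed exponent: loss ~ compute ** (-ALPHA)

def compute_multiplier(loss_ratio: float, alpha: float = ALPHA) -> float:
    """Compute multiplier needed to shrink loss by loss_ratio (e.g. 0.9 = a 10% cut)."""
    # loss2 / loss1 = (c2 / c1) ** (-alpha)  =>  c2 / c1 = (loss2 / loss1) ** (-1 / alpha)
    return loss_ratio ** (-1.0 / alpha)

compute = 1.0
for step in range(1, 6):
    compute *= compute_multiplier(0.9)  # each step buys another 10% loss reduction
    print(f"step {step}: ~{compute:,.0f}x the original compute budget")
```

Even with this modest placeholder exponent, five equal improvement steps balloon to roughly 200x the original compute, which is the mismatch between hardware progress and model progress Dettmers is describing.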
“GPUs maxed out in performance per cost around 2018 — after that, we added one-off features that exhaust quickly,” he explained.
Most of the performance gains we’ve seen over the past seven years have come from things like lower precision data types and tensor cores — BF16 in Nvidia’s Ampere, FP8 in Hopper, and FP4 in Blackwell.
These improvements delivered sizable leaps in performance, effectively doubling computational throughput each time the precision was halved. However, if you just look at the computational grunt of these accelerators gen-on-gen, the gains aren’t nearly as large as Nvidia and others make them out to be.
In the jump from Nvidia’s Ampere to Hopper generation, BF16 performance increased 3x while power increased 1.7x. Meanwhile, in the jump from Hopper to Nvidia’s latest-gen Blackwell parts, performance increased 2.5x, but required twice the die area and 1.7x the power to do it.
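Dividing those headline figures through by the extra power and silicon makes the point plainer. A quick back-of-the-envelope sketch in Python (the article gives no die-area figure for the Ampere-to-Hopper jump, so it's assumed roughly flat here purely for illustration):

```python
# Normalize the gen-on-gen uplifts cited above by power and die area to see the
# underlying efficiency gain. Perf and power multipliers come from the article;
# the Ampere-to-Hopper die-area multiplier is an assumption (treated as ~flat).

transitions = {
    # name: (perf multiplier, power multiplier, die-area multiplier)
    "Ampere -> Hopper (BF16)": (3.0, 1.7, 1.0),
    "Hopper -> Blackwell":     (2.5, 1.7, 2.0),
}

for name, (perf, power, area) in transitions.items():
    print(f"{name}: {perf / power:.2f}x perf per watt, {perf / area:.2f}x perf per unit of die area")
```

On those assumptions, the efficiency gain shrinks to roughly 1.8x per watt for Hopper and about 1.5x per watt (and 1.25x per unit of silicon) for Blackwell, a far cry from the headline multiples.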
Dettmers contends that individual GPUs are rapidly approaching their limits, and that advances in how we stitch them together will buy us a few years at most.
Nvidia’s GB200 NVL72, for example, increased the number of accelerators in a compute domain from eight GPUs in a 10U box to 72 in a rack-scale system, allowing the company to deliver a 30x uplift in inference performance and a 4x jump in training performance over a similarly equipped Hopper machine.
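Normalizing by accelerator count puts that rack-level claim in perspective. A rough sketch, assuming the Hopper baseline is an eight-GPU box (the article doesn't spell out what "similarly equipped" means):

```python
# Rough normalization of the NVL72 inference claim by accelerator count,
# assuming the Hopper baseline is an eight-GPU system (an assumption; the
# article's "similarly equipped" comparison isn't precise about this).

nvl72_gpus, hopper_gpus = 72, 8
inference_uplift = 30.0  # headline system-level uplift cited above

gpu_ratio = nvl72_gpus / hopper_gpus            # 9x more accelerators per compute domain
per_gpu_uplift = inference_uplift / gpu_ratio   # what's left after accounting for GPU count

print(f"{gpu_ratio:.0f}x more GPUs per domain, ~{per_gpu_uplift:.1f}x uplift per GPU on that assumption")
```

On that reading, much of the headline gain comes from packing more accelerators into one compute domain rather than from the chips themselves, which is exactly the rack-level optimization Dettmers expects to run out.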
“The only way to gain an advantage is by having slightly better rack-level hardware optimizations, but that will also run out quickly — maybe 2026, maybe 2027,” he wrote.
Despite this, Dettmers doesn’t consider the hundreds of billions of dollars being plowed into AI infrastructure today unreasonable.
The growth of inference use, he argues, merits the investment. However, he does note that, if model improvements don’t keep up, that hardware could become a liability.
Yet rather than focus on building useful and economically valuable forms of AI, US AI labs remain convinced that whoever builds AGI first will win the AI arms race. Dettmers argues this is a short-sighted perspective.

