Should it surprise us when AI agents “suddenly acquire new capabilities”? Maybe not, says a study from Stanford researchers.
In “Are Emergent Abilities of Large Language Models a Mirage?” Schaeffer, Miranda, and Koyejo argue that “emergent abilities” (skills that appear suddenly and unexpectedly as LLMs scale up) are more a function of how we measure model ability than of “a ghost in the machine”:
A) Under nonlinear or discontinuous metrics, such as Accuracy or Multiple Choice Grade, models display sharp and unpredictable jumps in performance. This has led many to believe in the sudden emergence of new abilities.
B) Linear or continuous metrics, like Token Edit Distance or Brier Score, reveal a different picture: a smooth, gradual improvement in performance as the model scales.
Example (addition): In the 2022 BIG-bench study, both GPT-3 and LaMDA failed at addition problems at smaller model sizes. But at 13 billion parameters, GPT-3 could suddenly add. (LaMDA, too, at 68 billion parameters.)
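To make the metric argument concrete, here is a toy sketch in Python. It is not the paper's code or data: the model sizes, the 10-token answer length, and the straight-line per-token accuracy curve are all made-up assumptions for illustration. The one idea it borrows is that an all-or-nothing metric like exact-match accuracy requires every token of an answer to be right, so it behaves roughly like per-token accuracy raised to the power of the answer length.

```python
import math

# Illustrative assumptions only (not figures from the paper or BIG-bench):
ANSWER_TOKENS = 10  # hypothetical number of tokens in a correct answer
MODEL_SIZES = [1e8, 1e9, 1e10, 1e11, 1e12, 1e13]  # hypothetical parameter counts


def per_token_accuracy(params: float) -> float:
    """Hypothetical smooth curve: accuracy rises by 0.15 per 10x parameters."""
    return 0.2 + 0.15 * (math.log10(params) - 8)


def exact_match_accuracy(token_accuracy: float, n_tokens: int) -> float:
    """All-or-nothing metric: every token must be right (independence assumed)."""
    return token_accuracy ** n_tokens


# One row per model size: (params, per-token accuracy, exact-match accuracy)
rows = []
for params in MODEL_SIZES:
    p = per_token_accuracy(params)
    rows.append((params, p, exact_match_accuracy(p, ANSWER_TOKENS)))

print(f"{'params':>8}  {'per-token':>9}  {'exact-match':>11}")
for params, p, em in rows:
    print(f"{params:8.0e}  {p:9.2f}  {em:11.4f}")
```

In this toy setup the per-token column climbs in equal steps, while the exact-match column sits near zero for most sizes and then leaps at the largest one. Same underlying model, two metrics, two very different stories.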
Alex Tamkin, a research scientist at Anthropic, countered in Wired: “This is not the full story. We can’t say that all of these jumps are a mirage. I still think the literature shows that even when you have one-step predictions or use continuous metrics, you still have discontinuities, and as you increase the size of your model, you can still see it getting better in a jump-like fashion.”
This isn’t just a technical squabble. It has real implications for how we think about AI agents: Are we more likely to adopt them in our organizations if we see their behavior as more predictable? Maybe.
Does AI gain surprise (“emergent”) capabilities?


Jared Brickman
Jared Brickman is Vice President of the Center of Excellence at leading software investor Insight Partners, where he advises leaders of the firm’s 500+ portfolio companies on how to scale using AI and automation.
