The path to achieving artificial general intelligence (AGI), AI systems with capabilities at least on par with those of humans in most tasks, remains a topic of debate among scientists. Opinions range from AGI being far away, to it possibly emerging within a decade, to “sparks of AGI” already being visible in current large language models (LLMs). Some researchers even argue that today’s LLMs are AGI.
In an effort to bring clarity to the discussion, a team of scientists at Google DeepMind, including Chief AGI Scientist Shane Legg, has proposed a new framework for classifying the capabilities and behavior of AGI systems and their precursors.
“We argue that it is critical for the AI research community to explicitly reflect on what we mean by ‘AGI,’ and aspire to quantify attributes like the performance, generality, and autonomy of AI systems,” the authors write in their paper.
The principles of AGI
One of the key challenges in AGI research is establishing a clear definition of what AGI entails. In their paper, the DeepMind researchers analyze nine different AGI definitions, including the Turing Test, the Coffee Test, consciousness measures, economic measures, and task-related capabilities. They highlight the shortcomings of each definition in capturing the essence of AGI.
For instance, current LLMs can pass the Turing Test, but generating convincing text alone is clearly insufficient for AGI, as the well-documented shortcomings of today’s language models show. Determining whether machines possess attributes of consciousness remains scientifically elusive. And while failing at certain tasks (e.g., making coffee in an unfamiliar kitchen) may indicate that a system is not AGI, passing them does not necessarily confirm its AGI status.
To provide a more comprehensive framework for AGI, the researchers propose six principles for measuring artificial general intelligence (a rough checklist encoding of these principles follows the list):
- Measures of AGI should focus on capabilities rather than qualities such as human-like understanding, consciousness, or sentience.
- Measures of AGI should consider both generality and performance levels. This ensures that AGI systems are not only capable of performing a wide range of tasks but also excel in their execution.
- AGI should encompass cognitive and meta-cognitive tasks (such as learning new skills), but embodiment and the ability to perform physical tasks should not be considered prerequisites for AGI.
- The potential of a system to perform AGI-level tasks is sufficient, even if it is not deployable. “Requiring deployment as a condition of measuring AGI introduces non-technical hurdles such as legal and social considerations, as well as potential ethical and safety concerns,” the researchers write.
- AGI metrics should focus on real-world tasks that people value, which the researchers describe as “ecologically valid.”
- Lastly, the scientists emphasize that AGI is not a single endpoint but a path, with different levels of AGI along the way.
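
Read as a rubric, these principles can double as a quick checklist for vetting any proposed AGI metric. The Python sketch below does just that; the field names are a paraphrase of the six principles, and the example review of a Turing-Test-style benchmark is hypothetical, offered only to show how such a rubric might be applied.

```python
from dataclasses import dataclass, fields

@dataclass
class MetricReview:
    """Checklist paraphrasing DeepMind's six principles (the field names
    are an illustrative rewording, not terminology from the paper)."""
    measures_capabilities_not_qualities: bool       # principle 1
    tracks_generality_and_performance: bool         # principle 2
    covers_cognitive_and_metacognitive_tasks: bool  # principle 3
    independent_of_deployment: bool                 # principle 4
    ecologically_valid: bool                        # principle 5
    defines_levels_not_single_endpoint: bool        # principle 6

def unmet_principles(review: MetricReview) -> list:
    """Return the names of any principles the proposed metric fails."""
    return [f.name for f in fields(review) if not getattr(review, f.name)]

# Hypothetical review of a Turing-Test-style benchmark: it probes one
# quality (human-likeness) rather than graded, general capabilities.
turing_style = MetricReview(
    measures_capabilities_not_qualities=False,
    tracks_generality_and_performance=False,
    covers_cognitive_and_metacognitive_tasks=True,
    independent_of_deployment=True,
    ecologically_valid=False,
    defines_levels_not_single_endpoint=False,
)
print(unmet_principles(turing_style))
```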
The depth and breadth of intelligence
DeepMind presents a matrix that measures “performance” and “generality” across five levels, ranging from no AI to superhuman AGI, a general AI system that outperforms all humans on all tasks. Performance refers to how an AI system’s capabilities compare to those of humans, while generality denotes the breadth of the AI system’s capabilities, i.e., the range of tasks for which it reaches the specified performance level.
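
To make the matrix concrete, here is a minimal Python sketch of how a performance-by-generality classification might be encoded. The level names follow DeepMind’s scheme, but the percentile cutoffs, the SystemProfile structure, and the example systems are illustrative assumptions rather than details from the paper.

```python
from dataclasses import dataclass
from enum import Enum

class Generality(Enum):
    NARROW = "narrow"    # strong on a specific task or bundle of tasks
    GENERAL = "general"  # strong across a wide range of tasks

# Level names follow DeepMind's proposal; the percentile cutoffs are
# illustrative assumptions, not figures quoted in this article.
LEVELS = [
    ("Level 1: Emerging", 0.0),
    ("Level 2: Competent", 50.0),
    ("Level 3: Expert", 90.0),
    ("Level 4: Virtuoso", 99.0),
    ("Level 5: Superhuman", 100.0),
]

@dataclass
class SystemProfile:
    name: str
    percentile_vs_humans: float  # performance relative to skilled humans
    generality: Generality

def classify(profile: SystemProfile) -> str:
    """Map a (performance, generality) pair onto the highest level reached."""
    label = "Level 0: No AI"
    for name, cutoff in LEVELS:
        if profile.percentile_vs_humans >= cutoff:
            label = name
    return f"{profile.name}: {label} ({profile.generality.value})"

# Hypothetical example systems, for illustration only.
print(classify(SystemProfile("chess engine", 100.0, Generality.NARROW)))
print(classify(SystemProfile("frontier LLM", 55.0, Generality.GENERAL)))
```

In the paper’s framing, performance and generality are assessed over a broad suite of tasks rather than a single number, so this sketch compresses the idea considerably.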