This is a living document where I'll track my evolving thoughts on what remains on the path to building generally-intelligent agents. Why does this matter? Three compelling reasons:
Top-down view: AI research papers (and product releases) move bottom-up, starting from what we have right now and improving it incrementally, in the hope that we eventually converge on the end-goal. This is good; that's how concrete progress happens. At the same time, to direct our efforts, it is important to have a top-down view of what we have achieved and which bottlenecks remain on the way to the end-goal. Besides, known unknowns are better than unknown unknowns.
Research prioritisation: I want this post to serve as a personal compass, reminding me which capabilities I believe are most critical for achieving generally-intelligent agents but that we haven't yet figured out. I suspect companies have internal roadmaps for this, but it's good to discuss it in the open as well.
Forecasting AI progress: There is much debate these days about the pace of AI advancement, and for good reason: this question deserves deep consideration. Generally-intelligent agents will be transformative, and both policymakers and society will need to prepare accordingly. Unfortunately, I think AI progress is NOT a smooth exponential that we can extrapolate to make predictions. Instead, the field moves by shattering one (or more) wall(s) every time a new capability gets unlocked. These breakthroughs present themselves as large increases in benchmark performance over a short period of time, but the absolute performance jump on a benchmark provides little information about when the next breakthrough will occur. This is because, for any given capability, it is hard to predict when we will know how to make a model learn it. Still, it is useful to know which capabilities are important and what kinds of breakthroughs are needed to achieve them, so we can form our own views about when to expect each capability. This is why this post is structured as a countdown of capabilities which, as we build them out, will get us to "AGI" as I think about it.
*Framework*
To work backwards from the end-goal, I think it's important to use accurate nomenclature that intuitively defines it. This is why I'm using the term generally-intelligent agents: I think it encapsulates the three qualities we want from "AGI":
Generality: Be useful for as many tasks and fields as possible.
Intelligence: Learn new skills from as few experiences as possible.
Agency: Plan and perform long chains of actions.
Click through to the full post for:
Introduction
…. Framework
…. AI 2024 - Generality of Knowledge
Part I. The Frontier: General Agents
…. Reasoning: Algorithmic vs Bayesian
…. Information Seeking
…. Tool-use
…. Towards year-long action horizons
…. …. Long-horizon Input: The Need for Memory
…. …. Long-horizon Output
…. Multi-agent systems
Part II. The Future: Generally-Intelligent Agents [TBA]