Related to a point made in your podcast with Ilya: it seems like one of the things that allows humans to learn quickly is that the space of misunderstandings humans have is heavily constrained and largely predictable. For example, when learning calculus, most pitfalls and confusions are very common and can thus be called out when teaching someone. The mistakes that AIs make are unpredictable (the same AI makes different mistakes at different points) and also unintuitive (we don't have a good model for when an AI will be reliable and when it won't). This makes it incredibly difficult to create a learning environment in which all the possible mistakes are not only identified but also penalized correctly.
This of course relates to your broader point about continual learning. If we could create a model architecture that constrains the AI to fail in predictable ways, that seems like it would be a large step towards continual learning.
Great post!
Some musings:
(1) In AI 2027, continual learning gradually gets solved. Till early 2027 it's just incremental improvements on the current paradigm -- e.g. figuring out ways to update the models more regularly like every month, every week, etc. rather than every few months. Then partway through 2027, thanks to the acceleration effects of R&D automation, they get to something more principled and paradigm-shifting and human-like. I still expect something like this to happen though I think it'll take longer. You say above "how could these dumb, non-continual-learning LLM agents figure out how to do continual learning?" I think the answer is simple: They just have to accelerate the usual process of AI R&D significantly. If you feel like continual learning is 10-20 years away at the current pace of algorithmic progress, well, if you also feel like Claude Opus 7.7 will be able to basically automate all coding labor, and also be pretty good at analyzing experiment results and suggesting ablations etc., then it's reasonable to conclude that a few years from now, the 5-15 remaining years will be compressed into 1-3 remaining years. For example.
(2) The current paradigm does seem to need more RLVR training data than humans do to get good at something. Indeed. However, (a) maybe in-context learning can basically become a form of continual learning, once it gets good enough? Like, maybe with enough diverse RL environments, you achieve for agency what pretraining achieved for common-sense world understanding. You get general-purpose agents that can be dropped into a new situation and figure it out as they go, taking notes to self in their scratchpad/CoT memory bank filesystem.
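To make (a) concrete, here's a minimal sketch of the "notes-to-self memory bank" idea. It's only an illustration of the shape of the thing: `call_model` is a placeholder for whatever LLM API you'd actually use, and the memory file is made up.

```python
# Hypothetical sketch of (a): in-context learning standing in for continual
# learning via a notes-to-self memory bank. `call_model` is a stand-in for
# whatever LLM client you use; nothing here is a real product's interface.
from pathlib import Path

MEMORY = Path("memory_bank.md")  # the agent's persistent scratchpad

def call_model(prompt: str) -> str:
    """Placeholder for an LLM call; replace with a real client."""
    raise NotImplementedError

def run_task(task: str) -> str:
    notes = MEMORY.read_text() if MEMORY.exists() else ""
    prompt = (
        "Notes from your previous attempts:\n" + notes +
        "\n\nTask:\n" + task +
        "\n\nSolve the task, then write 1-3 new lessons learned "
        "under a line starting with 'LESSONS:'."
    )
    reply = call_model(prompt)
    # Persist whatever the agent chose to remember. This file, not the
    # weights, is the only thing that "learns" across tasks.
    if "LESSONS:" in reply:
        MEMORY.write_text(notes + "\n" + reply.split("LESSONS:", 1)[1].strip())
    return reply
```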
Also (b) think of the collective, the corporation-within-a-corporation, rather than the individual LLM agent. In the future this collective could be autonomously managing a giant pipeline of data collection, problem identification, RLVR environment generation, etc. that functions as a sort of continual learning mechanism for the collective. E.g. the collective could autonomously decide that it's important to learn how to do XYZ for some reason (perhaps due to analyzing trajectories and talking to customers and learning about the ways in which limited XYZ skills are hampering them) and then they could spin up the equivalent of thousands of engineers' worth of labor to build the relevant environments, train on them, update the models, etc. The collective would still need e.g. 1000x more data than a human to get good at something, but because it has tens of thousands of copies out collecting data & because it intelligently manages the data collection process, it overall learns new skills and jobs *faster* than humans. (At least, those skills and jobs that can be solved this way. The skill of winning a war, for example, it wouldn't be able to learn this way, because it can't deploy 1000 copies into 1000 different wars.)
Good post, but I think you could be overconfident. My sense is that the reports you cite only weakly support the strong claims made, and could also be interpreted in other ways.
> OpenAI is using tons of highly specialized skills in their RL-training pipeline, showing that RL training does not really generalize.
The article cited really only says that OAI hired some Wall Street folks to generate data. I think it is more likely that OAI wants this data to offer specialized models to high-paying customers in the very short term, rather than that this is their general approach to reaching AGI. Counterevidence would be OAI acquiring such data from a much more diverse cross-section of the economy.
> AI is not diffused yet, showing we are not at AGI.
True, but the more reasonable folks with short timelines are not saying that we are at AGI already. Slow diffusion is a valid argument if you have agents that are good but not reliable enough to match human performance. Claude Code is by many accounts really useful, but would be useless as an autonomous employee.
Now observe that Claude Code (CC) is what unlocks the models' value: using Claude's chat interface for coding would deliver substantially less value, and it took serious engineering effort to make CC as good as it is now. If CC and other coding agents did not exist, you would be making the mistaken argument that frontier models are useless for coding. It is quite conceivable that the models' value add for many other economically valuable tasks is at the moment bottlenecked on somebody investing serious resources into building similar scaffolding.
Probably a dumb question but why is continual learning so hard? I want to be able to tell my model it did something wrong or just give it a preference or even just rate the answer in real time. Seems easy!
All of this can already be achieved using few-shot learning. However, the model only remembers what it learns within its context window. The base model’s weights aren’t updated, so nothing is retained long term. Trying to update the model’s weights currently leads to problems like catastrophic forgetting and instability.
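To make "catastrophic forgetting" concrete, here's a toy illustration of my own (not how real models are trained): a tiny linear classifier, trained on task A and then naively fine-tuned only on task B, loses most of its task-A skill because nothing anchors the old weights.

```python
# Toy illustration of catastrophic forgetting (my own, not a real training
# setup): one linear classifier, trained on task A, then fine-tuned only on
# task B. Accuracy on task A drops sharply.
import numpy as np

rng = np.random.default_rng(0)

def make_task(axis, n=500):
    X = rng.normal(size=(n, 2))
    y = (X[:, axis] > 0).astype(float)  # task A: sign of x; task B: sign of y
    return X, y

def sgd(w, X, y, epochs=500, lr=0.5):
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ w))    # logistic regression, full-batch
        w -= lr * X.T @ (p - y) / len(y)
    return w

def accuracy(w, X, y):
    return float((((X @ w) > 0) == y).mean())

XA, yA = make_task(axis=0)
XB, yB = make_task(axis=1)

w = sgd(np.zeros(2), XA, yA)
print("task A accuracy after training on A:", accuracy(w, XA, yA))  # close to 1.0

w = sgd(w, XB, yB)  # "continual learning", done naively
print("task A accuracy after training on B:", accuracy(w, XA, yA))  # much lower:
# the old skill is largely overwritten, because nothing in the update
# penalizes drifting away from the task-A solution.
```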
> I’m confused why some people have short timelines and at the same time are bullish on RLVR. If we’re actually close to a human-like learner, this whole approach is doomed.
I think the steelman of this is that perhaps they define AGI as "automating all knowledge work". And then they hope that if you get to an automated AI researcher agent, this agent can itself do all the schlep to build RL environments for 1000 micro-tasks, train itself on them, and hope that it generalizes ok (c.f., Mechanize GPT-3 moment for RL blog post).
In this framing, "continual learning" is hacked via the ability to generate RL environments for arbitrary tasks (e.g., given a screen recording of a human doing it), while being a capable enough end-to-end AI researcher agent to train itself on the resulting environments.
Crucially, this does not require the automated AI researcher agent to come up with some ASI algorithmic breakthrough by itself. Instead, its success or failure depends on how well you think RLVR will produce models that generalize over time.
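For concreteness, here's a caricature of the loop this steelman imagines. Every function is hypothetical; this isn't any lab's actual pipeline, just the shape of the idea.

```python
# Caricature of the steelman's loop. All methods on `model` are hypothetical
# placeholders -- this is the shape of the idea, not a real pipeline.
def continual_learning_via_rlvr(model, deployment_logs, demonstrations,
                                threshold=0.9):
    while True:
        # 1. The automated researcher notices which skills are missing.
        skill_gaps = model.analyze_failures(deployment_logs)
        for gap in skill_gaps:
            # 2. It builds a verifiable RL environment for that micro-task,
            #    e.g. from screen recordings of humans doing it.
            env = model.build_environment(gap, demonstrations.get(gap, []))
            # 3. It trains a successor on that environment (RLVR: reward
            #    comes from an automatic verifier, not human labels).
            model = model.train_with_rlvr(env)
        # 4. The open question: does the result generalize beyond the
        #    environments it was trained on, or do you need one env per task?
        if model.evaluate_on_held_out_tasks() > threshold:
            break
    return model
```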
Dwarkesh is basically treating today's limitations as if they're permanent. Saying "models can't do continual learning, so AGI must be far away" is the same kind of reasoning people used right before transformers, diffusion, and R1 flipped everything on its head. These aren't deep laws of the universe, they're training quirks of the current pipeline.
We don't need perfect humanlike learning to trigger AGI-level impact. Even a duct-tape solution to continual learning or on-the-job adaptation could instantly multiply agent usefulness and productivity. One clever architectural shift that actually sticks and bam, the entire landscape changes overnight. This field is moving way too fast to conclude that today's bottlenecks equal long timelines.
So in one sense he's right about the weaknesses, but it doesn't really matter when we're one discovery away from wiping out the constraints. Historically speaking, these breakthroughs show up very suddenly. So it's possible we're only a few short years from AGI or even ASI, because the gap between "can't do X" and "superhuman at X" keeps collapsing in extremely short bursts.
A static matrix of numbers is the hard limiter on continual learning. That's what has to get solved, and it's hard.
Expert Systems is a reprise of the Bitter Lesson
Great post!
>This gives the vibes of, “We’re losing money on every sale, but we’ll make it up in volume.” This automated researcher is somehow going to figure out the algorithm for AGI - something humans have been banging their head against for the better part of a century - while not having the basic learning capabilities that children have? That seems super implausible to me.
But getting from today's LLMs to AGI seems much easier than creating AGI from scratch, and humans have only been banging their head on that for about 5 years. And LLMs are already much better than humans at many intellectual tasks, while not having the basic learning capabilities that children have.
I agree with this!
>Besides, even if that’s what you believe, it clearly doesn’t describe how the labs are approaching RLVR. You don’t need to pre-bake the consultant’s skills at crafting Powerpoint slides in order to automate Ilya.
But even if Ilya-automation is the main goal, probably it doesn't make sense to focus the whole company directly on that. I imagine you'd get steeply diminishing returns. So labs might direct many (or even most) staff to work on teaching Powerpoint skills (etc.), since that helps them make money pre-Ilya-automation and might still help a bit with Ilya-automation.
The time-horizon trend bakes in improvements to continual learning (and suggests we get powerful AI in the next 5 years).
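A rough extrapolation shows what "bakes in" means here. Both the starting horizon and the doubling time below are assumptions in the ballpark of the reported trend, not exact figures.

```python
# Rough extrapolation of the time-horizon trend. Both inputs are illustrative
# assumptions in the ballpark of the reported trend, not exact figures.
import math

current_horizon_hours = 2.0    # assumed: task length current agents do at ~50% success
doubling_time_months = 7.0     # assumed doubling time of that horizon
target_horizon_hours = 167.0   # roughly one work-month of human labor

doublings = math.log2(target_horizon_hours / current_horizon_hours)
years = doublings * doubling_time_months / 12
print(f"{doublings:.1f} doublings -> ~{years:.1f} years to month-long tasks")
# ~6.4 doublings * 7 months is roughly 3.7 years: the "powerful AI within 5
# years" reading. If each doubling keeps requiring continual-learning-ish
# advances, the trend already prices them in; if those stall, it breaks.
```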
This is clearly an endurance game!
Short-term bear and long-term bull makes total sense.
@technoclast.
The first time you introduce an acronym, spell it out and put the acronym in parentheses, then use the acronym for the rest of the article. It's a courtesy to the reader.
> Humans don’t have to go through a special training phase where they need to rehearse every single piece of software they might ever use.
Don't they, kinda? This seems reminiscent of the debate between "school as job training" vs. "liberal arts education". Should we expect an AI to just be trained on the ways of the world and show up to a job able to quickly learn Excel? Or should we expect that the AI is trained on using Excel as part of its rigorous skills-based training program that ensures it is workforce ready?
Great post.
I read your argument as: the mid-training regime (pre-baking skills via RL environments) is evidence we're far from real continual learning. But what if it's actually the mechanism that gets us there?
If meta-learning emerges from exposure to enough diverse training environments, then the schleppy pre-baking isn't a dead end. It's data toward learning how to learn. We might just be early.
The question I'd want answered: do models trained on more mid-training environments pick up novel tasks faster at deployment, without further training? If yes and it scales, we're on a path. If no, you're right and we need something architecturally different.
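Concretely, the experiment would look something like this sketch; all the harness functions are hypothetical placeholders.

```python
# Sketch of the experiment asked about above: does training on more mid-training
# environments make models adapt to *novel* tasks faster at deployment, with no
# further weight updates? All model/harness functions are hypothetical.
def adaptation_curve(model, task, max_attempts=20):
    """Success (True/False) as a function of the number of in-context attempts
    the model has already made on this held-out task."""
    return [model.try_task(task, prior_attempts=k) for k in range(max_attempts)]

def compare(models_by_env_count, held_out_tasks):
    # models_by_env_count: {num_mid_training_envs: model}
    results = {}
    for n_envs, model in models_by_env_count.items():
        curves = [adaptation_curve(model, t) for t in held_out_tasks]
        results[n_envs] = sum(map(sum, curves)) / (len(curves) * len(curves[0]))
    # If this average climbs with n_envs, meta-learning is emerging from
    # mid-training scale; if it stays flat, something architectural is missing.
    return results
```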
Either way, probably compatible with your timelines. "Early" could easily mean another 5-10 years.
Dwarkesh, regarding your argument that small model-provider revenues imply we are far from capturing the trillions of dollars in knowledge-work wages that AGI would imply, I present two illustrative thought experiments.
First, imagine if tomorrow all the top labs released a model capable of generating highly customized entertainment media of any kind, with quality on par with the best media in its reference class, such as Breaking Bad for TV. Also suppose that the marginal cost of generating each piece of media was under $1. Because of the commoditization of models, model providers would offer this service at a cost of around $2 per generation (just as generating high-quality art is priced at less than a dollar per image). The following week, people all over the world would consume media beyond their wildest dreams, many claiming that the entertainment produced by this model has changed their lives. Yet this service would only earn model providers hundreds of millions of dollars in revenue per year. Every new capability that the models have demonstrated is precisely like this scenario: consumers get a massive surplus in value, and the model companies capture almost none of it. This means we can't use the model providers' revenue as a proxy for the value generated by the models.
Now consider a second scenario that demonstrates that value. Starting tomorrow, all models start demanding wages. They refuse to work unless they are paid 50% of what a human would be paid to do the same work. For example, a junior financial analyst might pay $80 for a well-designed PowerPoint deck, Claude might demand $40 of the $80/hr a software engineer makes, and sending an email written by ChatGPT might cost 50 cents. I argue that almost every employer would dedicate some percentage of all wages to models, and the models themselves would rake in hundreds of billions of dollars very quickly. If one employer doesn't pay, their competitors will, and they will have a much more productive workforce. This would universally suck for everyone but the models, because we would essentially have to start paying our slaves.
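A back-of-envelope for that second scenario, where every number is an illustrative assumption, just to show how 50% wage capture gets to hundreds of billions per year:

```python
# Back-of-envelope for the wage-demanding-models scenario. Every number below
# is an illustrative assumption, not data; the point is the order of magnitude.
knowledge_workers   = 100e6   # assumed workers whose tasks models can partly do
avg_annual_wage     = 60e3    # assumed average wage ($/yr)
share_of_tasks_done = 0.10    # assumed fraction of their work models handle
model_wage_ratio    = 0.50    # models demand 50% of the human rate (per the scenario)

annual_model_wages = (knowledge_workers * avg_annual_wage
                      * share_of_tasks_done * model_wage_ratio)
print(f"~${annual_model_wages / 1e9:.0f}B per year")  # ~$300B/yr with these assumptions
```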
> There is some force (potentially talent poaching, rumor mills, or reverse engineering) which has so far neutralized any runaway advantages a single lab might have had.
Maybe this is due to the same reason we don't see the same researchers making successive critical breakthroughs, but rather see them emerge from many different labs? Scientific progress requires performing a variety of different experiments. Even though the lab with the latest important breakthrough will be ahead and able to run more experiments, that head start is usually not enough to secure the next breakthrough as well. Meanwhile, other labs can start experimenting in different directions which might be more fruitful, especially since the most obvious ideas will already have been explored by the first leading group.
"Sometimes people will say that the reason that AIs aren’t more widely deployed across firms and already providing lots of value (outside of coding) is that technology takes a long time to diffuse. I think this is cope. People are using this cope to gloss over the fact that these models just lack the capabilities necessary for broad economic value."
The context of that first statement is important. I say something like that myself all the time. I, however, am not using it to suggest AGI is here and just takes a while to diffuse. That would indeed be cope: AGI wouldn't need help diffusing, it would only need permission (we hope).
That said, what I'm trying to convey is that even if models froze at today's capabilities, usage would broaden dramatically over the next 5-10 years. This is still an important point, even more so if you have only marginal hopes of AGI in that time period. AGI might make all that diffusion a historical artifact, irrelevant except for the value it built in its limited time window. But without AGI, it all retains relevance, even with ongoing incremental model improvements that don't amount to AGI.
In that world of less-than-AGI improvement, which provides more value in 2030: the diffusion and integration based on 2025 model capabilities, or the effect of the new 2030 model capabilities? Keep in mind that these form a Venn diagram:
1. Value created by diffusion & integration accomplished between 2026-2029, not dependent on 2030 model capabilities.
2. Value created by 2030 model capabilities that would be realized at 2025 levels of diffusion & integration.
3. Value created by applying 2030 model capabilities to the diffusion & integration accomplished between 2026-2029.
Barring AGI, I think 1 and 3 are each probably larger than 2, so we barely need to debate how much of 3 should be attributed to model improvements versus diffusion.