Related to a point made in your podcast with Ilya: it seems like one of the things that allows humans to learn quickly is that the space of misunderstandings humans have is heavily constrained and largely predictable. For example, when learning calculus, most pitfalls/confusions are very common and can thus be called out when teaching someone. The mistakes that AIs make are unpredictable (the same AI makes different mistakes at different points) and also unintuitive (we don't have a good model for when an AI will be reliable and when it won't). This makes it incredibly difficult to create a learning environment in which all the possible mistakes are not only identified but also penalized correctly.
This of course relates to your broader point about continual learning. If we could create a model architecture that constrains the AI to fail in predictable ways, that would be a large step towards continual learning.
Good post, but I think you could be overconfident. My sense is that the reports you cite only weakly support the strong claims made, and could also be interpreted in other ways.
> OpenAI is using tons of highly specialized skills in their RL-training pipeline, showing that RL training does not really generalize.
The article cited really only says that OAI hired some Wall Street folks to generate data. I think it is more likely that OAI wants to use this data to offer specialized models to high-paying customers in the very short term than that this reflects their general approach to reaching AGI. Counterevidence would be OAI acquiring such data from a much more diverse cross-section of the economy.
> AI is not diffused yet, showing we are not at AGI.
True, but the more reasonable folks with short timelines are not saying that we are at AGI already. Slow diffusion is what you would expect if you have agents that are good but not reliable enough to match human performance. Claude Code is by many accounts really useful, but would be useless as an autonomous employee.
Now observe that CC is what unlocks the models' value: using Claude's chat interface for coding would substantially reduce the value add, and it took serious engineering effort to make CC as good as it is now. If CC and other coding agents did not exist, you would be making the mistaken argument that frontier models are useless for coding. It is quite conceivable that the models' value add for many other economically valuable tasks is at the moment bottlenecked on somebody investing serious resources into building such scaffolding.
Good post, very thought-provoking. Thanks D 💚 🥃
Great post.
I read your argument as: the mid-training regime (pre-baking skills via RL environments) is evidence we're far from real continual learning. But what if it's actually the mechanism that gets us there?
If meta-learning emerges from exposure to enough diverse training environments, then the schleppy pre-baking isn't a dead end. It's data toward learning how to learn. We might just be early.
The question I'd want answered: do models trained on more mid-training environments pick up novel tasks faster at deployment, without further training? If yes and it scales, we're on a path. If no, you're right and we need something architecturally different.
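Concretely, the experiment could look something like the rough outline below (all function names here are placeholders I made up, not any real training harness or benchmark):

```python
# A rough protocol sketch (placeholder functions throughout; nothing here is a
# real harness or API): fix the base model, vary how many mid-training RL
# environments it sees, then measure few-shot pickup of held-out tasks with
# weights frozen at evaluation time.

def mid_train(base_checkpoint: str, num_envs: int):
    """Stand-in for RL mid-training on `num_envs` environments."""
    return {"base": base_checkpoint, "envs_seen": num_envs}  # placeholder artifact

def few_shot_success_rate(model, held_out_tasks) -> float:
    """Stand-in for measuring in-context adaptation only (no weight updates)."""
    return 0.0  # placeholder; a real harness would return a measured success rate

ENV_COUNTS = [10, 100, 1_000, 10_000]
HELD_OUT_TASKS = ["tasks disjoint from every training environment"]

results = {k: few_shot_success_rate(mid_train("pretrained-base", k), HELD_OUT_TASKS)
           for k in ENV_COUNTS}

# If the curve over ENV_COUNTS climbs predictably, the schleppy environment-
# building is plausibly buying "learning to learn"; if it stays flat, something
# more architectural is probably missing.
```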
Either way, probably compatible with your timelines. "Early" could easily mean another 5-10 years.
Writing comments on some elements of this post:
> RL scaling is laundering the prestige of pretraining scaling
> With pretraining, we had this extremely clean and general trend in improvement in loss across multiple orders of magnitude of compute (albeit on a power law, which is as weak as exponential growth is strong).
I'd correct it to "we have this extremely clean and general trend in improvement in loss across multiple orders of magnitude of compute (albeit on a power law, which is as weak as exponential growth is strong)."
This is because the flywheel of compute and data for pre-training is still going strong, and pre-training has kept delivering roughly the returns we have come to expect.
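To make concrete just how weak a power law is, here's a minimal back-of-the-envelope sketch; the exponent is an illustrative Kaplan-style value I picked, not a number from the post:

```python
# Minimal sketch of why a power law in compute is "as weak as exponential
# growth is strong". Assumes a Kaplan-style fit L(C) = L_inf + A * C**(-alpha);
# alpha = 0.05 is an illustrative value, not a figure cited anywhere above.
alpha = 0.05

# Compute multiplier needed to halve the *reducible* loss (L - L_inf):
# (C2 / C1) ** (-alpha) = 1/2  =>  C2 / C1 = 2 ** (1 / alpha)
multiplier = 2 ** (1 / alpha)
print(f"~{multiplier:.1e}x more compute per halving of reducible loss")
# ~1.0e+06x: each fixed-factor gain in loss costs exponentially more compute.
```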
But I agree that current RL scales worse than pre-training. The big reason RLVR currently works as well as it does is that, in coding and other easily verified tasks, it's surprisingly easy to get big wins with comparatively little absolute compute. That free lunch will end, and RLVR will be a more minor factor after 2026-2027.
That said, post 2028-2030, yeah, it's likely that political headwinds and investor sentiment will turn negative and pre-training progress will slow enough to prevent LLMs from scaling to AGI.
> There’ll also probably be diminishing returns from learning-from-deployment. Each of the first 1000 consultant agents are each learning a ton from deployment. Less so the next 1000. And is there such a long tail to consultant work that the millionth deployed instance is likely to see something super important the other 999,999 instances missed? In fact, I wouldn’t be surprised if continual learning also ends up leading to a power law, but with respect to the number of instances deployed.
For what it's worth, I agree that there might not be a super-long tail to consulting, but one of the lessons we should take from this era of AI progress is that lots of jobs have pretty long tails, and there's a big difference between being able to fully replace a human and merely complementing one.
So for jobs like AI research, or even physical jobs, I do expect very long/heavy-tailed impact, meaning that even the billionth instance/copy of a researcher will still see important things that the other 999,999,999 copies didn't.
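As a toy illustration of both points (diminishing returns per extra instance, but a tail that never quite runs out), here's a minimal simulation; the Zipf exponent and workload sizes are made-up assumptions, not estimates of any real job:

```python
# Toy model: if the situations deployed agents encounter are Zipf-distributed,
# the number of *new* situations per additional instance decays roughly as a
# power law, yet never quite hits zero (the long tail).
import numpy as np

rng = np.random.default_rng(0)
situations_per_instance = 100          # hypothetical workload per instance
instance_counts = [10, 100, 1_000, 10_000]

draws = rng.zipf(a=1.3, size=max(instance_counts) * situations_per_instance)
for n in instance_counts:
    distinct = len(np.unique(draws[: n * situations_per_instance]))
    print(f"{n:>6} instances -> {distinct:>7} distinct situations seen")
# Distinct situations keep growing (the long tail), but each extra instance
# contributes fewer novel ones (diminishing, roughly power-law returns).
```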
> Besides, I just have some prior that competition will stay fierce, informed by the observation that all these previous supposed flywheels (user engagement on chat, synthetic data, etc) have done very little to diminish the greater and greater competition between model companies. Every month (or less), the big three will rotate around the podium, with other competitors not that far behind. There is some force (potentially talent poaching, rumor mills, or reverse engineering) which has so far neutralized any runaway advantages a single lab might have had.
A big part of that is that algorithmic progress has so far mostly been data progress available to everyone, and since more compute lifts all boats, we have so far been in a regime where advances spread broadly and don't compound within individual labs.
A key question here is whether this changes once AIs are as good as the median human AI researcher at doing algorithmic progress.
> “Solving” continual learning won’t be a singular one-and-done achievement. Instead, it will feel like solving in context learning. GPT-3 demonstrated that in context learning could be very powerful (its ICL capabilities were so remarkable that the title of the GPT-3 paper is ‘Language Models are Few-Shot Learners’). But of course, we didn’t “solve” in-context learning when GPT-3 came out - and indeed there’s plenty of progress still to be made, from comprehension to context length. I expect a similar progression with continual learning. Labs will probably release something next year which they call continual learning, and which will in fact count as progress towards continual learning. But human level continual learning may take another 5 to 10 years of further progress.
> This is why I don’t expect some kind of runaway gains to the first model that cracks continual learning, thus getting more and more widely deployed and capable. If you had fully solved continual learning drop out of nowhere, then sure, it’s “game set match”, as Satya put it. But that’s not what’s going to happen. Instead, some lab is going to figure out how to get some initial traction on the problem. Playing around with this feature will make it clear how it was implemented, and the other labs will soon replicate this breakthrough and improve it slightly.
I largely agree with this, with the caveat that achieving 100% automation of a field as complicated as AI research (here the standard is being able to replace a median human AI researcher, or even the best human researchers, at all tasks) is likely massively more valuable than automating 90-99% of the job, meaning that in practice there will be a threshold for how good continual learning has to be:
https://www.lesswrong.com/posts/Nbcs5Fe2cxQuzje4K/value-of-the-long-tail
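A quick way to see the size of that threshold effect is a minimal Amdahl's-law-style sketch (my framing, not necessarily the linked post's):

```python
# As long as a human must handle the residual fraction of the job, that
# residual bounds the overall speedup no matter how fast the automated part is.

def overall_speedup(automated_fraction: float, automation_speedup: float) -> float:
    human_fraction = 1.0 - automated_fraction
    return 1.0 / (human_fraction + automated_fraction / automation_speedup)

for frac in (0.90, 0.99, 0.999):
    print(f"{frac:.1%} automated -> at most ~{overall_speedup(frac, 1e6):.0f}x faster")
# ~10x, ~100x, ~999x respectively: the leftover human share caps the gain,
# and only full automation removes the serial bottleneck.
```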
> People have spent a lot of time talking about a software only singularity (where AI models write the code for a smarter successor system), a software + hardware singularity (where AIs also improve their successor’s computing hardware), or variations therein.
> All these scenarios neglect what I think will be the main driver of further improvements atop AGI: continual learning. Again, think about how humans become more capable at anything. It’s mostly from experience in the relevant domain.
I agree with this, though at a civilizational level population growth matters a lot, and the main claim of the software/hardware singularity scenarios is that we can increase the effective AI population fast enough to accelerate progress on a lot of fronts, and, more speculatively, decrease the amount of compute and data needed to match or exceed a human via algorithmic improvements. I actually agree with that in general, though I currently disagree with a lot of the specific scenarios that have been proposed.
> The agents themselves could be quite specialized - containing what Karpathy called “the cognitive core” plus knowledge and skills relevant to the job they’re being deployed to do.
I'll just copy a restack note I made, since I already commented on this issue:
"This is the only part of the post that I disagree with, as I generally disagree with the assumption that a small cognitive core is better than the large knowledge base gained from pre-training, all else equal, and one of the takeaways of the bitter lesson that I think will still be justified even in the post-LLM era of RL/neuralese AIs is that large amounts of data/compute is still good, all else equal.
A potential crux is I expect pre-training to plateau at a higher capabilities level than people expect, and I think there are a couple of doublings left until 2031-2033, which would correspond to multiple months on the 50% line, which requires AIs to complete 50% of tasks, and at least 1 day, if not several days-weeks on the 80% line, which is close limit before you run into issues of noise and impossible benchmark problems that means 80% will be closer to 90-99%."
An underrated scenario for what this could look like is Gordon Seidoh Worley's post on the Autofac era at his blog, Uncertain Updates, which is plausible conditional on people not resisting automation very much because of UBI.
https://substack.com/home/post/p-174462433