Great post! This is basically how I think about things as well. So why the difference in our timelines then?
--Well, actually, they aren't that different. My median for the intelligence explosion is 2028 now (one year longer than it was when writing AI 2027), which means early 2028 or so for the superhuman coder milestone described in AI 2027, which I'd think roughly corresponds to the "can do taxes end-to-end" milestone you describe as happening by end of 2028 with 50% probability. Maybe that's a little too rough; maybe it's more like month-long horizons instead of week-long. But at the growth rates in horizon lengths that we are seeing and that I'm expecting, that's less than a year...
--So basically it seems like our only serious disagreement is the continual/online learning thing, which you put at 50% by 2032 whereas I'm at 50% by end of 2028. Here, my argument is simple: I think that once you get to the superhuman coder milestone, the pace of algorithmic progress will accelerate, and then you'll reach full AI R&D automation and it'll accelerate further, etc. Basically I think that progress will be much faster than normal around that time, and so innovations like flexible online learning that feel intuitively like they might come in 2032 will instead come later that same year (2028).
(For reference, AI 2027 depicts a gradual transition from today to fully online learning, where the intermediate stages look something like "Every week, and then eventually every day, they stack on another fine-tuning run on additional data, including an increasingly high amount of on-the-job real world data." A janky, unprincipled solution in early 2027 gives way to more elegant and effective things midway through the year.)
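To make that intermediate stage concrete, here's a purely illustrative sketch of the "janky" weekly-then-daily fine-tuning loop; the function names (collect_deployment_data, launch_finetune) are hypothetical placeholders, not anything from AI 2027 itself:

```python
# Illustrative sketch of the intermediate stage: stack a fine-tuning run on
# fresh on-the-job data every week (then every day). All helpers here are
# hypothetical placeholders.
import datetime as dt

def collect_deployment_data(since: dt.datetime) -> list[dict]:
    """Placeholder: gather transcripts, tool calls, and outcomes logged since `since`."""
    return []

def launch_finetune(base_checkpoint: str, data: list[dict]) -> str:
    """Placeholder: kick off a fine-tuning run and return the new checkpoint id."""
    return base_checkpoint + "+ft"

def periodic_update(checkpoint: str, cadence_days: int = 7, cycles: int = 4) -> str:
    last = dt.datetime.now() - dt.timedelta(days=cadence_days)
    for _ in range(cycles):
        batch = collect_deployment_data(since=last)      # increasingly real-world, on-the-job data
        checkpoint = launch_finetune(checkpoint, batch)  # stack another run on top
        last = dt.datetime.now()
    return checkpoint

print(periodic_update("agent-v1", cadence_days=7))   # weekly at first...
print(periodic_update("agent-v1", cadence_days=1))   # ...then daily
```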
Is this too optimistic about contextual learning and deployment? For example, can we reach full R&D automation for self-driving vehicles or self-driving construction trucks simply through code + synthetic data? Those are areas where actual data would be very sparse and difficult to get into good enough shape for training.
I spend a lot of time driving through construction zones, which I take as emblematic of most economic work, even AI research, and it makes me more pessimistic about AI's ability to grok context. In a construction zone, I see so many little nuances that I am unsure how to train into a model.
Take o3 and try to use it to take a chapter of a Latin textbook that is prepping you to read Caesar and rewrite the chapter to prep you to read Pliny instead. It's interesting to me, at least, that it gets lost in the task: it doesn't understand the reasons the textbook is laid out the way it is, and it fails to replicate them even with instructions to, even though it's all right there. It is confused about things like how to scaffold, graduated repetition, familiar vs. unfamiliar vocabulary, what needs to be glossed, and what's grammatically confusing to learners and why. Yes, these are trainable, but only specifically and across many thousands of domains. Reality still has more detail than we give it credit for.
Sometimes I think we do not understand or have forgotten how the economy outside of SV works. And the economy outside of SV is an input to SV, as well as what SV interacts with to provide value.
So my timelines push out another decade.
I agree with much of this post. I also have roughly 2032 medians for things going crazy, I agree learning on the job is very useful, and I'm also skeptical we'd see massive white collar automation without further AI progress.
However, I think Dwarkesh is wrong to suggest that RL fine-tuning can't be qualitatively similar to how humans learn.
In the post, he discusses AIs constructing verifiable RL environments for themselves based on human feedback and then argues this wouldn't be flexible and powerful enough to work, but RL could be used more similarly to how humans learn.
My best guess is that the way humans learn on the job is mostly by noticing when something went well (or poorly) and then sample efficiently updating (with their brain doing something analogous to an RL update). In some cases, this is based on external feedback (e.g. from a coworker) and in some cases it's based on self-verification: the person just looking at the outcome of their actions and then determining if it went well or poorly.
So, you could imagine RL'ing an AI based on both external feedback and self-verification like this. And, this would be a "deliberate, adaptive process" like human learning. Why would this currently work worse than human learning?
Current AIs are worse than humans at two things, which makes RL (quantitatively) much worse for them:
1. Robust self-verification: the ability to correctly determine when you've done something well/poorly in a way which is robust to you optimizing against it.
2. Sample efficiency: how much you learn from each update (potentially leveraging stuff like determining what caused things to go well/poorly which humans certainly take advantage of). This is especially important if you have sparse external feedback.
But, these are more like quantitative than qualitative issues IMO. AIs (and RL methods) are improving at both of these.
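To make the self-verification plus sparse external feedback picture concrete, here's a minimal toy sketch of the kind of loop I have in mind; the policy, the reward weighting, and the update rule are all illustrative placeholders rather than a real training recipe:

```python
# Toy sketch of on-the-job RL with self-verification plus occasional external
# feedback. Everything here (SimplePolicy, the reward weighting, the update
# rule) is an illustrative placeholder, not a real training recipe.
import random

class SimplePolicy:
    """Stands in for the model being updated; here just a bias toward 'careful' actions."""
    def __init__(self):
        self.p_careful = 0.5

    def act(self):
        return "careful" if random.random() < self.p_careful else "rushed"

    def update(self, action, reward, lr=0.05):
        # Crude policy-gradient-flavored update: reinforce actions that got reward.
        direction = 1 if action == "careful" else -1
        self.p_careful = min(1.0, max(0.0, self.p_careful + lr * reward * direction))

def self_verify(action):
    # The agent judges its own outcome; noisy, and gameable if optimized against.
    return 1.0 if action == "careful" else -1.0

def external_feedback(action):
    # Sparse coworker feedback: only arrives occasionally.
    if random.random() < 0.1:
        return 2.0 if action == "careful" else -2.0
    return None

policy = SimplePolicy()
for step in range(1000):
    a = policy.act()
    reward = self_verify(a)
    fb = external_feedback(a)
    if fb is not None:
        reward += fb  # external signal dominates when present
    policy.update(a, reward)

print(f"p(careful) after on-the-job updates: {policy.p_careful:.2f}")
```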
All that said, I think it's very plausible that the route to better continual learning routes more through building on in-context learning (perhaps through something like neuralese, though this would greatly increase misalignment risks...).
Some more quibbles:
- For the exact podcasting tasks Dwarkesh mentions, it really seems like simple fine-tuning mixed with a bit of RL would solve his problem. So, an automated training loop run by the AI could probably work here. This just isn't deployed as an easy-to-use feature.
- For many (IMO most) useful tasks, AIs are limited by something other than "learning on the job". At autonomous software engineering, they fail to match humans with 3 hours of time and they are typically limited by being bad agents or by being generally dumb/confused. To be clear, it seems totally plausible that for podcasting tasks Dwarkesh mentions, learning is the limiting factor.
- Correspondingly, I'd guess the reason that we don't see people trying more complex RL-based continual learning in normal deployments is that there is lower-hanging fruit elsewhere and typically something else is the main blocker. I agree that if you had human-level sample efficiency in learning, this would immediately yield strong results (e.g., you'd have very superhuman AIs with 10^26 FLOP presumably); I'm just making a claim about more incremental progress.
- I think Dwarkesh uses the term "intelligence" somewhat atypically when he says "The reason humans are so useful is not mainly their raw intelligence. It's their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task." I think people often consider how fast someone learns on the job as one aspect of intelligence. I agree there is a difference between short feedback loop intelligence (e.g. IQ tests) and long feedback loop intelligence and they are quite correlated in humans (while AIs tend to be relatively worse at long feedback loop intelligence).
- Dwarkesh notes "An AI that is capable of online learning might functionally become a superintelligence quite rapidly, even if there's no algorithmic progress after that point." This seems reasonable, but it's worth noting that if sample efficient learning is very compute expensive, then this might not happen so rapidly.
- I think AIs will likely overcome poor sample efficiency to achieve a very high level of performance using a bunch of tricks (e.g. constructing a bunch of RL environments, using a ton of compute to learn when feedback is scarce, learning from much more data than humans due to "learn once deploy many" style strategies). I think we'll probably see fully automated AI R&D prior to matching top human sample efficiency at learning on the job. Notably, if you do match top human sample efficiency at learning (while still using a similar amount of compute to the human brain), then we already have enough compute for this to basically immediately result in vastly superhuman AIs (human lifetime compute is maybe 3e23 FLOP and we'll soon be doing 1e27 FLOP training runs). So, either sample efficiency must be worse or at least it must not be possible to match human sample efficiency without spending more compute per data-point/trajectory/episode.
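To spell out the arithmetic behind that last point (both figures are rough, order-of-magnitude estimates):

```python
# Rough orders of magnitude only; both inputs are ballpark estimates.
human_lifetime_flop = 3e23   # approximate compute of a human brain over a lifetime
training_run_flop = 1e27     # scale of training runs expected soon

lifetimes_per_run = training_run_flop / human_lifetime_flop
print(f"One 1e27 FLOP run ~ {lifetimes_per_run:,.0f} human lifetimes of compute")
# -> roughly 3,000+ lifetimes, which is why human-level sample efficiency at
#    human-brain compute cost would immediately imply vastly superhuman systems.
```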
(I originally posted this on twitter (https://x.com/RyanPGreenblatt/status/1929757554919592008), but thought it might be useful to put here too.)
Given the risk of fines and jail for filing your taxes wrong, and the cost of processing poor quality paperwork that the government will have to bear, it seems very unlikely that people will want AI to do taxes, and very unlikely that a government will allow AI to do taxes.
Arbitrage possibility -- if AI can get this correct 99% of the time, sell a service that does it and also insures you against mistakes.
That sort of happens already, but not quite, as I understand it. Accountants act as agents to file taxes for individuals all the time. If it's done wrong, the individual remains liable for taxes, interest and additional charges if they didn't have a "reasonable excuse" or didn't take "reasonable care" (e.g. they didn't use an ACCA qualified firm). You only have recourse to sue the accountants, not the taxman. Accountants take out insurance to cover this. That's close to what you're saying but it's worth pointing out that insurance isn't actually a form of arbitrage.
HMRC have decided that giving all the correct paperwork to a 3rd party qualified accountant sometimes counts as a "reasonable excuse", and might decide to waive the penalty (but not the interest or obviously the tax itself). Will they decide that using a non-accountancy AI firm is a "reasonable excuse"? Take your bets....
I think as a practical matter it's very difficult for the government to stop an AI from doing your taxes. You can self-file your prepared return, and how do they know that you had GPT6 do it for you?
I also think you're probably right that people are fairly risk averse about this, but the reality is that the vast majority of people actually have very simple taxes, and given that so many of the simple personal taxes look basically the same but with different numbers, I strongly expect it to be within the capabilities of any reasonable future agent. The complicated business tax arrangement Dwarkesh discusses (receipts, going back and forth with suppliers, etc.) seems like it's further away, but it doesn't actually require any unthinkable skillsets.
They will probably be able to guess that AI did it because "GPT6" - a codeword here by which you mean an AI that doesn't make mistakes? - doesn't exist; meanwhile, a GPT o3 or o4-based solution - models that exist now - will almost certainly make mistakes. It all just never seems to work quite as well as when Altman demos it, does it?
The picture may be different in the US but in the UK, the vast majority do not need to do tax returns at all, it's PAYE. That's simple. If you need to do self-assessment here, you are automatically starting in a place where it's more complicated, hence room for error.
Given the recent controversy over the Loan Charge (retrospective demands for tax that it had miscalculated itself, in one lump sum subject automatically to higher rates regardless of an individual's history), HMRC cannot be trusted to act rationally or reasonably over tax mistakes.
Actually, I think the biggest risk of mistakes is the missed opportunity on the user's part to properly reduce their tax, likely ending up paying too much (by missing some obscure thing about pension-child-tax-rebate-investment-credits or whatever the latest bollocks is). So you'd want a qualified human you can reasonably trust to act in your interests if you wanted a 3rd party to do your taxes.
Or, that's what I'd want, anyway. Feel free to use "GPT6" yourself, though
I mean, yes, I am talking about future models, hence my reference to "reasonable future models" i.e. things somewhat better than what exist now but not monumentally so.
I think your comment and mine also just diverge because the US tax picture is indeed very different. Every individual must have tax returns submitted on their behalf, and almost all upper middle class people (who do not qualify for free tax filing software) pay for software to do taxes that amounts to punching the right numbers into the right boxes and seeing what comes up. This is where I see a lot of adoption in the near future. Why spend potentially over $100 preparing my taxes when following a deterministic flowchart and filling in a form that looks like millions of identical forms it's already been trained on seems like a straightforward task well suited to LLMs?
Predictions that require genuine breakthroughs should be taken with a large grain of salt. Yes, we can say more smart people are working on the problem of continuous learning than ever before and that this number will increase. We can also say that it doesn't seem like it should be that hard. But if it actually just is a really tough problem requiring new thinking and new architecture, it could be decades.
There are some arguments to be made that continual learning can be solved within current theoretical paradigms. I believe this is why the AI companies working on this hype it all up so much.
We don't know if they'll be correct, but there certainly are some arguments here that you can just 'scale' up a bunch of stuff and it just works.
It seems to me that the success in some fairly narrow domains has excited people about application of LLMs to much broader applications, without them stopping to ask why LLMs have these particular strengths in the first place.
I've noticed that LLMs excel at the following tasks: text parsing and summary, and solving canned problems.
Text parsing/summary plays on their abilities to read and "understand" large amounts of text. This shows up as them being useful as a search engine, summarizing a book, or rephrasing ideas in different language to help understand them.
Solving canned problems takes advantage of their vast training data, as they've probably encountered the problem before. This is especially true of "textbook problems" that make up most homework assignments and why LLMs are so good at helping people cheat. This is also where their amazing ability to write code comes from, especially simple code.
Beyond that, I've had mostly disappointment with their abilities. Presented with novel problems, or problems that don't really have solutions, they tend to flounder a bit.
Still, these are amazing achievements and I use LLMs so much every day! But I am skeptical that training harder and smarter will enable these problems to be overcome and result in anything resembling ASI.
Fully agree continuous learning is a critically necessary missing piece, but pathways to cracking it seem both straightforward and likely-to-be-cracked given all the labs are prominently working on them? https://x.com/RobDearborn/status/1928287465694957875
Excellent post, as always! Your point about continual learning being a bottleneck resonates deeply with my experience building AI systems. Let me build on that insight by exploring four related challenges that I believe will prove equally thorny.
The first challenge I'd call the "telephone game problem" in multi-agent systems. When I watch information pass through chains of AI agents, I see systematic degradation that goes beyond simple errors. It's like that childhood game where you whisper a message around a circle, except now some players aren't human and miss the subtle contextual cues that would normally preserve meaning. Each handoff compounds the problem. Humans intuitively understand that the same phrase means different things when spoken by different people in different contexts, but current AI agents struggle with this nuanced interpretation.
This connects to what I think of as the "penguin-robin problem" - a conceptual granularity issue that Yann LeCun has been exploring. Large language models treat penguins and robins as equally "bird-like," while humans immediately recognize robins as more prototypical birds. This might seem like a minor classification issue, but it creates reasoning errors that compound dramatically when AI agents attempt longer-horizon tasks or try to integrate into existing human teams.
Perhaps most challenging is what we might call the "invisible knowledge problem." When our UX designer recently left, he took with him over 1,000 hours of conversations, shared mental models, and undocumented team insights that no training data could ever capture. His human replacement will need 6-12 months to reach equivalent productivity. This pattern repeats across skilled roles - enterprise salespeople often require 12-24 months to reach full effectiveness in new companies, and they're already experts at sales. The challenge of onboarding an AI "teammate" into this web of tacit knowledge seems even more daunting.
Finally, there's the trust and responsibility gap. Humans accept accountability for their decisions in ways that create both legal and cultural frameworks for collaboration. Moving AI beyond a co-pilot role requires solving not just technical problems, but social ones around responsibility, especially in high-stakes environments.
These challenges suggest AI will likely progress through three distinct phases: becoming better co-pilots across more domains (where we're seeing remarkable progress), evolving into trustworthy independent workers for isolated tasks, and eventually becoming full teammates.
Each transition requires solving progressively harder social and intelligence problems.
I've explored these ideas in more detail in a couple of posts if you're interested in diving deeper:
- https://tomaustin1.substack.com/p/ai-layers-the-nested-layers-problem?r=2ehpz
- https://tomaustin1.substack.com/p/ai-hype-the-wilson-problem-why-ai?r=2ehpz
The invisible knowledge problem is also where the real payoff is. If AIs can start to understand and use some of what is now invisible knowledge, their value increases exponentially. Doubly so since they can't quit, and could theoretically keep improving.
Okay, so after sleeping on this — I agree with Dwarkesh that learning is a big bottleneck — I wanted to really reflect on "why is learning so hard" (or why it might be)... so, working with AI tools, I drafted a short "booklet" going back to my developmental psychology grad school roots on how humans learn vs. how AI learns and what open/unsolved challenges I see here.
The key insight: we'll get impressive AI capabilities in narrow domains soon, but the deeper challenges of genuine curiosity, embodied understanding, and organic learning may take decades. We're heading toward (more and more) capable but fundamentally limited AI partners IMO.
This was fun / interesting to draft. Warning: It's very long.
https://tomaustin1.substack.com/p/b7d4d614-ae5c-4620-bb18-d94f1c7fd902
But this is also a cool example (to me) of how we can really learn and explore topics with these tools.
I have worked in IT in midsized government organizations for most of my life, and my main combined hope and worry is that what we are looking at is really AI taking over mid-level management: small- and mid-complexity project management, as well as the management layers between xEO-level management and the hands-on employees.
Essentially a web of auto updating spreadsheets and Gantt diagrams with some capacity for more advanced replanning, when called for.
I hope, because I frequently wish for smarter/superhuman abilities in day-to-day task management, and the way LLMs work seems to match that type of work rather well.
I fear, because once they are in place, the inherent cynicism in (project) management frameworks will be playing to AI's good side, whereas making team efforts come together by inspiration and leadership will probably be playing to the bad (truth-agnostic) side, and that probably scales and reiterates badly.
To put it bluntly: we will lose the middle class buffer zone in larger organisations, and essentially turn the bell curve upside down, drastically emphasizing the already worrying polarization of society between those who master the AIs and those who are the limbs of the AIs.
Why would I want ChatGPT to go through my email? What an insane privacy violation for all the people I exchanged email with, who had an expectation of confidentiality - at least an implicit one.
What if the agent decides I broke the law somewhere in all the email it combs through? Will it notify the authorities? Does it have an obligation to notify the authorities?
Unless we start creating business only accounts with privacy disclaimers on all of our correspondence, this is going to take a lot longer than you imagine.
Excellent post. I think you are spot-on with the diagnosis, and are quite close on what the solution will look like -- all but dancing around it. The main claim I disagree with is "...there’s no obvious way to slot in online, continuous learning into the kinds of models these LLMs are." So let me try to convince you that there *is* one obvious way.
Human-like "continual online learning" can be found in current-day LLMs in the form of *in-context learning*. If you prompt an LLM with a few examples of how to solve (or how *not* to solve) a task, it will meaningfully improve its ability to solve it going forwards. This is exactly the effect you were gesturing at with your paragraph on how "LLMs actually do get kinda smart and useful in the middle of a session". A human-on-the-job can be understood to be learning using the same mechanism, but the entire lifetime of a human is *just one session*: the employee is receiving example after example after example, and improving each time.
The approach you propose, "a long rolling context window...compacting the session memory [into text]" is also quite close to the right approach, but falls short, largely for the reasons you describe: brittleness, terrible in some domains, etc. More broadly, a major takeaway from the arc of deep learning over the past decade is that all truly successful models are end-to-end, because gradient descent loves end-to-end and that is what allows us to scale. Any real solution must rely on huge vectors of real numbers, not brittle and tiny text summaries.
The correct solution is to use the context directly. No tricks, no hacks, no text intermediates; just place a long sequence of tokens in the context. The lifetime of an agent is one long session, where we let the model leverage in-context learning to improve.
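As a purely schematic illustration of that framing (the model call and token handling below are stand-ins, not a real serving stack):

```python
# Schematic only: "lifetime as one long session". llm_generate is a
# hypothetical stand-in for whatever model/serving stack you use.
from typing import List

def llm_generate(context_tokens: List[str]) -> List[str]:
    # Placeholder for a real model call that conditions on the *entire* context.
    return ["<response>"]

class LifelongAgent:
    def __init__(self):
        self.context: List[str] = []   # never summarized, never reset

    def step(self, observation_tokens: List[str]) -> List[str]:
        self.context.extend(observation_tokens)   # everything the agent sees...
        response = llm_generate(self.context)     # ...conditions every future action
        self.context.extend(response)             # including its own outputs
        return response

agent = LifelongAgent()
agent.step("The deploy script failed with error E137 .".split())
agent.step("New task : ship the fix and update the runbook .".split())
# In-context learning does the "learning": no weight updates, just an ever-longer session.
```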
Unfortunately, there are three issues with my solution. Firstly: the context lengths available for current LLMs are far too *short*. A million tokens sounds like a lot, but if you were to put every token seen by a software engineer across their career into a single session, you're easily looking at a context six orders of magnitude larger. Secondly: using long contexts is far too *expensive*. The cost-per-token of transformer inference grows with the amount of context used to generate that token, meaning that even if we did give a transformer a trillion-token-software-engineer context, it would be absurdly (prohibitively?) expensive to generate code with it. Thirdly, and in some ways most damningly: adding more tokens to the context *does not help*. The first few examples help a lot, but the improvement quickly tapers. Current LLMs are simply not capable of effectively utilizing ultra-long contexts (marketing-motivated claims to the contrary notwithstanding).
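To put rough numbers on the cost point, here's a back-of-the-envelope estimate of just the KV-cache memory for a trillion-token context; the model dimensions are assumed (roughly a 70B-class configuration), purely for illustration:

```python
# Back-of-the-envelope KV-cache memory for a trillion-token context.
# All model dimensions are assumed, not any particular model's specs.
layers       = 80
kv_heads     = 8
head_dim     = 128
bytes_fp16   = 2
kv_per_token = 2 * layers * kv_heads * head_dim * bytes_fp16   # keys + values
context_len  = 1e12                                            # ~career of a software engineer

total_bytes = kv_per_token * context_len
print(f"KV cache per token: {kv_per_token / 1024:.0f} KiB")
print(f"KV cache for 1e12 tokens: {total_bytes / 1e15:.0f} PB")
# Hundreds of petabytes just to *hold* the context, before any attention compute,
# which is why naive ultra-long contexts are prohibitively expensive today.
```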
These issues are solvable. Not *easily* solvable -- but solvable. There's nothing fundamentally or paradigmatically wrong with the idea that we should be able to get better in-context learning than we currently get. We just need better scaling laws, meaning better architectures and better algorithms. I've been in the weeds on this problem for almost three years, and we've made a lot of progress both on understanding the best way to think about the problem and on discovering technical (architectural/algorithmic) ideas that begin to approach a solution. But it is far from solved, and ultimately I do more or less agree with your overall take on timelines.
Yann LeCun points out that a 4-year-old child has harvested the same amount of data through its optic nerve since birth as all the data today's most powerful LLM is trained on. This is just visual data; it excludes proprioceptive, smell, taste, deep touch, light touch, temperature, etc., and all the hormonal data flowing through our physical bodies. An expert human is just so vastly more data-rich and processing-powerful than an LLM could possibly be with current technology or even near-future technology. An LLM needs to be exposed to vast amounts of this physical data to be able to approximate AGI, and to have the processing power, the algorithmic depth and breadth, and the ability to employ heuristics. I predict we are a long way off LLM AGI unless there is some profound tech breakthrough. Simply scaling current systems won't get us there.
"AI can do taxes end-to-end for my small business as well as a competent general manager could in a week: including chasing down all the receipts on different websites, finding all the missing pieces, emailing back and forth with anyone we need to hassle for invoices, filling out the form, and sending it to the IRS: 2028"
That feels like a bigger jump from present capabilities (in terms of judgement and error correction) than the jump from GPT-4 to today, and GPT-4 finished training in summer 2022. So ... maybe? But even that seems to require further acceleration.
It is possible computer use will be a lot more like robotics than people are hoping for.
I think 2032 is still too soon to expect AI that can learn on the job like a human. The "Attention" (transformer architecture) paper came out in 2017 - 8 years ago - and while we've seen massive efficiency improvements - different types of attention, KV cache, MoE, etc. - we are still, after 8 years of massive research and spending, using transformers pre-trained with SGD. The most significant "innovation" has perhaps been the use of RL post-training, but that was introduced a long time back with RLHF, and is anyway an old technique.
It seems that "on the job learning" will require a shift from SGD to a new learning mechanism, which has been sought for a long time, but will also require other innate mechanisms so that the model not only CAN learn, but also wants to and exposes itself to learning situations (curiosity, boredom, etc.), as well as episodic memory or something similar so that the model knows what it knows - remembers learning it (or not) - and therefore doesn't hallucinate.
I really like this post, and appreciate the focus on continual learning, which I think is vastly underrated compared to the other problems AI agents face (I wrote about the continual memory learning problem back in March: https://hardlyworking1.substack.com/p/we-are-in-the-good-timeline-for-ai).
As I see it, the crux of the problem is pretty simple: humans trade away large and capable working memory capacity in exchange for dynamic long-term memory that is adaptable and capable of continuous modification. LLMs make the opposite tradeoff: their long-term memories are entirely static, and attempts to solve agency problems have focused on enhancing short-term memory capacity via expanding context windows, which would be akin to enhancing human working memory. This leaves us with different sets of problems and proficiencies.
A few commenters here have cited RAG (Retrieval-Augmented Generation) as a possible approach for getting to AI agents. RAG requires LLMs to refer back to a specific set of documents before answering user queries, reducing AI hallucinations by combining normal LLM processes with search and retrieval processes. This will likely improve the process as a whole, but this is probably the wrong approach for making self-improving agents.
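For readers who haven't seen the pattern, here's a bare-bones sketch of RAG; the word-overlap retrieval and the llm_answer call are toy stand-ins for the embeddings, vector store, and model a real system would use:

```python
# Minimal RAG sketch: retrieve the most relevant documents, then stuff them
# into the prompt. Toy scoring by word overlap; llm_answer is a hypothetical
# stand-in for an actual model call.
def score(query: str, doc: str) -> int:
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only the context below.\nContext:\n{context}\n\nQuestion: {query}"

def llm_answer(prompt: str) -> str:
    return "<model output grounded in the retrieved context>"   # placeholder

docs = [
    "Refund requests must be filed within 30 days of purchase.",
    "The on-call rotation changes every Monday at 09:00 UTC.",
    "Expense reports require a receipt for anything over $25.",
]
print(llm_answer(build_prompt("When does the on-call rotation change?", docs)))
# Note the limitation raised above: the weights never change, so nothing here
# constitutes learning, only better-grounded recall.
```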
The reason why is simple; you still haven't solved the dynamic long-term memory problem. Even if your LLM extends its "working memory" so to speak, it is incapable of making precise adjustments to its "worldview" (or base weights) based on feedback, because it lacks the internal truth models that would allow it to understand what it is modifying and how its modifications would affect future behavior. You're enhancing its capability to perform long tasks, but it still fundamentally is not a *learning being* and so is not going to be doing any of the superintelligent X-Risk stuff required for short timelines that include takeoff scenarios.
Furthermore, when humans "hallucinate" (at least systematically), it's often due to faulty learning processes that have overfit on bad paradigms. PTSD, OCD, certain kinds of depression and anxiety—these can all be caused by single or repetitive traumatic events that have fundamentally altered people's worldviews and left them with "trapped priors" that prevent them from properly updating their worldviews and internal truth values in response to new information that invalidates their old worldviews (https://www.lesswrong.com/posts/hNqte2p48nqKux3wS/trapped-priors-as-a-basic-problem-of-rationality). When LLMs "hallucinate", it's because they don't have coherent worldviews or internal truth values AT ALL. People say things like "humans also just do next-token prediction!" but anyone who has thought about psychology at any length knows this is not true. Humans act on heuristics that can be modified in response to input. LLMs simply don't have heuristics.
As long as base code cannot be modified in situ, I don't see true AGI (or ASI for that matter) happening. How long before base code can be modified in situ? I'm not going to forecast a precise guess for timelines, but given that I've never even heard of anyone discussing this problem, I'm going to put 2030 as my very earliest timeline. In all likelihood, I think it'll probably be closer to 2040 or beyond, because finding LLM heuristics and learning how to modify them on the spot to learn in response to feedback is going to take a LOT of mechanistic interpretability work (assuming it's even possible) and mech interp is not exactly the focus of most people working on this problem.
Thank you for this. I found the article readable, compelling, and filled with a nice blend of facts and commentary that will keep me thinking on this for a while.