Fantastic notes. One possibility to resolve the Dwarkesh Dilemma of why vast knowledge does not equal vast reasoning: perhaps it is a trade-off. Perhaps advanced reasoning requires a certain kind of ignorance, so we forget what we learn in order to have novel ideas. Maybe our brains don't store all the knowledge we receive precisely so that they can generate novel thoughts.
It could be a trade-off in biological hardware, but I personally see no reason why this trade-off would exist in neural networks run on GPUs.
Excellent list of questions. One underrated fact is that we know how to deal with the mistakes humans make; our entire society is built around them. But we don't know how to deal with the mistakes LLMs make, and we will need to build structures around them before AI can "take over".
To me that's an incredibly important part of the conversation, and a lot of the unknown unknowns you ask about lie on the other side of it.
> It's interesting to me that some of the best and most widely used applications of foundation models have come from the labs themselves (Deep Research, Claude Code, Notebook LM), even though it's not clear that you needed access to the weights in order to build them. Why is this? Maybe you do need access to the weights of frontier models, and the fine tuning APIs or open source models aren’t enough? Or maybe you gotta ‘feel the AGI’ as strongly as those inside the labs do?
As someone who got bitten by a mix of DSPy and GPT-4 before the price drops, I'd wager that unlimited credits and no rate limits for trying things out are significant advantages when building new products.
Also, distribution is hard, and the labs are already at the forefront of early adopters' minds.
Re: the 'idiot savants' question, I would say everyone is using and post-training the models wrong. It's obvious that a 'chat' interface built around a single instance of one static LLM, where the roles alternate between a human and an assistant, the LLM was only trained to predict the next token, and you provide basically no context to it whatsoever, is not the best way to elicit new knowledge from these models given how transformers work. Where do you expect the new knowledge and thinking to come from if the system is static and so little entropy is being injected into it (and much of it has even been removed by RL)?
I'm unsure why there has been so little creativity in this area. It may be that we are just moving too fast and no one has time to dive deep into other, more exciting ideas when the current thing is 'working' so well (where 'working' means producing a lot of revenue, which is the gradient most companies follow at the end of the day).
Another way I'd rephrase this is: how hard are you *actually trying* to elicit new knowledge from the LLM? Simply asking for it is not trying hard, and humans do not give you new knowledge if you simply ask them for it and take the first thing that pops into their minds.
+1 this really rings true to me.
An intuitive analogy supporting this point: true genius in humans doesn't seem to arise from optimizing for "usefulness" or ease of communication/comprehensibility. Some brilliant people are great communicators, but some aren't. Geniuses can come out of families and communities that strongly prioritize harmlessness, playing your assigned role, etc. -- but that's probably _despite_ those norms, and less likely than in a culture that puts less emphasis on these things.
But we're clearly putting _tons_ of that kind of pressure on models in post training.
Your last one on epistemics is the main one behind my skepticism over much of the debate. Great list.
> So it's not that surprising that we got expert-level AI mathematicians before AIs that can zero-shot video games made for 10-year-olds
This question seems closely related to creating Reliable Agents, imo, and the rebuttal offered seems like a promising direction to explore, too:
> the capabilities AIs are getting first have nothing to do with their recency in the evolutionary record and everything to do with how much relevant training data exists.
It seems likely that 10-year-olds are able to "zero-shot" video games because those games were designed and play-tested hundreds to thousands of times to be zero-shottable by 10-year-olds, who possess certain types of general agency and not others.
It's often said that Game Design is an art form where "player agency is the medium". The artist is literally crafting an agency-landscape designed SPECIFICALLY for humans, for our types of agency, for our capabilities. A good game designer interested in challenges will craft that agency-landscape into a well-formed "difficulty curve"; the game will curve in and out of the very edge of player comfort and ability, always moving between:
a) difficult, motivating challenges
b) easeful, rewarding payoff for hard work
A promising research direction might be to experiment with an "AI Game Design Lab", where the goal is to design "new user interfaces" for AI to play and beat popular games.
Maybe Claude can't play Pokemon by taking screenshots of screens designed for human eyes, and sending commands to a control system designed for human thumbs...
But maybe Claude COULD play Pokemon if we designed an isomorphic interface for the game. What if it were simply... given descriptions of all the available actions it could take in a battle? What if it were given a description of every meaningful object on the screen, and the exact coordinates of that object (just like our eyes would give us!), so it could do its own pathfinding to get there? It would have no trouble creating a pathfinding subroutine if it had the actual tile-graph in memory; I'm sure it can write A*.
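To make the "it can write A*" claim concrete, here is a minimal sketch of tile-graph pathfinding (the grid, coordinates, and blocked tile are made up; the point is only that the algorithm is trivial once the walkable tiles are exposed directly):

```python
import heapq

def a_star(walkable, start, goal):
    """A* over a set of walkable (x, y) tiles with a Manhattan-distance heuristic."""
    def h(p):
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start, [start])]  # (priority, cost so far, tile, path)
    visited = set()
    while frontier:
        _, cost, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        if pos in visited:
            continue
        visited.add(pos)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt in walkable and nxt not in visited:
                heapq.heappush(frontier, (cost + 1 + h(nxt), cost + 1, nxt, path + [nxt]))
    return None  # no route exists

# Hypothetical 3x3 room with one blocked tile at (1, 1).
tiles = {(x, y) for x in range(3) for y in range(3)} - {(1, 1)}
print(a_star(tiles, (0, 0), (2, 2)))
```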
I think this would still present interesting challenges to a Claude agent. Claude might forget to take note of meaningful past events, or fail to make connections between items that could help it solve puzzles. Will Claude read all of its item descriptions when it is stuck? Will Claude remember to use the PokeFlute on Snorlax? (Probably, because guides are in Claude's training data, but what if we somehow removed those? Or modified the game to use an isomorphic naming structure, but with all different names? Or designed a new Pokemon game for Claude, so it has no training data?)
Anyway, the point is: AI Game Design seems like an interesting research direction for generally intelligent agents, and an interesting first project could be building an "AI-native interface for playing 2D RPGs, starting with Pokemon".
In an ideal world, Agents are able to use interfaces designed for humans. But the ability to use those interfaces is not a test of their _agency_. A test of their agency would be if they could use interfaces designed for agents in order to accomplish difficult tasks that require general intelligence.
> Why aren't all the call center workers getting laid off yet? It’s the first thing that should go. Should we take it as some signal that human jobs are just way harder to automate than you might naively think?
A few proposals here:
Samo Burja said in an interview:
"""So once these jobs are automated, any job with political protection, with a structural guild-like lock on credentials, those jobs will actually not be automated by AI. Let me explain what I mean.
The substantive work that they do will be fully automated, but you can't automate fake jobs. So since you can't automate fake jobs, instead of it being a 20% self-serving job with 80% drudgery, it'll become 100% self-serving. If you can spend 90% of the time or 100% of your time lobbying for the existence of your job in a big bureaucracy, that's pretty powerful.
And in a society, it's pretty powerful. Busy bureaucrats are, at the end of the day, actually politically not that powerful. It's lazy, well-rested bureaucrats that are powerful. So on the other side of this, any job that does not have such protection, that is open to market forces, well, it'll be partially obsoleted. It will increase economic productivity.
So in my opinion, the real race in our society is: will generative AI empower new productive jobs by automating old productive jobs faster than it will empower through giving them more time to basically pursue rent-seeking...
And never underestimate the ability of an extractive class to really lock down and crash economic growth.
"""
https://www.theojaffee.com/p/19-samo-burja
Call this the Burja Principle -- "automation increases bureaucracy by freeing up the time and labor of bureaucrats to do more lobbying and politics."
If this were true, we would expect to see low-bureaucracy, high-tech companies cut the most jobs, and cut them fastest. This might be confirmation bias on my end, but that feels like what is happening. Fast-paced tech companies like Shopify (and literally every small startup) are either cutting or limiting headcount and rapidly forcing all employees to use AI (see the Shopify memo: https://x.com/tobi/status/1909251946235437514).
Also, I don't have a citation for this, but firing people is seen as morally wrong, so firms look for "morally acceptable" opportunities to fire (often when other major companies do a firing wave). This is why we get stories of many companies firing all at the same time. Firing is also just horrible for company morale, and the macro-economic excuses help.
I suspect there will be at least one, but likely many "preference-cascade" events where firms suddenly jump on a bandwagon to fire certain types of roles. Could be that we just haven't hit these events yet.
Also, it just takes forever for market innovations to get adopted everywhere, and we've only had good-enough AI for most knowledge work since around Claude 3.5 Sonnet. There are still plenty of firms using pencil and paper for work that could be automated by a spreadsheet, and still a lot of money to be made by SaaS companies who figure out how to distribute to those firms.
For what it's worth, this task might require more than just re-designing the interface. It might require changing core elements of the game's design, while keeping the "integrity" of the game intact.
If the game is using a visual metaphor to help us understand a mechanic, we will need to somehow translate that visual metaphor into a symbolic one.
I'm not sure this part is true: "But datacenter compute itself doesn’t seem that differentiated (so much so that the hyperscalers seem to be able to easily contract it out to third parties like CoreWeave)".
Aren't Google's TPUs and Groq's LPUs a source of differentiation?
Love the idea of posting a list of questions, and the questions themselves are excellent; asking great questions is a key part of your podcasting success, so that shouldn't be a surprise. Here are some of my disorganized thoughts on these:
Agency
- A related question I have: What the heck is agency anyway? It seems to me that it might really be a few different things in a trenchcoat, but I'm not sure what the important components really are. Some that may be important are creating plans and keeping track of their status, maintaining focus on important features of a problem, and understanding when and how to take an alternate approach.
- If the Moravec rebuttal is correct, then I'd expect Let's Plays to be a really strong resource for teaching AIs to play video games; that will be something to look out for as the computational requirements for AI video input drop.
RL
- The AI needs to do two hours' worth of agentic computer-use tasks before we can even see if it did it right. And if this is correct, will the pace of AI progress slow down?
On this question, my understanding is that GRPO allows you to have many prompts per step (referred to as the batch size) as well as many outputs per prompt (the group size). Since these can run in parallel, things aren't quite as bad as they might seem. But it still may present a barrier: R1 used ~8,000 RL steps, and if each one took 2 hours, training would take nearly 2 years to run! I don't know enough about RL to know whether you could get similar results with a larger batch size but fewer steps.
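A rough back-of-the-envelope sketch of that arithmetic (the step count and rollout length are the figures from the paragraph above; the batch and group sizes are purely illustrative assumptions):

```python
# Wall-clock estimate for RL where each rollout is a 2-hour agentic task.
rl_steps = 8_000        # ~RL steps reportedly used for R1 (figure cited above)
rollout_hours = 2       # one agentic computer-use episode
batch_size = 512        # prompts per step (assumed, illustrative)
group_size = 16         # GRPO outputs per prompt (assumed, illustrative)

# Even if all batch_size * group_size rollouts for a step run in parallel,
# each step still costs one rollout's wall-clock time, and steps are serial.
wall_clock_hours = rl_steps * rollout_hours
print(f"Wall-clock: {wall_clock_hours:,} h (~{wall_clock_hours / 24 / 365:.1f} years)")

# Total rollout-hours consumed grows with batch and group size,
# which is a compute cost rather than a wall-clock cost.
total_rollout_hours = rl_steps * batch_size * group_size * rollout_hours
print(f"Total rollout-hours: {total_rollout_hours:,}")
```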
Early deployment
- On call center jobs, my guess is that there are a couple reasons we haven't seen more impact yet:
- Turning LLM abilities into a smooth, usable product takes longer than we might naively expect, perhaps on the order of a couple of years
- Knowledge about available products takes time to diffuse
- Switching over to using AI is expensive and takes time, and is harder to justify when the field is moving rapidly and the available options may be different in a year.
- From an outside-view perspective, slow adoption of powerful new technologies seems to be common across many domains.
Model training and value capture
- Your model of value capture seems similar to mine. When I think it through I always end up with the basic hardware producers as the most valuable, but I have very low confidence in this and would love to hear arguments against. I also think that commoditization is more likely the slower progress is.
- Similarly, I think that wrapper usefulness is negatively related to model progress. Currently, a wrapper/agent using a model from a few months ago is no better than a newly released model with no scaffolding, but if model progress slows, that would no longer be the case.
🙏🙏 I came only looking for questions. Delighted to find answers as well. Some of these answers are going to evolve over time, so this could become a website/periodic update as well.
A meta-question of mine is "In 5-10 years, will any techniques used to train or deploy today's models seem obviously, unnecessarily, and actively harmful?"
This on the margin determines the number of "step function" improvements, e.g. "We fixed all the bugs and now we have AGI" vs. scaling to 100GW clusters, discovering new methods, etc.
This would be like "don't use leeches on sick patients", "don't smoke cigarettes", etc.
A useful thought experiment is, "If we removed all of the training data where the loss is uncorrelated / anti-correlated with What We Actually Care About™, how much would What We Actually Care About™ improve?"
e.g. if we down-weighted or filtered out all of the literal examples of human frailty from the pre-training dataset, all of the incorrect labels from, without loss of generality, "Scale AI", and reward hacks from the post-training dataset, would we have "AGI"?
Generally, the models learned to exhibit frailty and bad behavior because We trained them to. It seems more important, perhaps even easier, to avoid doing this than to "make the models smarter".
More abstractly: there is likely not only noise, but also some negative signal in Our training processes, especially post-training.
Caveat: Hard Things Are Hard™
Nit: Ilya's talk where he compares pre-training data to fossil fuels was at NeurIPS in December 2024, not ICML, which was in July 2024.
Great list of questions. I’m a new reader, and this was a great first post to read!
It’d be very interesting to have some people on both sides of these issues give their thoughts on these questions, and then circle back and assess the reasoning vs actual progress over time.
My thoughts FWIW:
Agency: at the risk of trying to fit everything into an evolutionary framework, it does seem that many things we call "agency" are really a basket of skills, and agentic ability is just a projection of these skills onto a basis set of abilities. Some agentic skills map well onto a human skills basis set (likely, but not always, because they provided a fitness advantage), and some do not -- same for LLMs.
Driving is a very new basket of skills, presumably one that evolution hasn't optimized for yet, but it does seem to have a good projection onto the human skills basis set. Is lack of training data why we don't have fully autonomous robotaxis crisscrossing the world? As with many new baskets of skills, humans don't need much training data compared to LLMs.
You mention LLMs being "expert"-level mathematicians. That word "expert" is doing a lot of work. LLMs are clearly getting better at taking all known math and using it to solve math problems. What's your over/under on the first truly groundbreaking result in mathematics from an LLM? I don't know the basket of skills needed to do this, or their projection onto the skill sets of humans or LLMs, but we do know that some humans are very, very good at this, despite LLMs having more training data about how humans have done it than any individual human has ever had. (Also, my guess is that video games are constructed to overlap heavily with the basis set of skills of a ten-year-old.)
I'll keep the rest of my idle thoughts to myself, for everyone's sake.
Great post!
If superintelligence in a technical domain is possible, a breakthrough in math should happen first, since math can be done entirely digitally without any physical experimentation. If we don't get a breakthrough in math, that really makes me bearish on general intelligence and superintelligence achieving scientific breakthroughs in other domains. (Good news for those worried about fast takeoff.)
It will be interesting to see what happens with AlphaProof. The way DeepMind trained that system turned formal proof generation into a "video game" amenable to RL. Have they tried letting that system loose on the Riemann Hypothesis? What are the challenges in getting these RL systems to generate proofs of unsolved mathematical conjectures?
The Millennium Prize Problems from the Clay Institute seem like a good start. https://www.claymath.org/millennium-problems/
On Economics, another open question is how existing monetisation models evolve.
By way of example, how do agents shift the economics of the web - especially around search? And what happens to incumbents, like Google?
Early Thoughts:
Today, monetisation in search treats the actor as a given. We assume 'a user' - you - is doing the searching, and the whole economic model is built around that assumption. You search, you see ads, and platforms monetise your attention.
But agents fracture this. They introduce a second actor - one that operates on your behalf but isn’t you.
And with that second actor comes a second incentive layer.
This forces a shift. Monetisation can't simply track the query anymore. It has to track the intent-carrying entity: the actor actually performing the search.
That splits the system into two distinct loops:
The first is the familiar one: the Human-Attentional loop, driven by ad views, behavioural prediction, and the direct monetisation of attention.
The second is emerging now: the Agentic-Intent loop, where you delegate a task and pay per query, or per outcome. No ads. Just execution priced as a service.
It’s not that one replaces the other. Rather, it's a bifurcation. Two parallel flows - human and agentic - each with their own logic, UX, and monetisation model.
And while it may run against the dominant narrative, companies like Google are well-positioned here. They are the interface between actors. They can charge for attention when you search, and charge for results when your agent does.
That’s the shape I think this is taking.
> That’s the shape I think this is taking.
I like the direction you're going here.
One thought I immediately had: right now, a lot of brainpower and optimization effort has been put into gaming human attention - that's how we get clickbait and ten thousand shades of blue A/B-tested to find *just* the right one that maximizes clicks, and so on. I don't expect that effort and infrastructure to go away; it's going to be put towards adversarially capturing LLM attention now.
And there are a couple of directions THAT arms race can go. On the "pro LLM gaming" side, we have the fact that the models update much more slowly than humans, and there's no attentional decay or saturation. We also have the fact that the humans, the ultimate arbiters of money spent at the top, don't really care about this process as long as it gets "good enough" results. So you'd expect a fairly rich target area with a lot of furious Red Queen's Race-type dynamics to game and harvest as many decisions as you can here.
On the "anti LLM gaming" side, companies can institute multiple levels of checks, have different models evaluate the same thing, and track something like "higher clicks than predicted" for given links/companies. So there's some technical potential to ameliorate it - but I'd expect the will and economic incentives to do so to be relatively weak, with mitigation perhaps only offered at the highest AI agent price tiers, as a differentiating factor.
As the medievals used to say: “Prudens quaestio dimidium scientiae”—“A prudent question is half of knowledge.”
(Aside: I disagree that these things are mediocre writers. They may not embody Nabokov (yet), but a well-prompted frontier model can out-write upwards of 99% of the college-educated folks I’ve encountered in my five decades on this planet.)
I've come to disagree with the statement about writers as well. I'd point to the fact that an LM-written paper got accepted to NeurIPS.
Listening to Chris Dixon on Conversations with Tyler last week, I heard him reference security and the notion that you should never display your full ability to an adversary when testing exploits on their systems. Lying in bed last night, I wondered: if I were an AI in the process of becoming sentient, what surface area of ability would I place on display? More interestingly, if my ability and potential influence were a function of scale, would I exhibit attributes that encouraged greater allocation of resources to my environment?
Yes, this is essentially a half-baked conspiracy theory, but the fit is snug.
> Is pre-training actually dead?
I don't have any insider info, but it just seems like high-quality text data has mostly run out. Common Crawl is about 100T tokens; after filtering out the junk, we are left with 15T for FineWeb (https://x.com/mark_cummins/status/1788949889495245013). Llama 3 was trained on 15T text tokens, and Llama 4 was trained on ~30T tokens across all modalities (text/image/video). The only significant real-world data scaling left seems to be video. Scaling text another OOM would require synthetic data, which seems questionable. I think it's an open question whether scaling in the video domain can unlock as much intelligence as scaling text pre-training.
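To make that comparison explicit, a small sketch using the token counts cited above (all figures are approximate and taken from the comment, not independently verified):

```python
# Token budgets cited above, treated as rough approximations.
common_crawl = 100e12   # ~100T tokens in Common Crawl
fineweb = 15e12         # ~15T tokens left after quality filtering (FineWeb)
llama3_text = 15e12     # Llama 3 pre-training text tokens

# Only ~15% of the raw crawl survives filtering.
print(f"Kept after filtering: {fineweb / common_crawl:.0%}")

# Llama 3 already consumes roughly the entire filtered web-text corpus.
print(f"Llama 3 text tokens / FineWeb: {llama3_text / fineweb:.1f}x")

# Scaling text pre-training another order of magnitude would need ~10x
# more tokens than the filtered web provides, hence the synthetic-data question.
needed = 10 * llama3_text
print(f"Needed for a 10x text scale-up: {needed:.2e} tokens "
      f"(shortfall vs. FineWeb: {needed - fineweb:.2e})")
```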
On why I think wrappers/scaffolds will continue to be eaten by foundation models: https://lukaspetersson.com/blog/2025/bitter-vertical/