What I've been thinking about this weekend - More open questions, intelligence vs power, the problem of verification in science, the parallel discovery of Darwinism
A hodgepodge of things I was thinking about this weekend.
More open questions
I put out a blog prize to answer a couple of big questions I have about AI. The goal is really to find someone to hire as a co-researcher. I have more questions of this variety, but I omitted them from that post’s list because they don’t make it easy to judge submission quality. So I thought I’d post them here:
5 hyperscalers own 70+% of global AI compute, and much of that is actually reserved for the 3-member set of OpenAI/Anthropic/GDM. How worried should we be that the AI use cases which are not building up to the singularity and the robot factories (normal people being more empowered, understanding the world better, being entertained, etc.) are not the highest-ROI activity for compute in the world? And given how valuable compute will be (its opportunity cost increases in tandem with the quality of the AI models that run on it), will normal people basically get priced out of the benefits of AI? If we should be worried about this, how concretely should some kind of universal basic income/compute redistribution work? If we shouldn’t be worried, what is this framing missing?
Data is arguably the main way that AI models have been getting better over the last few years. But I remain confused about what concretely these improvements have consisted of. To ask some sharper questions:
Clearly Anthropic (and now also OpenAI and GDM) have cracked something about making competent long horizon coding agents. What is it? Is it just stacking up more and more RL coding environments? Or is there something more particular behind this breakthrough?
Are models even getting more sample efficient (aka they learn more from each training sample) or have we just changed/expanded/improved the data input? The reason this question is important is because it tells us how fast deep learning progress will be in domains that actually do require sample efficiency (for example, robotics).
Models are very sample efficient in context, and the information in context can be used much more flexibly. But the attention “fast” weights consume a huge amount of memory in order to accommodate this faster learning. Why is there this memory/sample efficiency tradeoff?
If you look at the size of the KV cache for Llama 3 70B, it’s 320 KB / token. If you divide the number of bits it takes to store Llama 3’s weights by the number of tokens it was pretrained on, you get 0.075 bits / token. So per token, in-context storage uses roughly 35 million times more memory than what the weights absorb.
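The arithmetic behind both numbers can be checked in a few lines. This is a back-of-the-envelope sketch assuming Llama 3 70B’s published architecture (80 layers, 8 KV heads of dimension 128, bf16 cache) and its roughly 15T-token pretraining corpus:

```python
# KV cache: per token we store a key and a value vector for each KV head
# in each layer, at 2 bytes (bf16) per element.
layers, kv_heads, head_dim, bytes_per_elem = 80, 8, 128, 2
kv_bytes_per_token = layers * kv_heads * head_dim * bytes_per_elem * 2  # *2 for K and V
print(kv_bytes_per_token / 1024)  # → 320.0 (KB per token)

# Weights: 70B params at 16 bits each, amortized over ~15T pretraining tokens.
params, bits_per_param, pretrain_tokens = 70e9, 16, 15e12
bits_per_token = params * bits_per_param / pretrain_tokens
print(round(bits_per_token, 3))  # → 0.075 (bits per token)

# Ratio of in-context memory per token to weight-absorbed bits per token.
ratio = (kv_bytes_per_token * 8) / bits_per_token
print(f"{ratio:.1e}")  # → 3.5e+07, i.e. the ~35-million-fold gap
```

Note that this compares memory footprint, not information content, on the KV-cache side; the cache is a wildly redundant representation, which is part of what makes the tradeoff interesting.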
Let’s put frontier lab compute into 3 buckets: pretraining, RL generation, and inference. RL generation and inference look like very similar workloads. The big difference, of course, is that the model learns as a result of RL generation, but it doesn’t (at least currently) from inference. At the same time, the model actually does useful work during inference, but not during RL generation. Many people have pointed out it’s really weird that there’s a distinction between training and inference, and that in the limit it shouldn’t exist. How practically will these two workloads be merged? At a high level, one can imagine hiring an AI instance for a month-long work trial, getting it to do actual useful work for you during that time, and then sending a report card back to the model company. In fact, in a few years, maybe the only way that AI can continue to make progress is through this kind of on-the-job learning, because models will already have saturated anything that can be learned from contrived shorter-horizon RL environments.
Does something Y2K-like happen when most of the tokens on the internet (and presumably the ones future models will be trained on) are generated by other AIs? Has the relative value of pre-2023 internet datasets increased in any noticeable way?
I wrote this in my continual learning blog post last June. Is this correct? Why might there not be a winner-take-all dynamic from continual learning?
“Even if there isn’t a software only singularity (with models rapidly building smarter and smarter successor systems), we might still see something that looks like a broadly deployed intelligence explosion. AIs will be getting broadly deployed through the economy, doing different jobs and learning while doing them in the way humans can. But unlike humans, these models can amalgamate their learnings across all their copies. So one AI is basically learning how to do every single job in the world. An AI that is capable of online learning might functionally become a superintelligence quite rapidly without any further algorithmic progress”
A lot of economic analyses of the impact of AGI focus on human demand: will the economy shrink because our demands can be fulfilled much more cheaply, will it grow because AI will create new varieties of products, or maybe because the relational sector will grow? But all these analyses take as a given that the only demand that matters is the demand originating from humans. How do we model the machine-only economy, where the demand originates from the AIs themselves? And once we add this consideration to our economic analysis of the future, what changes?
The mistake of conflating intelligence and power
I had an interesting discussion recently. Someone asked me, what is intelligence? I said, the ability to achieve your goals across a wide range of domains. Okay, he says, then by that definition isn’t Donald Trump the most intelligent person in the world, followed quickly by Xi Jinping and Vladimir Putin?
To be clear, these people are obviously very competent in certain ways. But when you think of ASI, you don’t think of Trump-but-more-so. The person who kept pressing this question was correctly pointing out that my definition of intelligence was basically a definition of power (after all, what is power if not the ability to achieve your goals across a wide range of domains?). If this is your definition of intelligence, then Stalin was the most intelligent person who ever lived.
Now, of course, you could change the definition of intelligence to something more like the ability to comprehend and build atop abstract concepts. But notice that the most powerful people in the world do not max out this quantity. They’re above average in shape-rotation, but the correlation between extreme power and this kind of intelligence might be even weaker than the correlation between extreme power and height. The physicists are not running the world.
We tend to conflate power-seeking AI and superintelligent (in science and tech) AI. I’m not denying that AI can be power-seeking. Whatever skills and drives Donald Trump has could be embodied in a digital mind. I’m simply pointing out that the way we’re currently making AI systems smarter (training them to be really good coders, thought partners, and general coworkers) is not that strongly correlated with power.
We often talk about power in this way that misunderstands how it is actually derived in our world. Our intuitions are primed by games like Diplomacy or Go, which are designed to isolate and reward a g loaded kind of strategic reasoning. But in the real world, power is more the product of having the authority and trust to get lots of people to collaborate with you, rather than some galaxy brain scheming capability. Trump is not powerful because his brain, considered in isolation, is the most effective optimization engine on Earth. He is powerful because the government which hundreds of millions of people consider legitimate gives him a lot of power.
A group versus individual level analysis is useful here. As Garett Jones has written about extensively, individual IQ is only modestly correlated with individual income, but national IQ is strongly correlated with national outcomes. This is because intelligence has a lot of spillover effects - smarter societies cooperate more, save more, and can coordinate to build things like space shuttles and semiconductors. Richard Trevithick, who pioneered the high-pressure steam engine, died in poverty, buried in an unmarked pauper’s grave. But the fact that 18th and 19th century Britain had lots and lots of people like Trevithick contributed to Britain being able to set up a global empire and defeat lots of random kings and emperors around the world. George III himself didn’t need to be a genius (in fact he went mad halfway through his reign), but the country he sat atop still defeated Napoleon, conquered India, and built the world’s dominant navy. Similarly, even if some company’s AIs are just super obedient superintelligent coders and scientists, they could help the totally pedestrian human intelligences who hold the reins (lab leaders, Presidents, some harder-to-imagine configuration of control) gain a lot of power. It seems to me that the right mental model is that more effective AI firms and countries will outcompete everyone else in normal capitalist ways, rather than a single AI outthinking everyone else.
RLVR might be disproportionately bad at science
In the next two sections I write up some threads that we explored in my interview with Michael Nielsen. That episode was one of my favorites.
The organizing question from my interview with Nielsen was, “How do we recognize scientific progress?” It’s especially relevant to thinking about what it would take for AI to close the RL verification loop on scientific discovery. But it’s also a surprisingly mysterious and elusive question when thinking about the history of human science.
Some people have this idea that AI is going to be disproportionately good at making scientific breakthroughs. The reason they think this is that (1) science is ‘verifiable’, and (2) AI is absolutely crushing domains that have a tight verification loop (coding, math, etc.) because you can RL on these loops.
But the history of human science shows that the verification loop for theories can be on the order of decades and centuries, and even then experiments do not definitively rule out alternatives: the ancient Greeks dismissed Aristarchus (3rd century BC) on heliocentrism because it would imply stellar parallax. The first successful measurement of stellar parallax came only in 1838, achieved by Friedrich Wilhelm Bessel.
What we know today as the better theory can often actually make worse predictions: it’s well known that Copernicus’s model of circular orbits around the sun was less accurate than Ptolemy’s geocentric model, which had accumulated millennia of correcting epicycles. What is not well known is that Copernicus’s theory wasn’t even simpler. Ptolemy’s model captured the true elliptical nature of orbits using the equant trick, in which a planet moves uniformly not around Earth itself but around an off-center point. Copernicus didn’t like this, because it violated his Platonic heuristics, so he discarded the equant trick, which led to a less parsimonious model: Copernicus had to add more epicycles and epicyclets to make up for it.
So in what sense was it a better theory in 1543? In some sense, it wasn’t! You couldn’t have known ex ante that heliocentrism married with Kepler’s three laws (1609–1619) is a much cleaner and more accurate theory, or that there’s a very beautiful unification of heliocentric orbits and terrestrial gravity (Newton, 1687).
There was one ex ante reason you should have preferred Copernicus in 1543: his theory produced retrograde motion1 as a natural consequence, whereas for Ptolemy it was an ad hoc addition. Even more impressively, his 1543 theory actually predicted the phases of Venus2 before Galileo observed them in 1610. But both of these things were also implied by Brahe’s model, which had the sun orbit the earth and all the other planets orbit the sun.
Under a naive falsificationist framework, you’d have to wait until stellar parallax was observed in 1838 to know that Brahe was wrong. But obviously the scientific community was able to make progress faster than this. There is some mixture of judgment and heuristics in the progress of science that we don’t even understand well enough to articulate, much less codify into an RL loop.
Or consider the case of the discovery of Neptune in 1846. Uranus deviated from its predicted Newtonian path. Le Verrier predicted that an unknown perturbing planet must exist, calculated its mass and orbit, and Neptune was found almost exactly where predicted.
But the Neptune story is symmetric to a failure case. Mercury had an anomalous precession: its orbital ellipse rotates 43 arcseconds more per century than Newtonian mechanics implies from the pull of the other planets. This led astronomers to speculate that there was an unknown planet, Vulcan, within Mercury’s orbit. The anomaly was only resolved in 1915 with Einstein’s General Relativity.
A proper Newtonian would still proceed with the research agenda, but modify it as follows. First, you predict some unknown planet. If it can’t be found, you say it’s so small, it must require a bigger telescope, and you build a bigger telescope. And if you still can’t find it, maybe there’s a cloud of cosmic dust occluding it. If still not found, maybe the satellite’s instruments are being screwed by some unknown magnetic field, and you send a new satellite. At each of these steps, had you discovered a new planet, or some unknown cosmic dust, or some new magnetic field, that would have been a sensational victory for Newtonians.
Ex ante, this is not unreasonable to do! It is only after decades or maybe centuries of patchwork that we can ask: are we simply adding epicycles, or is this theoretical framework progressive, in that it makes predictions we wouldn’t otherwise have been able to make?
What do these examples illustrate? That ex ante it is almost impossible to determine which research programs are progressive (will predict and explain unanticipated new phenomena) and which are regressive (need to be contorted repeatedly to accommodate seemingly disconfirming new phenomena).
But the verification loop is often extremely long and weirdly hostile, and even then, experiments do not definitively rule out alternatives (see the discussion in the Nielsen episode about how physicists contemporaneous with the 1880s Michelson-Morley experiments thought it simply ruled out one particular theory of the ether. Only Einstein made the full conceptual leap of discarding the ether altogether).
This means that big conceptual breakthroughs cannot be easily verified. They are recognized decades or centuries later, when it turns out they were much more productive than the alternatives available. What this means for AI for science is that 1. You can’t easily train an RL loop for big conceptual breakthroughs.
And 2. the society of AI scientists will still need individual AI instances that have idiosyncratic biases and heuristics, and that pursue them unrelentingly for decades on end - for example, like Einstein’s insistence that there shouldn’t be some arbitrary privileged inertial reference frame. There should be dedicated people who keep a bunch of dormant research agendas alive in case they turn out to be productive upon further investigation.

To understand the kind of intransigent dedication to hypotheses that is needed to preserve correct scientific ideas - even in the face of disconfirming evidence - consider the following story: In 1815, Prout hypothesized that the atomic weights of all pure chemical elements are whole numbers, because experimentally, most elements seemed to come out that way. But there were many anomalies - for example, chlorine’s atomic weight is measured at 35.5. So Prout’s school claimed that maybe the chemical substances in which these elements appeared were impure. But there seemed to be no chemical reaction that could get rid of the impurities. Then they said, maybe the weights are fractions of whole atomic weights - but the closer you measure, the less natural the fractions get: chlorine goes from 35.5 to 35.46. It took almost a century for people to realize that these measurements reflect multiple isotopes of the same element, which can be separated physically but have no distinguishing chemical characteristics.
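The resolution of the chlorine anomaly is a tidy piece of arithmetic in hindsight. Using modern isotope masses and natural abundances (which 19th-century chemists had no way of measuring), the “unnatural” atomic weight falls right out as an abundance-weighted average:

```python
# Prout's chlorine anomaly, resolved: the measured atomic weight (~35.45)
# is just the abundance-weighted average of two chemically identical isotopes.
# Modern values: mass in atomic mass units -> natural abundance.
isotopes = {
    34.969: 0.7576,  # Cl-35
    36.966: 0.2424,  # Cl-37
}
atomic_weight = sum(mass * abundance for mass, abundance in isotopes.items())
print(round(atomic_weight, 2))  # → 35.45
```

Neither isotope has a whole-number weight either (binding energy shifts them slightly off the integers), but Prout’s core intuition, that elements are built from a common unit, turned out to be basically right.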
What I’m trying to say is that ex ante, one couldn’t have known which research program would be more productive. We need to invest in all of them concurrently. But that investment looks like a bunch of different individual scientists being super unreasonable and obstinate about propping up their preferred research agenda.
What does the parallel discovery of a deep idea like Darwinism tell us?
The Origin of Species was published in 1859. Newton’s Principia was published in 1687, nearly two centuries earlier. Conceptually, natural selection seems much simpler than the theory of gravity. A contemporary of Darwin’s, Thomas Huxley, read the Origin of Species and said, “How extremely stupid not to have thought of that!” Nobody ever said the same about not beating Newton to the Principia. I wonder if the reason is that, while Darwin’s theory is conceptually simpler, it cannot be decisively tested. The evidence is circumstantial, retrospective, and cumulative. There’s no equivalent of Newton running the numbers on the moon’s orbital period and radius and confirming that they correspond to his equations.
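Newton’s moon test is worth spelling out, because it shows how decisive that kind of check can be. If gravity falls off as 1/r², then the moon, at roughly 60 Earth radii, should accelerate toward Earth at about g/60². A rough sketch using modern values (not Newton’s 17th-century figures):

```python
import math

# Newton's "moon test": the moon's centripetal acceleration, computed purely
# from its orbital geometry, should match surface gravity g scaled down by
# the inverse square of its distance in Earth radii (~60).
earth_radius = 6.371e6           # m
moon_distance = 3.844e8          # m, mean Earth-moon distance
period = 27.32 * 24 * 3600       # s, sidereal month
g = 9.81                         # m/s^2, surface gravity

a_centripetal = 4 * math.pi**2 * moon_distance / period**2
a_predicted = g / (moon_distance / earth_radius)**2

print(f"{a_centripetal:.2e}")  # ≈ 2.72e-03 m/s^2 (from orbit)
print(f"{a_predicted:.2e}")    # ≈ 2.69e-03 m/s^2 (from inverse-square law)
```

The two numbers agree to about 1%, from completely independent measurements. Darwin had nothing like this: no single observation where the theory sticks its neck out quantitatively and survives.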
Also you need this concept of deep time. Charles Lyell published the Principles of Geology in 1830, which gave Darwin the vast stretches of time that natural selection needed. And the fact that Darwin and Wallace basically arrived at evolution at the same time3 (and both credited Lyell’s contribution) does suggest that these underrated intellectual footholds were quite important (geology, paleontology of ancient extinct species which showed intermediate species (in some cases between apes and humans), biogeography from voyages and age of colonization, more sophisticated artificial selection like pigeon breeding). It’s interesting that an idea whose essence must have been obvious to herders and parents for thousands of years actually required many millennia of ancillary intuition pumps to fully spell out.
The pattern of parallel discovery in science and technology is very interesting, and seems to contradict the vibe that certain innovations could have happened much earlier than they really did.
1. Where Mars appears to slow down and reverse direction as Earth overtakes it on its faster inner orbit.
2. Since Venus’s orbit is inside Earth’s, you should see it fully dark when it’s between Earth and Sun, crescent partway through, and fully lit when it’s on the other side of the sun, aka when it appears smallest.
3. Their work was presented jointly at the Linnean Society in 1858.

