What I've been reading recently - Jan 10, 2026
Nonlinear Dynamics and Chaos, Machines of Loving Grace, Max Hodak’s theory of consciousness, Neural network training makes beautiful fractals
I was recently chatting with a friend who has a similar job to mine. We were talking about how even though our jobs are fundamentally about learning about stuff, our time so easily gets sucked up by other things. So to hold myself accountable, I’m gonna try to publish a blog post every two weeks or so where I explain what I’ve been reading.
Max Hodak’s theory of consciousness
I’m totally gonna butcher this - please excuse. If you wanna get the real deal, go check out his summary blog post and his full talk on this topic.
Max is focused on two big sub-questions which together form “the binding problem”:
Mode binding: how do color, shape, texture, and motion get combined into a unified visual percept of “a red cup”?
Moment binding: why do we experience all the neurons firing across our entire brain over the course of tens of milliseconds as a single quantum of experience?
Max thinks each of these binding sub-problems is related to a brain wave:
Gamma waves - 40 Hz - Fast, local coordination of nearby neurons to get on the same page about what they’re representing.
Alpha waves - 10 Hz - Slower waves that run through the whole brain and unify experience - think of these like the forward pass of the brain.
Two cool things about alpha waves I hadn’t realized: 1. neurons ride the peak of this oscillation; 2. when alpha waves slow down or speed up (fight-or-flight reactions, etc.), people experience time dilation.
Anyways, Max points out that the brain is physically storing a bunch of structured representations of the world, and some feedback controller has to go in and make sure that these representations stay correct. This is part of what the alpha waves are doing. And this feedback control and binding is consciousness. I’m glossing over a bunch of logical connections that I definitely don’t understand. But I’ll leave it here.
I know Max could provide a really good answer, but just talking to myself, I’m confused about why we should think that feedback control = consciousness. By this logic, does memory refresh = consciousness too?
Max thinks that figuring out what’s up with consciousness will mean discovering new physics. And specifically, physics at the level of the 4 fundamental forces - some property as basic as mass or charge. His logic is that either consciousness has no real impact on the world (it’s just a byproduct of other stuff the brain does), which would be odd, or it actually has an effect, which would mean it’s new physics.
I’m not sure I buy this. 1. Can’t it be an effect that’s best understood as an implication of existing laws of physics? The fact that wood floats on water has an impact on the world, but you don’t need new physics to explain it. 2. Doesn’t it seem implausible that evolution blindly stumbled upon, and is now making good use of, a whole undiscovered physical field which we have never managed to interact with using our technology, nor seen summoned anywhere else in the universe?
Nonlinear Dynamics and Chaos by Steven Strogatz
I’m only 3 chapters in, so I’ve only got the building blocks so far. The fundamental idea is this: it’s often hard to anticipate how a system will evolve by tracking a bunch of individual trajectories over time, but it’s much easier to see what will happen if you plot how the system evolves from every possible starting point. The examples get more and more interesting, and because Strogatz focuses on the graphical and geometric interpretations, the motivating problems are super satisfying; the book reads like a bunch of 3Blue1Brown videos on one topic stapled together.
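One of the first examples in the book is the one-dimensional flow ẋ = sin x. Here’s a minimal matplotlib sketch (mine, not the book’s) of the picture Strogatz draws for it: plot f(x), and the zero crossings are the fixed points, with stability read off from the sign of f on either side.

```python
import numpy as np
import matplotlib.pyplot as plt

# One-dimensional flow: dx/dt = f(x), with Strogatz's example f(x) = sin(x).
f = np.sin

x = np.linspace(-2 * np.pi, 2 * np.pi, 1000)
plt.plot(x, f(x), label="dx/dt = sin(x)")
plt.axhline(0, color="gray", lw=0.5)

# Fixed points sit where f(x) = 0, i.e. x = k*pi.
for xf in np.arange(-2, 3) * np.pi:
    # Stable if f'(x) < 0 there (the flow pushes back toward the point).
    stable = np.cos(xf) < 0
    plt.plot(xf, 0, "o", mfc="k" if stable else "w", mec="k")

plt.xlabel("x")
plt.ylabel("dx/dt")
plt.legend()
plt.show()
```

Wherever the curve sits above the axis, x flows right; below, it flows left. You can read the qualitative fate of every starting point off one plot, which is exactly the trick the book keeps scaling up.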
Side note: I could not have understood anything here if I didn’t have LLMs and couldn’t watch the lectures async. I paused every minute or so (to clarify some confusion with a chatbot or to try and anticipate the next step), and I had the same section of textbook open at the same time.
I’m now wondering to myself, “How the hell did I learn anything in college at all?” I would be so lost if I were actually taking this course in college and just attending the lectures live.
In college, I actually did bounce out of a difficult course that I feel like I could totally learn today with LLMs, async lectures, and my adult executive function.
As I was working through these examples (some inspired by actual papers), I kept thinking about what parts the “automated cleverness” (Terry Tao’s term) of today’s AIs could actually help with.
It’s crazy how much understanding you can get about a physical system through mathematics. But that understanding is so dependent on insight and interpretation.
To give one example, Section 3.7 has a really clever model of an insect outbreak, showing how budworms, birds, and trees play out against each other given different growth rates and other dynamics.
But first you have to figure out the right dimensionless form. And that requires judgment about which dimensions actually matter. In the insect model, the choice was to think in terms of the growth rate R and carrying capacity K, with the bird predation absorbed into the scales of the dimensionless variables. But you could have nondimensionalized the other way around, from the basis of the birds.
Then there’s the question of how you visualize it. Once you’ve got the dynamics in dimensionless form, you could just graph the full equation and find the fixed points. But the result would be almost impossible to interpret. Graph it a different way, though, and suddenly the intersections align with your intuition. You can actually see the three regimes: where carrying capacity is so low the population never gets going, where birds keep things in check, and where the outbreak has outgrown the birds’ ability to control it.
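Here’s a rough sketch of that second graph, reconstructed from memory of Section 3.7 rather than copied from it, with made-up r and k values chosen to land in each regime. The dimensionless budworm dynamics are dx/dt = r*x*(1 - x/k) - x^2/(1 + x^2); after dividing out the trivial x = 0 state, the nonzero fixed points are the intersections of the line r*(1 - x/k) with the fixed hump x/(1 + x^2):

```python
import numpy as np
import matplotlib.pyplot as plt

# Nonzero fixed points of the dimensionless budworm model solve
#   r * (1 - x/k) = x / (1 + x**2)
# so they appear as intersections of a straight line with a fixed curve.
x = np.linspace(0.01, 12, 500)
plt.plot(x, x / (1 + x**2), "k", lw=2, label="predation: x / (1 + x^2)")

# Illustrative (made-up) parameter values, one per regime:
regimes = [(0.20, 4,  "one low fixed point: birds keep things in check"),
           (0.45, 10, "three fixed points: bistable"),
           (0.60, 12, "one high fixed point: outbreak")]
for r, k, name in regimes:
    plt.plot(x, r * (1 - x / k), label=f"r={r}, k={k}: {name}")

plt.ylim(0, 0.7)
plt.xlabel("x (dimensionless budworm density)")
plt.legend(fontsize=8)
plt.show()
```

The predation hump never moves; only the line does. That’s what makes the picture interpretable: sliding r and k around tells you immediately how many fixed points exist and roughly where they sit.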
This kind of insight is inseparable from understanding what you’re even trying to learn about the system. And I’m skeptical today’s AI helps much here. When these methods were first developed, the right forms and interpretations weren’t obvious. The mathematician who wrote the original paper had to come up with new insights about how to think about the problem.
Maybe models are now good enough to apply these methods to new systems that fit the same template. But that just means the few mathematicians who invent genuinely new frameworks are the only ones who stay relevant.
Machines of Loving Grace by Dario Amodei
Starting with the biology section: Dario argues that we’ll get a century of bio progress in a few years. His argument:
Most bio progress is driven by breakthrough discoveries which give you whole new primitives for what you can measure, change, or predict (CAR-T therapy, mRNA vaccines, CRISPR, genome sequencing costs declining so much, etc).
These discoveries seem to have been made in scrappy, haphazard ways, often years after they were first possible, and often by people responsible for other breakthroughs as well. All three of these observations hint that the discoveries are bottlenecked by intelligence.
Dario acknowledges that data is a huge bottleneck for bio. But the tools we have for collecting data can also be expanded by intelligence. Human researchers came up with multiplexing and AlphaFold and Perturb-Seq - the AI researchers will come up with even more.
But even the human researchers’ breakthroughs haven’t had a huge impact on health. As I asked George Church when I interviewed him: over the last 3 decades, we’ve seen a million-fold reduction in genome sequencing costs, a 1000-fold decrease in DNA synthesis costs, the development of precise gene editing tools like CRISPR, and the ability to conduct massively parallel experiments through multiplexing techniques. But it doesn’t seem like we’re curing diseases or coming up with new treatments at a faster rate now than we were 30 years ago. If anything, drug development is slowing down. Whereas with Moore’s Law - look, here’s my iPhone. What explains the difference?
Relatedly, Jacob Trefethen has an excellent blog post making the argument that AI won’t speed up medical progress that much (he also steelmans the opposite point in this other post). Jacob points out that making a drug to cure something like Alzheimer’s is really hard. Raw understanding of some of the disease life cycle (which more intelligence could give you more of) is not enough. We understand that Alzheimer’s is clearly linked to amyloid beta, and there are now many different drugs that try to remove amyloid plaques, none of which have worked. Even if we get more insights like the amyloid beta one from AI scientists, that alone will not be enough to identify the correct targets. You just have to do a bunch of experiments on live humans.
This is why Dario’s point about clinical trials falls flat. He argues that clinical trials are currently slow because we just don’t know whether a given drug will actually work. But if we had much greater confidence, like we did with the mRNA vaccines for COVID, then we could test and approve drugs much faster. However, I don’t see why we should think that, short of a full hyperrealistic simulation of the human body, we could tell ex ante which drugs are gonna work. I don’t yet buy the argument that a million George Church clones in a datacenter could derisk all the drug trials.
Quick notes on other parts of the essay:
Overall I find it pretty impressive that a tech CEO is this generally thoughtful.
The poverty and econ section doesn’t address the fact that the main mechanism of catch-up growth goes away post-AGI. Developing countries have lots of underutilized labor which is bottlenecking production, and because the marginal product of labor is high in the world today, those countries can get rich fast by putting that labor to work. Once AGI makes labor abundant, that channel disappears. So how exactly are these other countries catching up?
The key point that underlies his framework that intelligence can drive a century of progress in 5-10 years: “Things that are hard constraints in the short run may become more malleable to intelligence in the long run. For example, intelligence might be used to develop a new experimental paradigm that allows us to learn in vitro what used to require live animal experiments, or to build the tools needed to collect new data (e.g. the bigger particle accelerator), or to (within ethical limits) find ways around human-based constraints (e.g. helping to improve the clinical trial system, helping to create new jurisdictions where clinical trials have less bureaucracy, or improving the science itself to make human clinical trials less necessary or cheaper).”
It’s interesting to consider why this isn’t true for factors of production today. We live in a (relatively) capital-abundant and labor-scarce world. That is reflected in the labor share of income being roughly twice the capital share. But this has been true for centuries upon centuries. Contra Piketty in “Capital in the 21st Century”, all these capital holders have not been able to get some runaway capital accumulation process going by figuring out a way around labor constraints. Why think that intelligence will be any different from capital in its ability to get around other factors of production? Maybe the argument is that intelligence can actually help generate the other factors of production in a way that capital can’t.
Neural network training makes beautiful fractals by Jascha Sohl-Dickstein
Absolutely fascinating blog post.
You want to train your model at the highest learning rate at which it still converges. But the boundary between convergence and divergence is fractal, which makes these hyperparameters really hard to optimize via gradient descent.
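Jascha’s actual experiments are much more careful, but here’s a toy version of the kind of map he computes (my own simplification, with made-up sizes and thresholds, not his code): give each layer of a tiny network its own learning rate, train from a fixed init over a grid of learning-rate pairs, and record which runs blow up. The boundary of a map like this is where the fractal shows up.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 8))          # fixed toy dataset
y = rng.normal(size=(16, 1))
W1_init = rng.normal(size=(8, 8)) / np.sqrt(8)   # fixed init, reused everywhere
W2_init = rng.normal(size=(8, 1)) / np.sqrt(8)

def diverged(lr1, lr2, steps=150):
    """Full-batch GD on a 2-layer tanh net, one learning rate per layer."""
    W1, W2 = W1_init.copy(), W2_init.copy()
    for _ in range(steps):
        h = np.tanh(X @ W1)
        err = h @ W2 - y                        # residuals
        g2 = h.T @ err                          # grad wrt W2
        g1 = X.T @ ((err @ W2.T) * (1 - h**2))  # grad wrt W1 (through tanh)
        W1 -= lr1 * g1
        W2 -= lr2 * g2
        if not np.isfinite(W1).all() or np.abs(W1).max() > 1e6:
            return True                          # treat blow-up as divergence
    return False

# Convergence map over a grid of learning-rate pairs (coarse; shrink for speed).
lrs = np.linspace(0.001, 0.6, 100)
cmap = np.array([[diverged(a, b) for b in lrs] for a in lrs])
# plt.imshow(cmap) shows a ragged converge/diverge border; in Jascha's
# higher-resolution experiments that border turns out to be fractal.
```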
Now you can ask: evolution somehow found the right hyperparameters to train our brains. How did it solve this wicked problem? Presumably gradient-free optimization fares better against these kinds of fractal landscapes: if you optimize for a region where the average speed of convergence is high (rather than taking the gradient at a specific point, which the fractal boundary makes unpredictable), it seems like you could do much better.
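Here’s a hedged sketch of that intuition, entirely my own toy rather than anything from the post: score each candidate learning rate by how well gradient descent converges on a bumpy 1-D loss, average that score over a sampled neighborhood of learning rates, and hill-climb on the average. Pointwise gradients are useless on a ragged landscape, but the neighborhood average, which is roughly what a population-based or evolutionary search sees, is well-behaved.

```python
import numpy as np

rng = np.random.default_rng(1)

def score(lr, steps=50):
    # Pointwise "speed of convergence": run GD on the bumpy 1-D loss
    # f(w) = w**2 + 0.3*sin(20*w) and reward ending near the origin.
    w = 1.0
    for _ in range(steps):
        w -= lr * (2 * w + 6 * np.cos(20 * w))   # gradient of f
        if abs(w) > 1e6:
            return -np.inf                        # diverged
    return -np.log(abs(w) + 1e-12)

def smoothed(lr, width=0.1, samples=32):
    # Average the score over a log-normal neighborhood of learning rates:
    # the quantity a population of slightly-different "genomes" would see.
    return np.mean([score(lr * np.exp(rng.normal(0, width)))
                    for _ in range(samples)])

# A (1+1)-style evolutionary hill climb on the smoothed landscape.
lr = 0.01
for _ in range(40):
    cand = lr * np.exp(rng.normal(0, 0.3))
    if smoothed(cand) >= smoothed(lr):
        lr = cand
print("selected learning rate:", lr)
```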
Backing up, why is the meta-loss landscape fractal in the first place? Jascha’s explanation is that fractals often emerge when iteratively applying a function, and gradient descent on the parameters is one such function that you iterate across training steps. But then the follow-up question is this: there are lots of other iterated functions you could think of, even within the context of neural networks. Do they all lead to fractals? For example:
In chain of thought, you apply a model to a string, which makes a new string, to which you apply the model, etc.
RNNs keep applying the same parameters to the hidden state.
In conversation, an AI researcher friend mentioned that CoT and RNNs both have variance problems that could well be explained by these fractal-like dynamics, though I only understand this claim at a hand-wavy level.
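For intuition on the “iteration makes fractals” point, a standard example from numerical analysis (mine, not from Jascha’s post): iterate Newton’s method for z^3 = 1 over a grid of complex starting points and color each start by the root it lands on. The function being iterated is perfectly smooth, but the basin boundaries are fractal.

```python
import numpy as np

# Newton's method for f(z) = z**3 - 1, applied to a whole grid of complex starts.
n = 400
re, im = np.meshgrid(np.linspace(-1.5, 1.5, n), np.linspace(-1.5, 1.5, n))
z = re + 1j * im

for _ in range(40):
    z = z - (z**3 - 1) / (3 * z**2)   # one Newton step, everywhere at once

# Color each starting point by the cube root of 1 it converged to.
roots = np.exp(2j * np.pi * np.arange(3) / 3)
basin = np.argmin(np.abs(z[..., None] - roots), axis=-1)
# plt.imshow(basin) shows three basins whose shared boundary is fractal:
# zoom anywhere on the boundary and the same filigree reappears.
```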