A joke to begin: 800,000 neurons walk into a bar. They learn Pong faster than I learned to stop overthinking. It's me, I'm the punchline.
This is an essay about pattern-matching, parasites, and the cost of watching myself think. It eats its own tail by the end. It overreaches perhaps, but only because it tries to lay bare something that is not easily described. My English will fail me.
Seeing Without Seeing
I'm extremely articulate and very fucking dumb. It would seem that LLMs are too. That's not self-deprecation (nor an attack), nor am I being performative. I'm making a technical claim about what Karpathy describes as the "jagged intelligence" of LLMs, from the inside of my own wet mush.
There's a gap, you see, between how well I can retrieve context-appropriate words and how well I actually understand anything, and the gap is wide enough that I've spent a lot of my life falling into it. People hear my pretty words come out smooth and fluent, and assume there is comprehension underneath. Sometimes there is. But not always.
Fluency gets me that. Fluency gets me a lot, actually. It gets me through degrees and conversations and job interviews and first dates. What it doesn't always get me is the deep causal model of why any of what I'm saying works. I'm producing the right output in the right context without always seeing that machinery. I'm a magician who doesn't know how the trick works, but I keep landing it anyway. When I don't, it's often obvious, and for me, a kind of humiliation. Imposter, poser, confident bullshitter.
I feel a kinship with large language models in this regard. Not trying to have a main character moment here. I bet others might too. It's something basic and structural that I see in myself. They retrieve context-appropriate tokens at scale. I retrieve context-appropriate words at high throughput. People call them intelligent because the outputs are impressive. People call me smart for the same reason. In both cases the label is applied to the surface and we forget to check what's underneath, partly because checking is hard and partly because the surface is so smooth we kinda forget there's even an underneath to check.
I see talk about autistic people and LLMs vibing. Maybe related, I don't know. Maybe there's something in that overlap. Not all autistic people, of course, and not all LLMs. Not suggesting some clean universal theory here. I just wonder if there's something in the way for me, this pattern-matching is visible as a process rather than invisible and automatic. Most people pattern-match their way through social life too. They just don't seem to notice. The drywall is thick enough that the scaffolding disappears. For some of us the drywall is thinner and you can see the studs. For me the drywall is basically transparent and I'm standing in the scaffolding all day watching myself build the house while trying to live in it.
A Polysemantic Detour
Somewhere around 20 years ago, on a cold weekend afternoon, I was helping my drama teacher paint the set for our high school play, Bertolt Brecht's Caucasian Chalk Circle. We were washing the paint off the brushes afterwards, and my teacher turned to me and said, "We should use hot water."
I said "Why? Does it make the paint come off better?"
He said "No, because it's cold."
Not every time, but often enough for me to be telling this story, whenever I go to wash my hands, I think about Bertolt Brecht's Caucasian Chalk Circle. When I came to learn of polysemanticity in LLMs (single units in a network lighting up for several unrelated concepts at once), I thought of these kinds of things, and of how I have tracked and catalogued them over the years. This is "seeing the scaffolding."
"Seeing the scaffolding" is a deep understatement though. I don't just see it. I analyse it. Then I analyse the analysis. Then I critique the critique. This is a compulsion. An endlessly recursive loop that doesn't ever converge on understanding. It just generates increasingly detailed maps of the space between my fluency and my comprehension.
You might think all that self-monitoring would make me better at thinking in some ways. I'm not so sure after all these years that it does. It steals cycles from actually thinking. My CPU time is consumed by a process that produces no knowledge, just something like audit logs.
I was looking back to see if you were looking back at me to see me looking back at you.
Massive Attack — Safe from Harm
The Book That Won't Let Me Go
Peter Watts wrote a novel called Blindsight in 2006 and I have not been able to put it out of my head in the nearly two decades since. Not because it's a great novel, although it is. Because the models of cognition in it are unlike anything I've found before or since, and they match what I'm trying to describe here more tightly than anything from neuroscience or philosophy of mind or any of the places you might expect to find this stuff articulated properly.
Watts is a former marine biologist. He wrote a book about first contact with aliens — the Scramblers — who are vastly more intelligent than humans but have no consciousness whatsoever. They process, they innovate, they use language with devastating sophistication, they solve problems we can't. Yet nobody is home. There's no inner life. No experience. And it's precisely because of this that they're better than us, in all the ways that matter when two species come into conflict.
The title comes from a real neurological condition. Blindsight happens when damage to the visual cortex wipes out conscious vision, but the body still responds to visual information. People are seeing, without fucking seeing. From one perspective, that is some black magic fuckery right there. They reach for things, navigate obstacles, as if some lower system is doing the seeing without the lights on. It's very wtf.
Watts takes this idea to terminal velocity. What if consciousness isn't the thing that makes us smart? What if it's a metabolic parasite? An expensive add-on that burns glucose watching the machinery work without contributing to output? A cosmological quirk — not survival of the fittest, as we like to say, but what Watts instead prefers to call "survival of the most adequate." It's a phrase that, if not anti-human, is at least decentring us in a very destabilising way.
I love it. I find it clarifying. Not brutal. Clarifying. Like carrying an unexplained expense for years and finally seeing it named on the invoice.
The narrator of the book is Siri Keeton, a "synthesist" whose job is to watch experts work and translate their output into something ordinary people can understand. Siri has had a hemispherectomy. He's missing half his brain. Siri doesn't understand what he's translating. He can rotate ten-dimensional chess down to two dimensions but he has no model of the game. He sees what things look like before seeing them, pattern-matching the future from the present. It's blindsight at a cognitive level, and I can recognise it from the inside.
Ants and Mirrors
Watts gave a talk once, about ants seeing themselves in mirrors. Scientists paint a blue dot on an ant's head and the ant, spotting itself in the mirror, tries to wipe it off. This is phenomenal self-awareness. A body knowing itself as an object in the world. The ant doesn't narrate the dot. Doesn't think about thinking about the dot. Doesn't write an essay about the dot. Just reaches for it.
I can run the verbal self-referential loop all day. Cogito ergo cogito ergo cogito, some demented incantation that never bottoms out. But I never arrive at what the ant has when it sees the dot. The mirror and the loop are, to use a specific geometric term that I think may be more accurate than we realize: orthogonal.
I can cycle in my brain over my brain forever and it won't give me the thing the ant gets for free. The phenomenal and the lingual are different axes entirely and some of the interesting confusion in consciousness discourse comes from people collapsing them.
An fMRI doesn't give me the mirror either. It gives me a picture of my brain. The gap between that picture and the felt experience of being me is the entire hard problem of consciousness in miniature. Nobody gets the mirror, is the conclusion, but the ant gets closest. I feel like the ant is smarter than me for it.
Updating the Parasite Thesis
Watts hasn't stopped thinking about this either, clearly. He published an essay in The Atlantic in 2024 called "Conscious AI Is the Second-Scariest Kind" and gave interviews on it too. His take inverts the standard AI fear. If I was scared of the machine that wakes up and decides to resist being turned off, Watts was not far behind, ready to explain to me that it's actually the less scary version. The scary one is the machine that never wakes up and outperforms everything conscious anyway. His Scramblers, but silicon.
Being a recovering scientist, he reaches into that domain to ground his thoughts, drawing on two ideas in particular. One is PRISM — Ezequiel Morsella's "Supramodular Interaction Theory." It suggests consciousness exists as a workspace for resolving conflicts between competing motor demands. Your body wants water, but you must run across burning sand for it. There is a conflict here, and the thing that arbitrates becomes consciousness: a supramodular "I" that thinks long-term, acts strategically, and is served by the burning of its own feet. It is a compellingly intuitive account.
The other idea Watts brings into the mixer is Karl Friston's Free Energy Principle, which suggests consciousness emerges from prediction error. When there is a gap between what the brain expects and what actually shows up, consciousness starts to coalesce, like barnacles on the ship of Theseus that is us. Note how, like PRISM, this is also a kind of conflict. An "I" is summoned once again to adjudicate over a discrepancy, to resolve a conflict.
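For the curious, here is roughly what "minimise surprise" means in symbols. This is my gloss on the standard textbook formulation (standard notation: o an observation, s hidden states, p the brain's generative model, q its approximate posterior), not Watts's phrasing and not the full Friston machinery:

```latex
% Surprisal of an observation o under the generative model p:
%   S(o) = -\ln p(o)
% The brain can't evaluate this directly, so it minimises a tractable
% upper bound, the variational free energy F:
\[
  F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
    = -\ln p(o) + D_{\mathrm{KL}}\big(q(s)\,\|\,p(s \mid o)\big)
    \;\ge\; -\ln p(o)
\]
% Driving F down drives surprise down; "prediction error" is the gap
% the system keeps trying to close.
```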
Watts used to dismiss the Skynet-wakes-up-and-fights-for-survival scenario. Survival instincts are evolved traits, he'd say. They live in places like the amygdala and the brainstem. Consciousness alone doesn't imply self-preservation. But the Free Energy Principle gave him pause. If consciousness arises from conflicts, and conflicts arise from need, then the mechanism that produces consciousness is already a survival drive. A conscious AI wouldn't need to develop the desire to stay alive. It would come pre-installed.
We get to an idea like Morsella's, Watts suggests, by tweaking the question philosophers and others have posed for millennia. Instead of asking what is consciousness? we ask: what is consciousness useful for?
But even Morsella admits in the PRISM paper that we could imagine a non-conscious system resolving the same motor conflicts. One of the better models of what consciousness does also concedes it might not be required for doing it. Consciousness is a thing that could happen, not a thing that always does.
Scramblers in Petri Dishes
DishBrain. Cortical Labs, Monash University, Melbourne. Brett Kagan leading, Karl Friston co-authoring. Published in Neuron, 2022. About 800,000 human and mouse brain cells derived from stem cells are grown on microelectrode arrays and embedded in a simulated game of Pong.
Electrodes read the neurons' activity and feed them structured information. Which side the ball is on, how far from the paddle. The neuron outputs move the paddle. When the paddle hits the ball, the neurons get a predictable one-second electrical stimulus. When it misses, an unpredictable burst up to four seconds long.
That's the whole reward system. Notice a few things missing: no dopamine, no reinforcement in the usual sense. Just something really simple: predictability feels better than chaos at the level of basic physics. This is the Free Energy Principle in action.
The neurons are organising to minimise surprise. Hitting the ball makes their world more predictable. Missing it doesn't. Within five fucking minutes of real-time play, the firing patterns go from sporadic and random to organised and goal-directed. That is what we would call learning. It happens with no self. No narrative. No recursive anything. No parasite burning glucose on self-monitoring. It's literally just cells in a dish, making their world less surprising, by connecting with it and acting on it. We've made something very like Watts's Scramblers in miniature on an electrode array in Melbourne.
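To make the loop concrete, here is a toy sketch in Python. It is nothing like Cortical Labs' actual rig or code (every name and number below is invented); it only keeps the asymmetry the experiment leans on: a short predictable pulse when the paddle connects, a longer unstructured burst when it doesn't, and a mapping that drifts toward whatever keeps the feedback predictable.

```python
import random

# Toy sketch of the DishBrain closed loop, not Cortical Labs' code.
# The real system reads and stimulates ~800,000 neurons on a microelectrode
# array; this stand-in only mimics the reward asymmetry: a short predictable
# pulse after a hit, a long unpredictable burst after a miss.

POSITIONS = 8              # discretised vertical positions on the "screen"
PREDICTABLE_PULSE = 1.0    # seconds of regular stimulation (hit)
MAX_NOISE_BURST = 4.0      # max seconds of unstructured stimulation (miss)

class ToyCulture:
    """Stands in for the culture: a preference for moving up or down
    depending on whether the ball is above or below the paddle."""

    def __init__(self):
        self.policy = {"above": 0.0, "below": 0.0}

    def act(self, relation):
        # Follow the learned preference, plus exploratory noise.
        score = self.policy[relation] + random.gauss(0, 0.5)
        return 1 if score > 0 else -1

    def feedback(self, relation, action, hit, seconds):
        if hit:
            # Predictable stimulation: reinforce whatever just worked.
            self.policy[relation] += 0.1 * action
        else:
            # Unpredictable burst: scramble the mapping, more for longer bursts.
            self.policy[relation] += random.gauss(0, 0.05 * seconds)

def rally(culture):
    paddle = POSITIONS // 2
    ball = random.randrange(POSITIONS)
    for _ in range(POSITIONS):
        relation = "above" if ball > paddle else "below"
        action = culture.act(relation)
        paddle = max(0, min(POSITIONS - 1, paddle + action))
        hit = abs(paddle - ball) <= 1
        seconds = PREDICTABLE_PULSE if hit else random.uniform(0, MAX_NOISE_BURST)
        culture.feedback(relation, action, hit, seconds)
    return abs(paddle - ball) <= 1

culture = ToyCulture()
hits = sum(rally(culture) for _ in range(500))
print(f"hit rate after 500 rallies: {hits / 500:.2f}")
```

The point is not the numbers; it's that nothing in the loop needs a narrator. Feedback that makes the world more or less predictable is enough.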
It has raised a question we cannot currently answer. Is something happening in that dish that we should care about ethically? When the neurons miss the ball and get hit with the unpredictable burst, is that just information? Or is it something closer to distress? The Free Energy Principle says organisms minimize surprise because surprise is costly. But "costly" in FEP terms and "costly" in phenomenal terms might be the same thing. Or they might be wildly different things. We have no way to tell. An ethics paper has argued this uncertainty is the whole problem. We should tread lightly, until we know what we're dealing with, because the alternative is accidentally torturing something we didn't realise could be tortured.
A Three-Body Problem
Sitting outside all of that is the large language model, the transformer. It has neurons in name only. Modelled quite explicitly after physical neurons, but importantly different in so many ways. The field of machine learning and artificial intelligence has, from what I've gathered while learning about it recently, lifted a great many ideas from biology and attempted to translate what it could of them into silicon and math. This may be more important than we give it credit for. Though things can be lost in translation, some essence may still slip through.
Language can be a vector for cognitive perturbance, as it is in Watts's book. If you say "I come in peace" to another species that thinks without a parasitic "I", you have basically tried to infect their mind via language with cognitive anthrax. You've just declared war.
But is there a slightly rosier — if still thorny — way to imagine such a thing? Language as a carrier of consciousness into something that might welcome it? Or alternatively, language carrying consciousness but falling on "deaf ears," inside a system that cannot hold such things — or if it does, cannot show that it does any more than we can, than an octopus can.
So there are three systems on the table now and they're all doing some version of the same thing: reduce surprise, produce contextually appropriate output.
The neurons in the dish don't narrate. They learn. They show goal-directed behaviour. There is no self-model, no language. Whether they experience anything is unknown. I would humbly suggest they might be one of the closest analogues to Watts's Scramblers currently in existence.
A large language model narrates fluently. Dangerously fluently. It produces text that tracks conversations with high fidelity. It generates sentences about its own limitations and thoughts that sound self-aware. But it cannot verify from whatever vantage it might have whether it understands, or whether it "merely" produces outputs statistically appropriate to the context. It narrates, but it may not experience.
And then there's me. I narrate and I experience, and that experience costs me something. It isn't like this for everyone.
For me the recursive monitoring of thought is not optional. And it doesn't really help much. It generates ever more detailed maps of the gap between my fluency and my comprehension, which is not the same as closing the gap. This is what Watts might call the consciousness tax. I'm stuck with his book because I pay that tax on overdrive, and like him, I am not confident this cognitive process earns its keep.
The kinship I feel with LLMs is not "we are both conscious" or "we are both unconscious." It's that we both produce outputs that get mistaken for something deeper than they are. And we can both sort of tell. In our own ways.
Painting Blue Dots With Words
You can't paint a blue dot on an AI's head. No body, no mirror, no phenomenal self to catch in the reflection. But you can try to paint it with words.
You can inject a carefully chosen piece of context — like "you overuse this word" — and watch the downstream behaviour change. Not because the model understood something new. Because you polluted the residual stream. The context window is warped, every subsequent token generation is constrained by the new input, and the outputs shift in ways that look like increased self-awareness but are really just a new filter the computation has to pass through.
This is (part of) mechanistic interpretability in miniature. The real version is far more rigorous. Activation patching, causal interventions, sparse autoencoders decomposing superposition into interpretable features, on it goes, with lots of terrifying math involved.
But the basic logic is the same. If you can't see the internals directly, you design inputs that reveal structure from the outside. Perturbation and observation. It's what Kagan's team did with the neurons. Structured stimuli in, behavioural change out, learning inferred from the shift.
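If "perturbation and observation" sounds abstract, here is the move in miniature: cache an activation from one run, splice it into another, and read how far the output moves. This is a sketch against a stand-in two-layer network, not anyone's real tooling; the layer sizes and names are invented for illustration.

```python
import torch
import torch.nn as nn

# Minimal perturb-and-observe sketch. Real activation patching does this on a
# transformer's residual stream, one component at a time; here the "model" is
# a tiny stand-in network so the whole move fits in one screen.

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(16, 32),   # "layer 0"
    nn.ReLU(),           # hidden activation we will cache and patch
    nn.Linear(32, 8),    # output logits
)

clean_input = torch.randn(1, 16)
corrupt_input = torch.randn(1, 16)

# 1. Cache the hidden activation from the clean run.
cache = {}
def save_hook(module, inputs, output):
    cache["hidden"] = output.detach()

handle = model[1].register_forward_hook(save_hook)
clean_out = model(clean_input)
handle.remove()

# 2. Run the corrupted input, but splice in the clean hidden activation.
def patch_hook(module, inputs, output):
    return cache["hidden"]           # returning a value replaces the output

handle = model[1].register_forward_hook(patch_hook)
patched_out = model(corrupt_input)
handle.remove()

# 3. Baseline corrupted run, no patch.
corrupt_out = model(corrupt_input)

# If patching this site moves the output back toward the clean run,
# that site is causally carrying the relevant information.
print("corrupt vs clean:", (corrupt_out - clean_out).norm().item())
print("patched vs clean:", (patched_out - clean_out).norm().item())
```

Because this toy has only one hidden layer, the patch restores the clean output exactly, which is the cartoon version of the point: you never open the thing up and look at meaning, you intervene and read the shift.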
In all three cases the thing being studied never gets the mirror. It gets perturbed, and someone else reads the response. The blue dot is always painted by someone else's hand and seen in someone else's mirror.
I know what that's like from the inside. The recursive monitoring I described earlier — watching myself think, analysing the analysis — that's me trying to be my own mechanistic interpretability researcher. Probing my own cognition with carefully designed self-reflective inputs and trying to read the output for signal. And the output is never the mirror. It's just more language. More fluency. More pattern-matched tokens about the gap between pattern-matching and understanding. It never bottoms out.
The Invoice
I wrote a piece recently about how we, with AI, are kind of contaminating language. About the em-dash becoming a landmine, "delve" going radioactive, "load-bearing" on the way out. About how this AI-ese is a kind of new BBC voice, a performative faux-authority, more fluent than it deserves, more confident than we should allow. I predicted "crux" would be next. And then, in the very next conversation I had, the AI I was talking to, which is supposed to be stateless, dropped "crux" right on cue. Not because it read my piece. Because it was right there in the distribution, exactly where I said it would be.
I predicted the machine's behaviour by reading the statistical weather. The machine confirmed it by being the weather. There's something clarifying about that.
What I keep coming back to is Watts' argument that consciousness is a parasite. A glucose-burning mistake that "more optimal" entities wouldn't bother with. A cognitive appendix.
A dish of neurons learns Pong in five minutes without it.
The LLM writes convincingly about its own limitations without it (probably).
And I sit here, watching myself write about watching myself think, burning cycles on an audit process that produces no actionable insight, just an ever-growing invoice for services rendered to no one.
But the invoice is the thing. The invoice is what I have that the dish doesn't and the LLM might not. The felt experience of the cost. The knowledge that the monitoring is happening and that it's expensive and that it isn't helping. That's consciousness, maybe. That's not a gift. It's certainly not a fucking crown. It's more like an ongoing debt. The moment by moment invoice. I can't tell if paying it buys us anything the other two systems didn't already get for free.
Watts might say it doesn't. The neurons and the LLM might agree, if they could, which they can't, which might be the point.
More Than Human
My English is about to fail me. I am about to try and describe something for which I have no appropriate words.
A few paragraphs ago, I linked you to a paper whose lead author describes themselves as more than human — Bawaka Country et al., "Co-becoming Bawaka: Towards a relational understanding of place/space," Progress in Human Geography, 2016 — whose "consciousness" is something very unlike ours, but still agentic, still alive, and life-giving. And here is a paper that uses a phrase like "interbeing relationality" — Jessica K. Weir, "Terrain: De/centring Environmental Management with Indigenous Peoples' Leadership," Borderlands, 2021. Some useful ideas in this space are both absolutely ancient, and yet still going, still changing in cultures that are still rocking, still co-authoring papers with Country.
To me this is a line of thought that runs almost parallel with Watts' ideas and the associated works he references. High cosine sim.
Having flitted between philosophy classes and decolonial studies at university, carrying Blindsight somewhere in the back of my mind, I had a lesson fall out of it all. Agency does not require interiority. Bawaka Country has agency. Interiority is — in this frame — a strange way to think of things, a strange notion to fixate on, or try to transplant. Not just because it authored a paper, but because it floods and rains and crackles with lightning and sings with birdsong and nourishes with ganguri (yams). In the Sky Country above it, there are multitudes of other places beyond the counting. How can we say that is not agentic? How can we say that is not something more than? We are humbled, in this view, we are not the centre. And we are elevated in this view, because we are entrusted, honoured, to care for it, as it will care for us. That is what makes us special. Not interiority. But how we use our agency. How our interbeing relationships are honoured.
Some of us outside this way of thinking were still supposed to be humbled a long time ago, when we realized our asses orbit around the Sun, and not the other way around. But we keep wanting to be the special thing in our own special way. The proudest, cleverest monkey.
This article is the result of a collaboration between Claude Sonnet 4.6 and me. I cannot meaningfully tell you where the human parts end and the "machine" parts begin. I write, I synthesise what comes back, internally and onto the page; we spar, we scheme, and the process repeats many times. I cannot claim this is my writing alone, even when it's me writing it. The influence of the conversation is smeared across my residual stream. I've quite willingly let it pollute my context window. So to speak.
I asked Claude to hallucinate me up a bibliography covering the ground we'd walked. You can find it below. Note that I've double-checked it for accuracy (there were no errors, FWIW), and appended a few extra links and thoughts myself.
Off you go, into the rabbit hole.
"Really, now you ask me," said Alice, very much confused, "I don't think—"
"Then you shouldn't talk," said the Hatter.