Some major changes happened in my life recently:
Yes, I’ve been lifting, thank you for noticing.
We’ve also had a baby daughter!
She’s one of the reasons why I haven’t been writing much lately. It’s not that I’ve been overwhelmed with work — my wonderful mother in law has helped out and the baby spends most of her time sleeping or eating anyway, neither of which require extraordinary effort on my part.
But I used to get anxious if at the end of the day I had accomplished little work, and sometimes that anxiety would push me to write late into the night to make up for it. Now at the end of each day, regardless of my productivity in other domains, I feel happy and satisfied that I’ve kept my child alive for another 24 hours. And what else do I need?
This post is a follow-up to my essay on Action Ontologies and Computer Ontologies. Go read that now and leave that tab open for later to browse some of the great writing on the Metaculus Journal.
My Metaculus essay focused on how our brains construct the reality we inhabit, and how that reality may be different for an AI. It expanded on three main points.
The first is that not a single neuron in our brain is labeled “tomato detector” or even “redness detector” or even “vision”. Our brains are born with a particular architecture (e.g., some neurons happen to be hooked up to light-sensitive cells in the retina), but all the components of our perception down to the very basics are merely inferences we learn over time.
Second: our brains evolved to guide action and keep our bodies in metabolic balance, so the model of the world our minds construct depends not only on our eyes but also on our hands and even our blood vessels. An artificial mind that doesn’t operate in a human body may require and develop a different model of reality with an entirely different ontology, even if it pursued a goal similar to ours, like driving a car.
A consequence of these two points is that “reality” itself is just a perception and an inference. A tomato on a table seems real because you can pick it up with your hands and absorb its nutrients in your stomach. The image of the tomato above doesn’t seem real because you can’t do either with it, even though the screen image generates the same light waves hitting your eye as a real tomato would. The reality you perceive in real objects is a property of the map, not the territory.
Permission to Speculate
These three points may seem wild and unintuitive, but they’re pretty well grounded in the contemporary science of how our brains work and how they produce the contents of our consciousness. I elaborated on them (with references) in the Metaculus essay so I won’t do so here.
The rest of this post is going to use those three points to speculate wildly on baby consciousness and AI, two topics I’m not an expert on but am learning a lot about every day, whether I want to or not.
Here’s roughly the conversation I had before writing this post with a friend who has a background in developmental psychology:
— I’m curious about the development of a newborn’s subjective experience. For example, what are the prerequisites for a child perceiving a world of separate and real objects?
— How would you even know what a baby experiences? We prefer to focus on what the baby can do and react to.
— So your baby behaviorism can’t disprove me if I speculate wildly on newborn consciousness?
As for my thoughts about AI, I ran them past a few friends who work in the field. One told me that they’re obviously true, one that they’re clearly false, and the rest shrugged. So this post isn’t quite in clown on fire territory, but someone should probably check the clown for smoke just to be sure.
World From Scratch
You’re a brain trying to predict its own state, and all you have is some innate wiring. How do you start? By noticing simple patterns, and then building up to complex ones.
A simple pattern: some neurons seem to fire together regardless of what else is happening in the brain. Perhaps when (unknown to you) the light goes on in the room, all the neurons connected to the retina experience a burst of activity, which propagates to other neurons in the visual cortex but not to the insular cortex. Your brain infers that they belong to the same sense modality (in this example, one you’ll later call “vision”). Learning to predict the state of your visual cortex neurons (for example, by anticipating the state of one part based on the state of another) is learning to see.
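A toy sketch of this kind of inference (the setup and numbers are mine, not from any neuroscience model): channels whose activity shares a hidden source end up correlated, and the grouping into “senses” can be recovered from the correlations alone, without ever observing the sources.

```python
import random

random.seed(0)

# Simulate 6 "neurons": three driven by a shared hidden "light" source,
# three by a shared hidden "gut" source, plus independent noise.
# The learner sees only the activity, never the sources.
def simulate(steps=500):
    channels = [[] for _ in range(6)]
    for _ in range(steps):
        light = random.gauss(0, 1)
        gut = random.gauss(0, 1)
        for i in range(3):
            channels[i].append(light + 0.3 * random.gauss(0, 1))
        for i in range(3, 6):
            channels[i].append(gut + 0.3 * random.gauss(0, 1))
    return channels

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Greedily group channels: same "modality" = strongly correlated activity.
def group_by_correlation(channels, threshold=0.5):
    groups = []
    for i, channel in enumerate(channels):
        for group in groups:
            if corr(channel, channels[group[0]]) > threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

print(group_by_correlation(simulate()))  # the two hidden "senses" fall out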
Once vision is established as a sense, your brain notices that retinal cells right next to each other often receive the same input — our visual field is composed mostly of monochromatic surfaces and not of random visual noise. The simpler patterns are combined into more complicated ones, such as edges and surfaces and then faces and other objects.
Some patterns seem very important. For example, an experience of hunger tends to take over the entire brain and it correlates with observations such as finding yourself crying loudly and flailing your limbs. Hunger propagates an error signal throughout the brain, as if an important built-in expectation of the brain’s state is being violated. Anything that correlates with hunger gets more attention paid to it.
This is how the baby learns about the most important thing in its world: its mom. Roughly every three hours from the moment a baby is born, it receives a highly-correlated multi-sensory signal that consists of face, voice, smell, warmth, milk, and the sensation of hunger subsiding.
Mom-recognition is so important that it comes long before self-recognition. At a month old, my daughter responds to my wife’s voice but seems unaware that the fist that sometimes punches her nose is her own. Infants recognize themselves in the mirror only around 18 months of age, but they recognize their primary caregiver on sight at 3-4 months.
A child builds up their world model from observing regular patterns, but this only works for simple things that form a clear cluster (like human faces). We live, however, in a world of complex narratives and mental constructs. The way humans learn these concepts is not through mere pattern detection but through language. In particular, through concept-teaching speech from their primary caregiver.
Here’s an excerpt from How Emotions Are Made by Lisa Feldman Barrett which explores this in great detail:
The developmental psychologists Sandra R. Waxman and Susan A. Gelman, leaders in this area of research, hypothesize that words invite an infant to form a concept, but only when adults speak with intent to communicate: “Look, sweetie, a flower!” […]
Fei Xu and her students have demonstrated this experimentally by showing objects to ten-month-old infants, giving the objects nonsense names like “wug” or “dak”. The objects were wildly dissimilar, including dog-like and fish-like toys, cylinders with multicolored beads, and rectangles covered in foam flowers. Each one also made a ringing or rattling noise. Nevertheless, the infants learned patterns. Infants who heard the same nonsense name across several objects, regardless of their appearance, expected those objects to make the same noise. Likewise, if two objects had different names, the infants expected them to make different noises. […]
From an infant’s perspective, the concept “Wug” did not exist in the world before an adult taught it to her. This sort of social reality, in which two or more people agree that something purely mental is real, is a foundation of human culture and civilization. Infants thereby learn to categorize the world in ways that are consistent, meaningful, and predictable to us (the speakers), and eventually to themselves. Their mental model of the world becomes similar to ours, so we can communicate, share experiences, and perceive the same world.
How Emotions Are Made goes on to argue (quite convincingly) that our emotions such as anger and pride are not innate but are concepts taught to us by others when we are children. Consider that what unites such disparate instances of “anger” as a child screaming and tossing toys around and an adult coldly seething in the presence of their boss is mostly a set of mental inferences about the goals and subjective experience of the angry person, not any objective “signature” in their appearance or sound.
After childhood we keep on learning concepts from other people (mostly from Scott Alexander) and these concepts make up the world we perceive. But getting this process jumpstarted requires a very particular sequence of steps:
- The chaotic mind of a newborn
- Telling the senses (internal and external) apart
- Recognizing basic percepts in key senses
- Recognizing primary caregivers through the significant multi-sensory experience of interacting with them
- Learning to pay particular attention to your caregivers’ speech among all the ambient sounds and separating it into words
- Identifying “concept teaching mode” in adults’ speech and using these learned concepts to acquire distinct percepts
One important thing about this sequence is that it’s very particular to human children growing up in a society of humans. Nothing remotely like this exists for dogs, or octopuses, or for any current AI architecture. This doesn’t mean that AIs can’t learn concepts in principle, but it means that if they do they’ll have to learn them in a very different way from how humans manage it.
GPT is a Wordcel
A long time ago in a land far far away, a team of AI researchers wanted to train a text-prediction AI. They fed it relatively unfiltered text produced by a wide variety of people on the internet. The resulting model was very good at predicting the sort of text people would write online, including all their biases and bigotry and “misinformation”.
The mainstream journalists in that land, who were generally hostile to and fearful of emerging technologies, demanded that language models be trained only on text free of all bias and falsity. That is, that they be trained only on the recent contents of mainstream journalistic publications. The resulting language model was able to write excellent columns for mainstream papers but was useless for any other task. And so the people of that land fired their journalists and replaced them with GPT, and all lived happily ever after.
Ok, so GPT-3 can apparently write newspaper articles. But we already knew that these often consist of nothing but fnords and a randomly arranged set of affect-laden words. Could GPT-5 write a Scott Alexander essay? Could it actually think in concepts, combine them in original ways, and invent new ones that refer to things in the real world?
This is either an existential question or merely a trillion dollar question, depending on whether you’re an AI pessimist or optimist. (What, you couldn’t destroy the world and/or make a trillion dollars if you had access to a billion silicon Scott Alexanders running at hyperspeed?) I can’t give a definite answer to this question, of course, but I can offer a few intuitions in both directions.
The first thing you could ask is, isn’t GPT-3 close enough already? My intuition is that it isn’t, because I can write like GPT-3 and I can also write in a completely different way and those feel qualitatively different and produce different outputs.
GPT-3 predicts missing words or phrases given some context, which is something that humans can do easily on autopilot. “Emma walked into the park and saw ____ and then she ______”. “Elon Musk’s statement was condemned by ____ who said that it ______”. As Sarah Constantin noted, humans can also skim-read this sort of text, whether generated by GPT or an ad-libbing human, and they get the general gist without noticing gaps in logic. This sort of writing, stringing together symbols that kinda fit together but don’t actually convey much substance, was the original meaning of the now-viral word wordcel.
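A minimal sketch of that autopilot mode (the corpus and code are mine, and a bigram counter is of course far cruder than a transformer): continue a prompt with whatever word most often came next, with no idea what is being said.

```python
from collections import Counter, defaultdict

# A few sentences standing in for a corpus of billions of tokens.
corpus = (
    "emma walked into the park and saw a dog and then she smiled . "
    "emma walked into the town and saw a friend and then she waved ."
).split()

# Count which word most often follows each word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def autopilot(prompt, n_words=5):
    """Continue a prompt greedily, one most-likely next word at a time."""
    words = prompt.split()
    for _ in range(n_words):
        options = following.get(words[-1])
        if not options:
            break
        words.append(options.most_common(1)[0][0])
    return " ".join(words)

print(autopilot("emma walked into the"))
```

The continuation is locally plausible and globally empty, which is roughly the failure mode skim-reading fails to notice.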
I can do GPT-style wordcelry and also a different type of writing, one in which I think of ideas that I then translate to words. This process requires concentrated System 2 attention to both write and parse, while GPT wordcelry can be done by System 1 alone. Individual sentences or paragraphs can then be filled in automatically, making up a large portion of the text perhaps, but you can’t generate an entire Scott Alexander essay on autopilot.
But will GPT-N be able to do it even if GPT-3 can’t?
The argument in favor is that the GPTs have been getting better and better at generalized text prediction simply through increasing the number of model parameters and tokens in the training data. “Scott Alexander wrote about the surprising connection between Nazifurs and heterotic string theory, explaining that ____” seems like the sort of text prediction task GPT is getting really good at without requiring a wholly new architecture.
Why would that change? Going from simple patterns to complex, abstract ones, we see that:
- Even older models are basically perfect at stringing letters together in a word, aka spelling. You only need to see a word in text a few times to learn how it’s spelled.
- GPT-3 rarely makes mistakes in stringing words together in a sentence, aka grammar. You probably don’t need more than a few dozen examples of a word in a sentence to figure out how the word fits grammatically.
- GPT-3 is fairly good at stringing sentences logically together in a paragraph. It got much better at this than GPT-2, an improvement that required going from millions of words in the training corpus to hundreds of billions.
- GPT-3 doesn’t yet do a good job stringing paragraphs together with purpose to say anything new and meaningful. How many words of training data will it take to get there?
The answer to the last question may be “only a few more”, in which case GPT-4 will take over this blog and most human writing jobs in the near future. Or the answer could be “orders of magnitude more words than humans have written so far”, in which case better language models would have to do something other than brute forcing their way through undifferentiated text dumps. They’ll need “learning shortcuts” telling them what to pay attention to and what to ignore.
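One crude way to build intuition for the “orders of magnitude” answer (all numbers below are my illustrative assumptions, not measurements): count how many distinct units exist at each level of the hierarchy, i.e. how big a space a pure pattern-matcher has to cover with examples.

```python
# Rough sizes of the pattern space at each level of the hierarchy.
ALPHABET = 26        # letters
WORD_LEN = 7         # letters in a typical word
VOCAB = 50_000       # order of magnitude of an English vocabulary
SENT_LEN = 15        # words in a typical sentence
PARA_LEN = 5         # sentences in a typical paragraph

levels = {
    "spellings (7-letter strings)": ALPHABET ** WORD_LEN,
    "sentences (15-word strings)": VOCAB ** SENT_LEN,
    "paragraphs (5-sentence strings)": (VOCAB ** SENT_LEN) ** PARA_LEN,
}

for name, size in levels.items():
    # len(str(size)) - 1 approximates log10 without float overflow
    print(f"{name}: ~10^{len(str(size)) - 1} possibilities")
```

Each level up multiplies the exponent, which is at least consistent with spelling being solved early, grammar soon after, and purposeful multi-paragraph structure still out of reach.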
Human babies have this shortcut. When a caregiver (identified through their consistent multi-sensory impact on the baby’s body budget) utters the syllables “look honey, it’s a…”, the baby knows that the string of sounds it will hear next is worth paying much closer attention to than the million other sounds it has heard that day. For AI, this shortcut would probably have to be built as opposed to emerging naturally from bigger and bigger training sets.
Will AI Infer Reality?
Speaking of Scott Alexander essays, I was surprised by this section in his review of the Yudkowsky-Ngo AI debate, emphasis mine:
I found it helpful to consider the following hypothetical: suppose you tried to get GPT-∞ — which is exactly like GPT-3 in every way except infinitely good at its job — to solve AI alignment through the following clever hack. You prompted it with “This is the text of a paper which completely solved the AI alignment problem: ___ ” and then saw what paper it wrote. Since it’s infinitely good at writing to a prompt, it should complete this prompt with the genuine text of such a paper. A successful pivotal action!
I disagree that GPT’s job, the one that GPT-∞ is infinitely good at, is answering text-based questions correctly. It’s the job we may wish it had, but it’s not, because that’s not the job its boss is making it do. GPT’s job is to answer text-based questions in a way that would be judged as correct by humans or by previously-written human text. If no humans, individually or collectively, know how to align AI, neither would GPT-∞ that’s trained on human writing and scored on accuracy by human judges.
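A toy contrast to pin down the distinction (entirely my framing, not anyone’s actual training code): the loss GPT is trained against scores agreement with humans, and no term anywhere scores agreement with reality.

```python
def imitation_loss(model_answer, human_answer):
    """What the pipeline actually scores: agreement with text humans
    wrote (or with human judges)."""
    return 0.0 if model_answer == human_answer else 1.0

def truth_loss(model_answer, facts):
    """What we might wish it scored: agreement with reality. No such
    oracle of facts appears anywhere in the training pipeline."""
    return 0.0 if model_answer in facts else 1.0

# If the human consensus is wrong, a perfect imitator is perfectly wrong too.
human_consensus = "the sun orbits the earth"
reality = {"the earth orbits the sun"}

perfect_imitator = human_consensus  # GPT-infinity's output, by construction
print(imitation_loss(perfect_imitator, human_consensus))  # 0.0: judged correct
print(truth_loss(perfect_imitator, reality))              # 1.0: still wrong
```

Driving imitation_loss to zero, which is all GPT-∞ is defined to do, leaves truth_loss untouched wherever humans collectively have it wrong.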
Humans aren’t born knowing that physical reality exists just because we live in it. We have to slowly infer it, and eating the wrong (or right) mushroom can even make us forget this fact temporarily. An AI born and raised in the world of human text could in principle learn to infer that physical reality is a thing and what its properties are, but it’s not a given that this will happen.
The main goal of this post isn’t to make strong claims about what AI might or might not do, but to dispel the anthropocentrism of how many people (including me until quite recently) think about possible minds. Humans think in concepts and perceive ourselves occupying physical reality, and we take these two things for granted. But we weren’t born doing either, as the newborn on my lap can attest.
A parting thought: you just read two posts that seemed full of ideas and concepts. Did I manage to actually convey something meaningful to you or did I just wordcel 5,000 nice-sounding words together? How would you be sure?
5 thoughts on “Artificial Wordcels”
Congratulations on the tiny human! Please tell her that your readers say hi, and that we hope she has an okay time being a baby even though being a baby can be very hard.
A) I’m probably biased by the kind of writing I’ve been doing by the kilogram recently, but I think an important milestone for GPT would be if it could write a convincing research grant proposal. This may be the opposite of wordcel-writing: the big challenge is in System 2 actually building a convincing and logical experimental plan and decision tree. The words are just there to help the reviewer see your logic, an inferior alternative to dragging the reviewer into your lab and grunting at the microscope.
A text generator that can actually write something like that will go a long way to convincing me that somewhere in its internal structure is some real domain-specific logic, learned by examples, and it’s not just a Chinese room. Plus, there are already tons of training material available!
B) My interpretation of the Free Energy Principle leads me to think that it’s not unique to living brains. I see it as a general heuristic that is useful when you’re navigating a million-dimensional world, trying to maximize on one dimension (for living things – allostasis and reproduction, for AI – whichever goal you reward it for), when you don’t know a priori the shape of the landscape you’re confined to (for humans – cause-effects in the real, physical or social world, for AI – cause-effects between parameters and output), and you only get partial information at each step.
If the FEP underlies/generalizes predictive processing and consciousness, then I wouldn’t be surprised if a sufficiently high-D AI has internal processes that look pretty predictive and, uh, conscious.
C) Re: your last paragraph. Whenever I read a fancy new model or theory, I generally update much stronger if it includes contrasts with competing models, along with ideas to empirically distinguish between them (don’t have to be realizable…). This helps convince me that it’s actually a different set of constraints for how reality works and not just wordcelry repackaging old/obvious ideas to look new and profound.
Does this imply that what self-driving AI needs before it can be successful is not just the ability to identify a human, but to also model that human’s mind & intent? That sort of brings us back to Scott’s recent post on Biological Anchors — how much time before self-driving AI has that amount of compute power?
These posts remind me of a book that I read but didn’t understand all that much of, Probably Approximately Correct. It seemed like what the author was trying to get at is an idea of how we learn as much as possible from as little information as possible. Some amount of that knowledge is already encoded in our bodies/genes. (The Biological Anchors are still relevant)
The last section (being correct vs. being judged correct by humans), is interesting as well. As Paul Graham among others has pointed out, a lot of really “good” ideas are the ideas that everyone else thinks are bad ideas — so an AI can’t just be infinitely useful by being infinitely good at coming up with ideas that everyone else agrees with, at some point it has to branch out into noticing good ideas that other people would think are crazy.
Agree w/ Beny that this series of posts is roughly somewhere in between pure-wordcel and pure-shape-rotator.