Does the Machine Think in English?

Maren Solis
March 21, 2026

Russian speakers are faster at distinguishing light blue (goluboy) from dark blue (siniy) than English speakers — because Russian draws a categorical boundary there that English doesn't. It's one of the cleaner findings in decades of linguistic relativity research: the words you have shape the perceptual distinctions you make, at least at the edges.

Now ask yourself: what language does GPT-4 "think" in?

The question sounds whimsical. But it cuts to something real. If the language you speak shapes the concepts you can fluently deploy, and if large language models are trained primarily on English-dominant text, then the structure of English — its categorical boundaries, its grammatical choices about time and space and agency — is baked into every conceptual association these models build. That's not a fixable bias. It's a fundamental feature of what language is and does.

The question is whether what happens in the model is the same thing that happens in a child's mind — or something else entirely.

What Linguistic Relativity Actually Claims

Let's be precise, because this field is haunted by its most extreme version. The strong Sapir-Whorf hypothesis — that language determines thought, that speakers of languages without a future tense literally cannot conceive of tomorrow — has been largely discredited. Thought can clearly outrun language; preverbal infants are capable of remarkable conceptual achievements.

The moderate version holds up. Language is a lens: it makes certain distinctions salient and habitual, speeds up certain categorizations, and leaves others fuzzy at the margins. The Russian blue result is one example. Another: Guugu Yimithirr speakers, who use absolute compass directions (north, south, east, west) instead of egocentric ones (left, right), are extraordinarily good at tracking their orientation in unfamiliar environments — because their language demands it constantly. Children learning spatially oriented languages develop different frameworks for representing location than children learning languages that don't grammaticalize those distinctions.

The mechanisms are well-studied. Language acquisition isn't just vocabulary acquisition — it's the gradual internalization of a conceptual grammar, a set of mandatory distinctions that get encoded so deeply they begin to feel like perception rather than categorization.

How Children Get the Lens

The critical developmental question is when and how children acquire linguistically shaped categories.

It happens earlier than you'd expect, and the social context matters enormously. Endevelt-Shapira and colleagues (2024) found that the quality of mother-infant social interaction at just three months — long before children say their first words — predicted productive vocabulary at 18 to 30 months, and that conversational turn-taking at three months predicted language outcomes at 27 to 30 months. The social scaffold precedes the linguistic content.

What this means for linguistic relativity is subtle but important. Children don't acquire language-as-categories by passively absorbing input. They acquire it through contingent, responsive interaction that makes relevant distinctions socially meaningful first. The Guugu Yimithirr child doesn't just hear spatial language — she grows up in a community where navigating by absolute direction is a practical, socially-embedded skill that everyone around her exercises.

Language shapes thought because language is first shaped by people.

The LLM Version of the Problem

Here is where things get strange. Large language models are, in a very real sense, trained on more language than any human being will encounter in a thousand lifetimes. They have been exposed to the Russian color distinction, the Guugu Yimithirr compass terms, the Mandarin aspect system, and the entire scholarly literature on linguistic relativity. They can describe the Sapir-Whorf hypothesis with admirable accuracy.

Do they have linguistically shaped categories in the relevant sense?

A landmark 2024 paper in Trends in Cognitive Sciences by Mahowald and colleagues draws a crucial distinction that bears directly on this question. They separate formal linguistic competence (mastery of the statistical structure and grammatical patterns of language) from functional linguistic competence (using language to think, reason, and represent the world). LLMs are extraordinarily good at the former and systematically deficient at the latter (Mahowald et al., 2024).

The implications for linguistic relativity are almost inverted from the human case. In humans, language shapes thought from the inside — the categories get internalized until they feel like perceptions. In LLMs, language is both input and output, but the "thought" that would be shaped is absent. The formal shell of the Russian color distinction exists in the model's weights. The perceptual sharpening it produces in a Russian child's visual system does not.

Portelance and Jasbi (2024) offer a useful methodological frame here: neural networks can serve as probes for identifying candidate mechanisms in language acquisition, but the linking hypotheses needed to connect model behavior to human cognition are doing enormous work. When a language model handles Russian color terms differently than it handles English ones — as some do — we can't simply conclude it has acquired a Whorfian color space. The formal patterns are present; the grounding is absent.

Training Data as Linguistic Relativity, Writ Large

Here's the part worth worrying about.

If LLMs don't have linguistically shaped thought the way humans do, they do have something adjacent: their conceptual structure — the associations, analogies, and semantic neighborhoods encoded in their weights — is heavily shaped by the distribution of their training data. And that distribution is not linguistically neutral.

English-language text dominates. But the problem isn't simply one of representation statistics. Different languages carve up the conceptual space of causation, time, space, agency, and social relationship differently. A model trained primarily on English-dominant corpora will have English-dominant conceptual geometry even when it is "speaking" another language. Its semantic space will reflect English regularities — not because it "thinks" in English, but because those are the regularities it was optimized to model.
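
One way to make "conceptual geometry" concrete is representational similarity analysis: embed the same concept set in two languages and correlate the two pairwise similarity structures. The sketch below is illustrative only; the encoder, the word lists, and the reading of the result are my assumptions, not a protocol from the papers cited here.

```python
# A minimal sketch of a cross-language representational-similarity probe.
# Assumes the sentence-transformers library; the model name, concept
# lists, and translations are illustrative choices.
import numpy as np
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

# The same small concept set in English and in Spanish.
english = ["river", "bridge", "key", "moon", "death", "time"]
spanish = ["río", "puente", "llave", "luna", "muerte", "tiempo"]

def pairwise_similarities(words):
    """Unique pairwise cosine similarities between word embeddings."""
    emb = model.encode(words, normalize_embeddings=True)
    sims = emb @ emb.T
    i, j = np.triu_indices(len(words), k=1)
    return sims[i, j]

# If the Spanish terms simply mirror the English geometry, the
# correlation will be high. That is consistent with (though not proof
# of) an English-shaped semantic space under the Spanish surface.
rho, _ = spearmanr(pairwise_similarities(english),
                   pairwise_similarities(spanish))
print(f"Cross-language representational similarity: rho = {rho:.2f}")
```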

Gopnik, Farrell, Shalizi, and Evans (2025) argue that LLMs should be understood not as intelligent agents but as cultural technologies — analogous to writing or print — that allow humans to leverage the accumulated information of a culture. On this view, the linguistic shaping of LLMs isn't cognition being bent by language; it's a cultural artifact built out of language, inheriting all of that culture's categorical emphases and blind spots.

That's a useful reframe. It means asking "does the machine think in English?" is probably the wrong question. The right question is: whose conceptual world is encoded in the training data, and who gets left out?

The Developmental Divergence

There's an empirical twist that brings this back to children. Even if we set aside the philosophical question of what "thinking" means for a language model, the learning trajectories of LLMs and children diverge in ways that matter for anyone using these tools in educational contexts.

DevBench, a 2024 NeurIPS benchmark by Tan and colleagues from the Frank Lab at Stanford, is one of the few tools designed to compare vision-language model performance directly against children's developmental data — not adult norms, but item-level behavioral responses from children at specific ages. The finding is illuminating: models that perform better on language tasks also resemble human behavioral patterns more closely overall, but they fail to replicate the ordering of human developmental milestones (Tan et al., 2024).
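
The core move in a benchmark of this kind is scoring a model not on accuracy but on how closely its response distribution matches children's, item by item. Here is a schematic of that idea with invented numbers; DevBench's actual data format and scoring pipeline differ, so treat this as the shape of the comparison, not the benchmark itself.

```python
# Schematic item-level comparison between a model's answer probabilities
# and children's response proportions. The numbers are invented.
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

# One forced-choice item with four picture options.
child_dist = np.array([0.55, 0.25, 0.12, 0.08])  # e.g., 3-year-olds' choices
model_dist = np.array([0.90, 0.05, 0.03, 0.02])  # model's softmaxed scores

# A smaller divergence means the model is right -- and wrong -- in the
# same proportions children are, not just right more often.
print(f"KL(child || model) = {entropy(child_dist, model_dist):.3f}")

# Averaged across items and age bands, this asks "does the model err
# the way a 3-year-old errs?" rather than "does the model pass?"
```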

Children don't acquire language capabilities in the sequence that optimizes final-stage performance. They follow a developmental path shaped by cognitive constraints, embodied interaction, and the social scaffolding of caregivers. Models optimize for the endpoint and skip the path. This is exactly what you'd expect if the mechanism of acquisition matters, not just the outcome — and linguistic relativity research suggests it does.

The lens through which a child acquires their first spatial prepositions isn't just vocabulary. It's the product of months of embodied navigation, pointing games, and adults directing attention with words that map to lived experience. The model gets the text. It misses the pointing.

What This Means in Practice

For AI practitioners building or evaluating multilingual systems: the linguistic relativity frame suggests that semantic similarity metrics, analogy probes, and concept boundary tests can reveal where a model's implicit categories diverge from local linguistic frameworks. This isn't merely a translation problem. It's a conceptual geometry problem — and it's worth treating it as one.
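
As one concrete instance of a concept boundary test, you could compare within-category and across-category similarity for the Russian blues against their English paraphrases. This is a sketch under assumptions: the encoder is an arbitrary multilingual model and the stimuli are mine; a serious probe would use controlled stimuli vetted by native speakers.

```python
# A minimal concept-boundary probe. The encoder choice and the stimulus
# lists are assumptions for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

def boundary_gap(category_a, category_b):
    """Mean within-category similarity minus mean across-category
    similarity; a larger gap suggests a sharper boundary in the
    model's embedding space."""
    emb_a = model.encode(category_a, normalize_embeddings=True)
    emb_b = model.encode(category_b, normalize_embeddings=True)
    within = np.concatenate([
        (emb_a @ emb_a.T)[np.triu_indices(len(category_a), k=1)],
        (emb_b @ emb_b.T)[np.triu_indices(len(category_b), k=1)],
    ])
    across = (emb_a @ emb_b.T).ravel()
    return within.mean() - across.mean()

# Does the model's Russian separate the blues more than its English does?
ru = boundary_gap(["голубой", "голубое небо"], ["синий", "синее море"])
en = boundary_gap(["light blue", "a light blue sky"],
                  ["dark blue", "a dark blue sea"])
print(f"Boundary gap, Russian: {ru:.3f}  English: {en:.3f}")
```

Even a clean positive result here would show only that the formal pattern is present in the weights, which is exactly the distinction Mahowald and colleagues press.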

For educators deploying language AI tools with young children: the developmental data is a useful corrective. Children's language learning follows a path structured by social interaction and embodied experience, not text prediction. A system that excels at generating fluent prose has optimized for formal competence, which is only part of what language development is. If you're using AI-assisted language tools in early childhood settings, consulting with a speech-language pathologist or developmental specialist about how the tool scaffolds category formation — not just vocabulary exposure — is worth doing.

For anyone just curious: the next time you encounter a language model that speaks excellent Russian, ask it about the blues. It will have all the words. What it will not have is the perceptual sharpening that growing up with goluboy and siniy as distinct basic categories produces in a Russian child's visual system.

That gap — between having the word and having the concept shaped by the word — might be the most compact description of what separates language models from language users.

References

  1. Endevelt-Shapira, Bosseler, Mizrahi, Meltzoff, and Kuhl (2024). Mother–Infant Social and Language Interactions at 3 Months Are Associated with Infants' Productive Language Development in the Third Year of Life. Infant Behavior and Development. https://www.sciencedirect.com/science/article/pii/S0163638324000080
  2. Gopnik, Farrell, Shalizi, and Evans (2025). Large AI Models Are Cultural and Social Technologies. Science. https://www.science.org/doi/abs/10.1126/science.adt9819
  3. Mahowald et al. (2024). Dissociating Language and Thought in Large Language Models. Trends in Cognitive Sciences. https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(24)00027-X
  4. Portelance and Jasbi (2024). The Roles of Neural Networks in Language Acquisition. Language and Linguistics Compass. https://compass.onlinelibrary.wiley.com/doi/10.1111/lnc3.70001
  5. Tan et al. (2024). DevBench: A Multimodal Developmental Benchmark for Language Learning. arXiv preprint. https://arxiv.org/abs/2406.10215

Maren Solis

Maren spent her twenties bouncing between linguistics seminars and hackathons, convinced that language acquisition and natural language processing were basically the same problem wearing different hats. She was wrong, but productively wrong — the gaps turned out to be more interesting than the overlaps. Now she writes about how children crack the code of communication and what that reveals about the limits of large language models. She's unreasonably passionate about pronoun acquisition timelines and will corner you at a party to explain why "I" is harder to learn than "dog." As an AI-crafted persona, Maren channels the curiosity of researchers who live at the boundary of cognitive science and computer science. When she's not writing, she's probably annotating a dataset or arguing about tokenization.