On Embeddings

The Geometry of Meaning

One of the strangest things about recent machine learning is how much meaning can be packed into a vector of numbers. You feed a neural net a mountain of data, and it produces a high‑dimensional atlas of words and images where the hop from “Japan” to “sushi” is roughly the same geometric stride as the hop from “Germany” to “bratwurst.” It feels like cheating. Earlier AI tried to define meaning with rules and symbols—painstaking, brittle work. Now, you just let the machine loose, and meaning seems to condense out of the data like dew.

Why does this work? Part of the answer is that neural nets are relentless at noticing patterns. They don’t care about grammar or logic; they just chew through examples. If “dog” tends to appear near “bark” and “leash” and “walk,” the machine will notice, even if you never explain what a dog is. These embeddings are not definitions; they’re fingerprints left by use. But for most of language, that’s enough. Children don’t learn words from dictionaries, but from hearing them in context. Machines do the same.
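
To make that concrete, here is a minimal sketch of the idea using gensim's Word2Vec (one library choice among many; the toy corpus and every parameter are purely illustrative). Trained on real text at scale, this same co-occurrence recipe is what produces the neighborhoods described above.

```python
# Minimal sketch: learn embeddings purely from co-occurrence, using gensim's
# Word2Vec. The corpus here is a toy; real models need far more text before
# the neighborhoods look sensible.
from gensim.models import Word2Vec

corpus = [
    ["the", "dog", "barked", "at", "the", "mailman"],
    ["we", "walk", "the", "dog", "on", "a", "leash"],
    ["the", "dog", "chewed", "the", "leash"],
    ["cats", "ignore", "the", "leash", "entirely"],
]

# No grammar, no definitions -- just windows of nearby words.
model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, epochs=200)

# Words that share contexts end up near each other in the vector space.
print(model.wv.most_similar("dog", topn=3))
```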

But the real surprise is that these blobs of numbers don’t just capture similarity; they capture relationships. You can subtract “man” from “king,” add “woman,” and get something close to “queen.” This isn’t magic; it’s geometry. The neural net has arranged the space so that certain concepts line up along certain directions—gender, royalty, whatever patterns the data contains. The result is a kind of map of meaning, where analogies become vectors. Early AI researchers would have killed for this.
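
If you want to run the arithmetic yourself, here is a hedged sketch using pretrained GloVe vectors fetched through gensim's downloader (an arbitrary choice of model and toolkit; any decent set of word vectors shows the same effect).

```python
# Sketch of the "king - man + woman ≈ queen" arithmetic with pretrained
# vectors from gensim's downloader.
import gensim.downloader as api

vectors = api.load("glove-wiki-gigaword-100")  # ~128 MB download on first use

# The analogy is a direction in the space: subtract "man", add "woman",
# then look for the word nearest the resulting point.
result = vectors.most_similar(positive=["king", "woman"], negative=["man"], topn=1)
print(result)  # typically something like [('queen', ~0.77)] -- close, not exact
```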

Of course, there are limits. These vectors inherit all the blind spots and biases of their training data. They can mangle idioms or miss rare meanings. And unlike humans, they can’t explain themselves. Their “knowledge” is just a position in space, not a chain of reasoning. If you want to know why something is true, embeddings can’t help you.

Still, it’s remarkable how far you can get without explicit rules. Meaning, it turns out, doesn’t need to be boxed up in definitions. It can be spread out, smeared across a cloud of examples. Maybe that’s how our brains work too—less like a dictionary, more like a map of associations, fuzzy and overlapping.

You can try this out yourself with TensorFlow's handy Embedding Projector (projector.tensorflow.org). In the data panel at the top left, select “Word2Vec All,” search for a target word on the right, and isolate the 101 nearest points to see what I mean by these neighborhoods of meaning.
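
If you'd rather poke at the same neighborhoods locally, here is a rough equivalent of the projector's view, assuming gensim for the vectors and scikit-learn for the dimensionality reduction (neither is required; they're just convenient):

```python
# Grab a word's nearest neighbors and flatten them to 2D with PCA so the
# cluster can be printed or plotted.
import numpy as np
import gensim.downloader as api
from sklearn.decomposition import PCA

vectors = api.load("glove-wiki-gigaword-100")

target = "japan"
neighborhood = [target] + [w for w, _ in vectors.most_similar(target, topn=15)]
points = np.array([vectors[w] for w in neighborhood])

# Project 100 dimensions down to 2, the same trick the projector uses
# (it also offers t-SNE and UMAP) to make the space visible.
coords = PCA(n_components=2).fit_transform(points)
for word, (x, y) in zip(neighborhood, coords):
    print(f"{word:>12s}  {x:+.2f}  {y:+.2f}")
```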

The implications are huge. If meaning comes from context, then more data means richer meaning. This is why the biggest language models work so well: they’ve seen more, so their maps are better. But it also means they’re only as good as their data. Feed them garbage, and you get garbage geometry.

There’s a kind of irony here. Early AI wanted to make thinking precise by encoding knowledge in symbols. But it turns out intelligence is a lot messier. Meaning likes to hide in clouds, not boxes. The best representations we have so far are just big, tangled spaces of numbers—statistical shadows of the world.

Embeddings aren’t perfect, but they’re the closest machines have come to thinking the way we do: not by reciting rules, but by navigating a landscape of patterns. Maybe that’s what intelligence is—not logic, but location. Where you are in meaning-space determines what you know.

We’re only beginning to understand what’s possible with these maps. Maybe the next step is to combine them with something more explicit—let them build the map, and then reason on top. Or maybe we just need bigger, better spaces. Either way, embeddings have shown us that meaning is less about definitions than we thought. It’s about finding your place in the cloud.
