The world’s most popular family of languages has its roots in a small corner of Europe. Its evolution – a story of migration, conquest, intermarriage and human progress – has lessons for how we see ourselves.
We take it for granted that we have nationalities, national languages and countries with borders, and that these are, to a large degree, fixed. Yet nation states are only considered to be a couple of hundred years old. Before that, we dwelled in multinational, often multiethnic and multilingual states and federations. And languages don’t respect borders, leaking into other countries, either wholesale or, through dialects, by degrees.
But in recent years, thanks to developments in genetic analysis, archaeology and linguistic analysis, new theories have emerged that suggest our ideas about many of our ancestors – who spoke the languages of Europe, the Iranian plateau and northern India (languages now spoken by more than three billion people) – and the kinds of people they were are misguided at best. These emerging theories throw into shade rigid ideas of race and identity.
In her new book Proto: How One Ancient Language Went Global, science journalist Laura Spinney writes: “You can have a national identity; you can consider that your culture, you can even consider your linguistic identity if you speak a national language. But your genetics is a mix of huge amounts of movement in the past.”

Spinney makes it undeniable that every person from the European and Near East region is a mix of ancient peoples, their language a rich blend of fragments, echoes and forgotten tongues. “The idea of monolingualism is very recent – it’s since the invention of the nation state,” says Spinney, who is British but lives in Paris and has a French passport.
“For the vast majority of the history of Homo sapiens, we are multilingual. And there’s some evidence we’re reverting to that state: that even at the post-imperial European end – Australia, New Zealand, etc, nations with an imperial history and with a national language – they are becoming more multilingual.”
At the heart of this new story is Proto-Indo-European, or PIE. It’s the distant ancestor of most European and Indo-Iranian languages: French, Greek, Latin, Sanskrit, Russian, English. And it took root about 5000 years ago on the Pontic Steppe, a grassland region stretching from modern-day Ukraine to Kazakhstan, according to the most popular theory. It was probably a cluster of dialects. The herders who spoke it became known as the Yamnaya.
Spinney, whose last book was about the 1918 Spanish Flu that killed millions, explains how those nomadic people spread from their homelands across Asia and Europe, over time largely displacing the farmers and hunter-gatherers who had occupied Western Europe for thousands of years.
The story is one of migration, adaptation, climate change and disease. Some time between 4000BCE and 3500BCE, she writes, the first wheeled vehicle trundled into the steppe. So transformative was the wheel, and so widely and quickly did it spread, that archaeologists can’t tell where it was invented.

The steppe herders were already middlemen in a long network: ore could be moved from mine to smelter, salt from estuary to storage. The first roads and river crossings weren’t far from appearing.
Before the Yamnaya, herders stuck close to their ancestral river valleys. The Yamnaya moved with the seasons, living out of oxen-pulled wagons or tents. Their clothes were made of skins and furs, and they burnt manure for warmth. Whereas milk was consumed rarely before the Yamnaya, ancient dental plaque reveals it became central to their diet. They ate meat – sheep, goats, horses, possibly cows – alongside wild plants such as barley and sorrel. It’s not clear they drank alcohol; they may have preferred cannabis “since they scattered its charred seeds all the way into Europe”.
The Yamnaya traded with others for materials like honey and copper ore, but otherwise were largely self-sufficient, smelting copper to make bronze, and forging their own weapons and tools.
This mobile economy enabled them to harness the vast pastoral potential of the steppe, argues Spinney, utilising sheep, goats and cattle for food, fuel and textiles. Their bones and teeth testify to this.
They grew significantly taller than their ancestors, and some lived into their 60s and 70s. Their numbers, probably in the hundreds at first, grew hugely. And the dialect of those few hundred people – concentrated in the now-embattled east of Ukraine if the oldest Yamnaya sites are a good indication – may have given rise to all the known Indo-European languages.
How do scientists know the Yamnaya might be the source of PIE? It began with Italian poet and writer Dante, he of Divine Comedy, who noticed the similarity of the Romance languages. Others noticed similarities between Germanic languages. Sir William Jones, the multilingual, Calcutta-based British judge and promoter of “Orientalism”, theorised that Sanskrit, Latin and Greek had sprung from some common source, which possibly stretched to Germanic, Celtic and Iranian languages.

“It was a huge sort of electric shock to everyone and really gave a boost to the interest in this family,” says Spinney. “And pretty much from the beginning, the idea was there must have been a common ancestor from which these languages are all descended. So the quest was on for that common ancestor, and for where and who spoke it.
“But the earliest written records of Indo-European are Hittite, a now-dead language that was spoken in Anatolia [modern Türkiye]. The very first written mentions of Hittite are around 2000BCE; you have to wait basically another 500 or so years before you see the next Indo-European languages written down. That’s ancient Greek and Sanskrit. And roughly another 1000 years before you see Latin.”
Historical linguists eventually teased out 12 main branches of the Indo-European language family: Anatolian, Tocharian, Greek, Armenian, Albanian, Italic, Celtic, Germanic, Slavic, Baltic, Indic and Iranic, each with its own world of culture and thought. Some non-Indo-European languages survived the spread of the steppe people, such as Finnish, Hungarian and Basque.
Vocabulary clues
Eight billion humans speak about 7000 languages from about 140 families. But most speak languages that belong to just five: Indo-European, Sino-Tibetan, Niger-Congo, Afro-Asiatic and Austronesian. Indo-European, says Spinney, is by far the largest language family the world has known.
PIE had a rich vocabulary for dairying, including cow, sheep and milk. There were words for wool and honey (*mélit, the source of French miel and English mellifluous), though oddly not for bee. And there is one for plough, two words that might have meant wheel, and others for the likes of axle – so most linguists think the speakers of PIE knew wheeled transport. People who knew wheels and wagons could not have lived before 3500BCE, when they were invented.
The only people who herded horses, cattle and sheep at that time were the Yamnaya. They grew no crops – confirmed by how few farming-related words can be reconstructed and by their lack of tooth decay from starchy cereals.
The lexicon reveals something about Indo-European societies. The word for “wed” means both to marry and to lead away. The word pótis (a reconstructed word) meant both husband and master (despot comes from this). There are words for a wife’s in-laws but not a husband’s, so a woman probably moved into her husband’s household. Wealth – which narrowly meant livestock – could be inherited down the male line or stolen: there are several words relating to raids and booty.
Alliances were built through ghostis (another reconstructed word), which combined both guest and host. Hospitality was duty-bound to be returned, and was extended to strangers, who could be friend or foe. The idea of ghostis might have enabled PIE speakers to pass through each other’s territory, says Spinney.
Scholars have reached a broad consensus, she says. The Yamnaya spoke PIE, and “every time we speak or write in a descendant of that language, we unwittingly scatter clues as to who those nomads were and what they believed. They live on through all of us who speak Indo-European today”.
The third wave
In 2015, using different methods to analyse ancient DNA, two papers confirmed a third major influx to Europe after the hunter-gatherers and farmers. Migrants had radiated east and west from the steppe about 5000 years ago, and in Europe, their ancestry had replaced up to 90% of the gene pool. Genetically, modern Europeans remain overwhelmingly part-hunter-gatherer, part-farmer and part steppe nomad, says Spinney.
“No later movement had anything like their genetic, cultural or linguistic legacies – not the massive migrations set in train by the fall of the Western Roman Empire, nor the displacements that followed the Black Death, the 1918 flu or either of the world wars. Most European men alive today, and millions of their counterparts in Central and South Asia, carry Y chromosomes that came from the steppe.”

Spinney’s book – a roaming tale that traverses linguistic terms such as satemisation, ruki rule and boustrophedon – explains how the Yamnaya went no further west than Hungary and the former Czechoslovakia. (A group of the Yamnaya also went east – it’s been confirmed the Afanasievo herders of south Siberia and Mongolia’s Altai Mountains were genetically almost identical.)
About 2900BCE a new culture arose, Corded Ware. They were genetic cousins to the Yamnaya, and similar in many ways. Their overwhelmingly male migration was fast and furious: within 50 years they moved to Poland then Denmark, often burning forest to graze their herds as they went.
Somewhere near the Rhine, they met a group of people who travelled up from Iberia. These were the Bell Beaker people, named after their highly decorated cups. They almost certainly spoke a non-Indo-European language, says Spinney. The two interbred and their descendants spread into the furthest corners of Europe, probably speaking Indo-European languages. They arrived in Britain about 2450BCE and Ireland a couple of hundred years after that.
One Beaker immigrant to England, the Amesbury Archer, was buried not far from Stonehenge, surrounded by arrowheads and the earliest gold objects found in England. Based on his tooth enamel, he came from central Europe.

In a 2018 book, Harvard paleogeneticist David Reich wrote that British and Irish skeletons from the Bronze Age, which followed the Beaker period, show the people who built Stonehenge were 90% replaced by those with Yamnaya ancestry. They made additions and adopted Stonehenge for their own rituals.
The Yamnaya culture vanished from the archaeological record about 2500BCE. A hundred years later, the climate in Europe began to cool again and the Corded Ware culture faded. By 2200BCE, parts of Europe were experiencing severe drought, says Spinney, and it was the turn of the Bell Beakers to disappear, though their culture persisted for longer in Britain and Ireland. A couple of centuries later, the climate and drought had eased and the great movements of people began to settle down.
“The dominant subsistence mode was farming, and people were mostly sedentary, but for all that things looked similar on the surface, the social, economic and biological changes had been profound,” says Spinney. “Genetically, the population now resembled that of modern Europeans, with its three-way mix of ice-age hunter-gatherer, Near Eastern farmer and steppe herder. People kept larger herds and wore woollen clothing. Lactose tolerance was spreading and meat represented a higher proportion of the diet.”
Will of the few
Was the massive genetic turnover in Europe 5000 years ago violent? The Yamnaya weren’t spectacularly violent, says Spinney, despite headlines when the studies came out that they roared west in an orgy of violence. Traumatic injuries were rare in graves, as were weapons. Their Corded Ware relatives were probably more violent, tearing across the top of Europe with stone battle axes. The Beaker folk contributed to the social and linguistic transformation of Europe, but they were only half-steppe by ancestry.
By fair means or foul, says Spinney, the migrants bred with local women and prevented local men from passing on their genes. Rape, murder, even genocide could not be ruled out. “The dilemma has always been, if it was them who brought the languages in, how on earth did they – let’s say, maximum, tens of thousands of people at the height of their immigration – impose their languages on a population of farmers we estimate at roughly about seven million?”
The process on the continent may have been slower than the six generations it took for Britain’s genetic makeup to change after the arrival of the first Beaker people. Plague may also have played a part – geneticists are fairly certain a dangerous epidemic swept Europe late in the Neolithic (about 10,000BCE-2000BCE), halving the population by some estimates. Steppe migrants, who lived close to their herds, may have brought it in but been largely immune.
The migrants may have been intimidating enough without violence, argue anthropologists including German Martin Trautmann. Ancient historians say the Germani and Celts were on average 6cm taller than Roman centurions. Yamnaya men were on average 10cm taller than the male farmers they ran into. They would have been heavier set, had lantern jaws and deeper voices than their lighter-boned counterparts.
Most scholars agree violence wasn’t the main driver of Indo-European success. Much more important were their social institutions, which have left physical traces and echoes in the Indo-European languages.
Boys were often fostered out from age seven, a woman moved into her husband’s home and her sons were sent to her kin. A boy might receive military training from his uncle/foster father. At about 14, elite boys were sent to the wilderness, returning at 21. They formed lifelong alliances, and ties of guest-friendship. Polygyny, a man having many mates, was the norm in the ancient world.
The Indo-Europeans may simply have been good at having children and keeping them alive, says Spinney.

The rise of English
As the Indo-European languages continued to spread and break into “daughter” languages largely unintelligible to each other, one language would eventually come to dominate the world: English. The Germanic family’s only territorial gain after the fall of Rome was England. In the 5th century, large numbers of Low German-speakers from what’s now Denmark headed along the North Sea coast and across the English Channel. Some British Celts fled north and west, implanting Welsh and Cornish, but the majority, says Spinney, took up the immigrants’ language. They spoke German with a Celtic lilt, producing Old English. (The rule of thumb is that, Greek aside, languages become new ones roughly every 500-1000 years, which is why Old English is largely unintelligible to modern English speakers.)
The Britons probably switched under duress. Bede, an English monk, wrote a few centuries later that the Angles, Saxons and Jutes “began to increase so much, that they became terrible to the natives”. Old English bears traces of this. The Germanic word for foreigners, including Celts and Romans, was walhaz, it is hypothesised. In Old English the word became wealh, which is the root of “Welsh”. It still meant foreigner, specifically a Celtic-speaking Briton, but also acquired the meaning of “slave”.
In the 8th century, another wave of Germanic-speaking migrants started to arrive: the Vikings. But their linguistic impact was smaller. It may have been that Old Norse and Old English were still similar enough that their speakers could communicate without bothering to learn each other’s language.
In the 11th century came Norman French. “The Normans were only half a dozen generations removed from their Viking roots and relatively recent converts to French themselves (hence ‘Norman’, a corruption of Norðmenn or ‘north men’),” Spinney writes. “They still spoke it with a Nordic accent, and it was their form of French that became the new language of prestige in England, though never the language of the English.” (French, she says, was what came out of the mouths of Germanic-speaking Franks when they took up Latin, “and not classical Latin but the vulgar form already bent out of shape by the Gauls”.)
Given that about 1.5 billion people now speak English, Australian linguist Nicholas Evans has predicted it’s unlikely to go the way of Latin, which split into the Romance languages. But with speakers exposed to spoken and written English through television and the internet, it may settle into a state of diglossia, where there’s a gulf between spoken and written versions.
There are many questions of culture and language still to answer, says Spinney. Not least, what caused a band of herders to spread out from the steppe, invent a new way of life and send their genes and language through Europe? And where did their language itself come from? Sites in Ukraine may still provide answers, including one called Mykhailivka, source of the oldest Yamnaya samples to date, which is reported to be dry again after being flooded when the Kakhovka Dam was destroyed in 2023.
New archaeology and genetic tools can tell us much more. Migration has been a constant; “indigenous” is relative. Ten thousand years of human displacement have shrunk the genetic distance between populations to the point where ethnic divisions are losing meaning, Spinney writes. “The desire to belong is as strong as ever, and as it becomes harder to see the difference between ‘them’ and ‘us’, linguistic and cultural boundaries will be guarded more jealously.
“Language will become a new battleground in the identity wars, and preserving our linguistic ‘purity’ a justification used by those who want to raise walls. Unfortunately for them, the most successful language the world ever knew was a hybrid trafficked by migrants. It changed as it went, and when it stopped changing, it died.”
PROTO: How One Ancient Language Went Global, by Laura Spinney (HarperCollins, $36.99), is out now.
What did Proto-Indo-European sound like?
Very little like English, says Laura Spinney, and closer probably to a mix of Sanskrit, Greek and Latin.
“The most powerful god in the ancient Indian pantheon was Father Sky. His name was Dyauh pita, literally ‘sky father’ in Sanskrit. For the Greeks, the chief deity was Zeus pater, but they eventually dropped the pater and just called him Zeus. The Romans deformed the initial dy sound to give Iuppiter, which became Jupiter. In Old Norse the d morphed into a t so that the Vikings knew him as Tyr, while in closely related Old English he was Tiu. Tuesday is the day of the week that English-speakers dedicate to Father Sky.”
The language was highly inflected, having case endings and what linguists call ablaut, in which you express tense in a verb by changing the core sound. English has retained a few examples, like sing, sang, sung. And it contained many sounds that English doesn’t use.
Proto-Indo-European was never written down, but linguists have devised a system of recording the sounds they think it used. It is thought to have employed the vowels a, e, i, o and u in long (ā) and short (ă) forms. Also the semivowels, y and w; the nasals m and n; the liquids l and r; the sibilant s; and other consonants: labials p, b, bh; dentals t, d, dh; two kinds of dorsals (in which the back of the tongue touches different parts of the roof of the mouth) k, g, gh; labio-velars kw (a “rounded” k, a bit like English qu) gw, gwh. The h after another consonant indicates it is aspirated. There were also laryngeals, a type of consonant preserved only in the Anatolian branch of the family. Somewhere between a vowel and a laryngeal was schwa (ǝ), which is also common in English (it is the last sound in “agenda”). In Proto-Indo-European, the stressed syllable is thought to have been of higher pitch than the others (as it was in Ancient Greek) rather than louder (as it is in English).
How did Celtic get to Ireland?
Linguists are fairly sure the Bell Beaker people brought with them an Indo-European language, but the dates suggest it wasn’t Celtic. The Beakers were gone from Britain and Ireland by 1800BCE, even if their genetic legacy continued.
Proto-Celtic was born no earlier than 1500BCE. It may have been brought from what’s now France, but those dates don’t work, either. The genetic makeup of Ireland has barely changed since the Bronze Age.
As Spinney writes: “The Scythians rode into Ukraine and India but left their language in neither. The Romans got as far as Britain, but Latin stayed (mostly) in France. Whoever carried Celtic to Ireland caused barely a tremor in the Irish gene pool.”
About 800BCE, the population of Ireland dropped drastically, she says, perhaps related to a sudden change in climate to drenching weather which brought crop failures or epidemics. Those who left may, says archaeologist Rowan McLaughlin, have been part of a “brain drain” to Britain, and their descendants may have returned a few centuries later speaking a new language: Celtic.
Another suggestion, by Dutch linguist Peter Schrijver, is that the Beaker language survived in Britain until the first century BCE, when a number of British Celtic-speakers crossed to Ireland to escape the Roman invasion. Britain was thriving at this time, and perhaps the indigenous Irish people aspired to be like them, and came to adopt their language. But their Celtic would have been spoken with a “Beaker” accent – coming out as what would become Old Irish.

Where did Sanskrit come from?
Sanskrit is the ancient Indo-European language of India, but the debate about where it came from is highly politicised, says Laura Spinney.
A climate crisis around 2200BCE affected not just Europe but the entire planet. Many regions were hit by drought, famine and plague. The Akkadian and Sumerian empires of Mesopotamia evaporated. Much migration occurred, and some headed back east. The Sintashta culture, in the southern Urals, is thought to be an eastern migration of Corded Ware.
No serious scholar denies there was a migration into India around this time, but did it bring an early form of Sanskrit and the beliefs that shaped the Vedas at the foundation of Hinduism? Steppe theorists say it did, and that the Indo-Iranian languages came from Andronovo in Siberia to Sintashta. Some think the Indo-Iranian languages crossed the Iranian plateau 1000 years earlier and Sanskrit was already spoken in India.
A less popular theory, which reverses the Indo-European language flow, is that Sanskrit originated in India. It has become dogma for some Hindu nationalists.
There is now evidence that immigrants reached northern India around 1600BCE.
Sanskrit probably died as a vernacular some time in the first millennium CE, says Spinney, having split into about a dozen daughter languages, including Bengali, Marathi, Urdu and Hindi.
Groups of Indians who speak languages descended from Sanskrit carry more steppe ancestry than those who speak non-Indo-European languages. Harvard geneticist David Reich has said, “Almost everyone in India is a mixture of two highly divergent ancestral populations, one of which derived about half its ancestry from the Yamnaya.”