The origin of the Finnic peoples

One of the very first things I wrote about in relation to historical population genetics was the origins of the Finnic peoples. The reasons are twofold:

– first, the Finns and Estonians speak languages that are rather peculiar in a Europe dominated by Indo-European tongues (I suspect one reason that Tolkien based Quenya, the high elvish language, on Finnish is that it is so otherworldly to the Germanic ear. The Sindarin language, which was the common tongue of elves in Middle Earth, was based on Welsh). The distribution of the Uralic languages instead extends to the east, as far as Siberia. Even the closest relatives of Finnish and Estonian extend eastward, as there are Karelians who live deep in northwest Russia.

– second, peculiarities in the genetics of the Finns, first documented in the 20th century, have always been notable.

Some of the distinctiveness of the Finns clearly has to do with the demographic isolation of the recent past, and the range expansion into the north and east. I will ignore this aspect of recent drift, and focus on their deep history and phylogenetic relationships.

New molecular genetic techniques in the 1980s and 1990s, which enabled the genotyping of Y and mtDNA lineages, immediately showed that the paternal heritage of the Finns is very distinct from that of their neighbors, and erstwhile hegemons, the Scandinavians. While Swedes tend to be haplogroup I (indigenous to Western Europe dating to the late Pleistocene) or one of the two R1 lineages (intrusive from the Eurasian steppe during the Bronze Age), Finns tend to be haplogroup N3, with a substantial minority of I. While 63 percent of Finns are N3, only 3 percent of Swedes are. Given the migration of Finns to Sweden, as well as the prevalence of Saami all across Northern Sweden until the early modern period, Swedish N3 may be due to gene flow in the last thousand years. The two R1 lineages are ~10% of the Finnish paternal gene pool and are strongly skewed toward R1a, while the R1 lineages that make up ~40% of the Swedish paternal gene pool are balanced between R1a and R1b.

In contrast, the mtDNA profiles of Finns are very similar to those of their neighbors. As in Sweden, the dominant haplogroup is a branch of H, with its reduced fraction accounted for by the fact that Finns have a higher percentage of U5, which has been associated with European hunter-gatherers. The various haplogroups (e.g., T) associated with Early European Farmers are at somewhat lower frequency in Finland than in Sweden.

A simple explanation then presents itself to us: the Finns have been subjected to male-mediated admixture into a “conventional” European substrate. But there has long been controversy as to whether the Finnish N3 haplogroup was indigenous to Europe, or whether its presence in Northeast Europe was due to migration. If it was indigenous, then the admixture model does not make as much sense. But as with many things, we’ve moved very far from where we were when I first began to look at this issue in 2002.

If you read “Human Y Chromosome Haplogroup N: A Non-trivial Time-Resolved Phylogeography that Cuts across Language Families”, the likelihood that the Y chromosomal structure of Finland is old seems low. First, Finnish N3 lineages are very young and underwent rapid expansion beginning 4 to 6 thousand years ago (this is evident in their whole genome variation pattern). Second, the greatest diversity of N seems to be in Western Siberia. Third, N exists at higher frequencies in parts of Siberia than even in Finland. Fourth, the range of N extends all the way to the Pacific Ocean. It is not implausible that it expanded from one rim of Eurasia to the other, but the most likely scenario is that it came from somewhere in the middle.

Also, it is likely that there has been admixture into Finns from an East Eurasian population. To give one example, a derived SNP at EDAR is at very high frequency in Northeast Asians. The ancestral variant is dominant outside of East Asia and the New World. Among modern Europeans the derived variant of EDAR is not present in indigenous populations. A quick check in the 1000 Genomes data shows that it’s at ~6% in Finns (in contrast, the ancestral variant of SLC24A5 is present at frequencies of ~1%; this could be random, but I suspect in situ selection….). You can see that the derived variant is absent in a rather large sampling of other Europeans.
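As a back-of-the-envelope illustration, the EDAR frequencies above can be turned into a rough admixture estimate under a simple two-source mixture model. This is only a sketch: it ignores drift and selection, and the source-population frequencies in the loop are assumed values for illustration, not figures from any dataset. The function name is hypothetical.

```python
def admixture_fraction(p_admixed, p_local, p_source):
    """Solve p_admixed = (1 - a) * p_local + a * p_source for a.

    a is the fraction of ancestry contributed by the source population
    under a simple two-source mixture with no drift or selection.
    """
    if p_source == p_local:
        raise ValueError("source and local frequencies must differ")
    return (p_admixed - p_local) / (p_source - p_local)


# From the text: derived EDAR allele at ~6% in Finns, absent in other Europeans.
P_FINNS = 0.06
P_OTHER_EUROPEANS = 0.0

# ASSUMPTION (not from the text): candidate derived-allele frequencies in the
# East Eurasian source population, chosen only to show the sensitivity.
for p_source in (0.5, 0.7, 0.9):
    a = admixture_fraction(P_FINNS, P_OTHER_EUROPEANS, p_source)
    print(f"source frequency {p_source:.1f}: implied East Eurasian ancestry {a:.1%}")
```

The lower the derived allele frequency in the true source population, the larger the implied admixture fraction, so a single locus like this can only bracket the answer loosely.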

Running ADMIXTURE unsupervised, it’s immediately obvious that Finnic peoples have a minority component of East Eurasian admixture. This dark blue element is absent in most of the Swedes. Not surprisingly, the Russians exhibit structure depending on where you sample. Some Russian populations were clearly Slavicized relatively recently, and exhibit a genetic profile rather like Finnic peoples (these northern Russian regions also have high frequencies of haplogroup N, which is much rarer in the south or among Ukrainians).

There’s a cline that runs east to west in relation to this component. The Finns’ neighbors immediately to the east, Karelians and Veps, have a higher fraction than the Finns proper. Additionally, some Finns in the data seem to lack it totally. One might speculate that these are people of Swedish origin who eventually assimilated to the Finnish identity. This is not impossible. In the 19th century Finnish nationalism was sparked in large part by middle class activists, many of whom were Swedish ethno-linguistically due to the connections between class and language at that time. But these individuals may be evidence of older structure in Finland. More on that later.
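For readers who want to check this kind of cline themselves, here is a minimal sketch of how per-population summaries can be computed from ADMIXTURE output. ADMIXTURE writes a .Q file with one row per individual and one whitespace-separated ancestry fraction per component. The helper names, the 1% threshold for "lacking" a component, and the toy numbers are all invented for illustration; they are not the values from my run.

```python
from collections import defaultdict


def parse_q(text):
    """Parse .Q-style text: one individual per line, K fractions per line."""
    return [[float(x) for x in line.split()] for line in text.splitlines() if line.strip()]


def summarize_component(q_rows, populations, component, threshold=0.01):
    """Per population: sample size, mean fraction of one component, and
    how many individuals fall below `threshold` (treated as lacking it)."""
    grouped = defaultdict(list)
    for row, pop in zip(q_rows, populations):
        grouped[pop].append(row[component])
    return {
        pop: {
            "n": len(values),
            "mean": sum(values) / len(values),
            "lacking": sum(1 for v in values if v < threshold),
        }
        for pop, values in grouped.items()
    }


# TOY DATA, invented for illustration only (K=2). Column 1 stands in for a
# hypothetical East Eurasian component; labels follow the row order.
toy_q_text = """0.999 0.001
0.980 0.020
0.930 0.070
0.995 0.005
0.900 0.100
0.880 0.120"""
toy_pops = ["Swedes", "Swedes", "Finns", "Finns", "Karelians", "Karelians"]

summary = summarize_component(parse_q(toy_q_text), toy_pops, component=1)
for pop, stats in summary.items():
    print(pop, stats)
```

In a real analysis the population labels would come from the sample metadata, in the same order as the individuals in the input genotype file.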

I also ran TreeMix on a subset of the data. You see there is gene flow coming into the Finns from a Siberian group. I used Nenets (a group of Samoyeds) and Yakut because the former have more in common linguistically with the Finns, while the latter are used by companies like 23andMe (Yakuts are the most northeasterly Turkic people). Strangely, the Karelians and Veps get gene flow from Nenets, while the Finns get it from Yakuts (I pruned with PCA and ADMIXTURE to remove individuals with recent European ancestry).

But the model of a single-pulse admixture is probably wrong anyhow. Rather, the spread of Finnic hunters and gatherers may have been gradual, and/or occurred in several pulses. On the fringe of Northern Eurasia local extinctions were probably common. The landscape of Northern Eurasia, from the Baltic to Siberia, may long have been rather dynamic, with interactions between Uralic, Indo-European and Altaic peoples.

At this point I am at a loss. The archaeology of Finland is not something I know well, and the academic literature is hard for me to track down. Some scholars believe that the Comb Ceramic Culture plays a major role in the ethnogenesis of the people we call Finns. During the Bronze Age the Corded Ware zone spread into southern Finland, bringing agriculture. The fusion between the Comb Ceramic and Corded Ware led up to the societies which are first mentioned by Classical authors.

Finland was always liminal to early agriculture, and the Corded Ware Indo-Europeans may eventually have given way to the forest Finns as the climate turned more difficult. The predominance of N3 haplogroups may be a function of the nature of patriarchal societies, where certain lineages maintain powerful long-term advantages.