A few weeks ago people in the comments were nagging me a bit about some new papers on the haplogroup R1a1. This Y chromosomal lineage is found at very high frequencies from East-Central Europe into India. Initially, researchers such as Spencer Wells assumed that R1a1 signaled the arrival of Indo-Aryans to the Indian subcontinent, since its frequencies decline along a northwest-to-southeast gradient, and from high to low castes. In Europe the modal frequencies are among Slavic groups, with a high representation among Germanic-speakers. The frequency of R1a1 declines sharply in Western and Southern Europe. It is very common in Central Asia as well as eastern Iran and Afghanistan. One parsimonious explanation would be that R1a1 spread with Kurgan males, along with Indo-European languages, on the order of 4,000-5,000 years ago.
There is a problem with this model though. One of the new papers reiterates the finding that the coalescence of the European and South Asian lineages is on the order of 10,000 years ago: “Separating the post-Glacial coancestry of European and Asian Y chromosomes within haplogroup R1a” (R1a1 is the dominant clade within R1a). A second paper reports the finding that R1a1 is very diverse in India, indicating deep time depth: “The Indian origin of paternal haplogroup R1a1* substantiates the autochthonous origin of Brahmins and the caste system.” For both R1a1 and “Ancestral North Indians” (ANI) in Reich et al., the frequency seems intuitively way too high among tribal populations, even in South India. Remember that the low bound for ANI was ~40%. R1a1 is found at frequencies as high as 25% or so among some South Indian tribals. If this lineage arrived with the Indo-Aryans, it is peculiar that it is found at such high frequencies in populations which were marginal and isolated from the dominant non-Indo-Aryan populations of South India. Back to Europe, here is a section from the abstract of the first paper:
Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.
The Holocene started 11,700 years ago. We are living in the Holocene. So that means that gene flow can’t be any later than roughly 6,000 years ago. The paper which focuses specifically on Indian lineages reports a coalescence time on the order of 10,000 years in the past for South Asian R1a1 branches. Additionally, they confirm earlier findings of a caste ranking of R1a1 in terms of frequency, as well as Brahmins having the most haplotype diversity of all groups (ergo, the title of the paper).
Both Dienekes and Polish Genetics and Anthropology suggest that the calibration is wrong on these coalescence times. They argue that one should reduce the time to a common ancestor by a factor of 3. This would of course make a huge difference. With regard to the Reich et al. paper, which argued for a plausible two-way admixture between ANI and “Ancestral South Indians” (ASI), the linkage disequilibrium has decayed too much since the time of admixture to peg a date. This was the method used to date the emergence of the Uyghurs as a hybrid population, on the order of 2-3 thousand years ago (admixture between two very different populations generates linkage disequilibrium which decays over time due to recombination). In terms of Fst, the ANI have a value in relation to Northern Europeans which is about 3 times larger than the mean between-population difference within Europe. This is somewhat greater than the pairwise values between any European populations except for those between the Baltic peoples (in particular, the swath from Karelia to Lithuania) and the groups of Southern Europe. The degree of Neolithic Middle Eastern ancestry within Europe is under debate, but I think one can assume that Southern Italians and Karelians are likely at opposite ends in terms of frequency of this contribution to the pre-Ice Age demographic substratum of Europe. From this I offer that it is not totally unreasonable to posit that the ANI contribution to South Asian ancestry dates closer to the margins of the last Ice Age than to the period of the Indo-European expansion, and that its Fst values are not unreasonable in relation to modern European groups.
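To make the two quantitative points above a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is not the method used in either R1a paper or in Reich et al.; it only illustrates what dividing a coalescence estimate by 3 does, and how quickly admixture linkage disequilibrium fades under the textbook decay rule D_t = D_0 * (1 - r)^t. The function names, the 25-year generation time, and the 1 cM spacing between loci (recombination fraction 0.01) are all my own assumptions for illustration.

```python
def rescale_coalescence(years, factor=3.0):
    """Divide a coalescence estimate (in years) by a calibration factor.

    The default of 3 is the correction suggested by the critics of the
    calibration; the function itself is purely illustrative.
    """
    return years / factor


def ld_remaining(generations, recomb_fraction):
    """Fraction of the initial admixture LD left between two loci.

    Textbook decay: each generation, recombination removes a fraction
    `recomb_fraction` of the remaining LD.
    """
    return (1.0 - recomb_fraction) ** generations


# The ~10,000 year coalescence estimate, rescaled by the suggested factor of 3.
print("recalibrated coalescence (years):", round(rescale_coalescence(10_000)))

# Assumptions (not from the papers): 25 years per generation,
# two loci 1 cM apart, i.e. a recombination fraction of about 0.01.
GENERATION_YEARS = 25
RECOMB_FRACTION = 0.01

for years_ago in (2_500, 5_000, 10_000):
    generations = years_ago / GENERATION_YEARS
    left = ld_remaining(generations, RECOMB_FRACTION)
    print(f"admixture {years_ago} years ago: {left:.3f} of initial LD remains")
```

The point of the loop is only qualitative: the older the admixture, the less linkage disequilibrium is left to measure, which is why an old ANI-ASI admixture is hard to date this way while a more recent one like the Uyghur case is tractable.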
The main issue that is confusing is the diversity of R1a1 in South Asia. A first order model going from just this data would be that R1a1 derives from India, and spread to the Eurasian plain. But Reich et al. show data that imply little likelihood of a South Asian contribution to European ancestry. The only possibility would be if ANI and ASI were totally separated when a branch of ANI left South Asia for the Eurasian plain, at which point the process of admixture between ANI and ASI began. Another possibility is that the distribution of R1a1 in Eurasia is a palimpsest. Recent work in ancient DNA is suggesting that inferring past distributions from contemporary ones may lead us astray. It could be that R1a1 was once far more diverse in Europe and Central Asia, but that subsequent demographic events eliminated most of that diversity, while such events did not occur in South Asia. Y chromosomal lineages may be particularly likely to be wiped out by the expansion of new tribes as old elites are killed or marginalized. The current distribution of a particular branch of R1a1 in Europe, associated in particular with Slavs, may be an expansion of the lineage which managed to survive elimination at some point in the mid-Holocene.
Though do note I put little weight in my speculations. It seems rather confusing. But since I was asked….