After posting on Basque mtDNA I wanted to make more explicit something that I alluded to below: uniparental lineages are highly informative, but they may not be representative of total genome content. This is plainly true in the case of mestizos from Latin America, but we don’t need genetics to point us in the right direction on this score; we have plenty of textual evidence for asymmetry between the sexes when it came to admixture events in the post-Columbian era. Rather, I want to note again the issue of South Asia. When it comes to mtDNA, the good majority of South Asian lineages are closer to those of East Asia than Western Eurasia. By this, I do not mean to say that they’re particularly close to East Asian lineages, only that if you go back in the phylogeny, the South Asian lineages (I’m thinking here of haplogroup M) tend to coalesce first with East Asian lineages before they do so with West Eurasian lineages.
Here is a quote from one of the definitive papers on this topic:
Broadly, the average proportion of mtDNAs from West Eurasia among Indian caste populations is 17% (Table 2). In the western States of India and in Pakistan their share is greater, reaching over 30% in Kashmir and Gujarat, nearly 40% in Indian Punjab, and peaking, expectedly, at approximately 50% in Pakistan (Table 11, see Additional file 6, Figure 11, panel A). These frequencies demonstrate a general decline (SAA p < 0.05 Figure 4) towards the south (23%, 11% and 15% in Maharashtra, Kerala and Sri Lanka, respectively) and even more so towards the east of India (13% in Uttar Pradesh and around 7% in West Bengal and Bangladesh).
In Iran, over 90 percent of the mtDNA lineages seem West Eurasian. Though I accepted these findings, I was always a bit concerned that the 40-point chasm between Iran and Pakistan was so large. Additionally, the autosomal studies seem to show that Pakistani populations exhibit affinities to West Eurasians greater than would be predicted by being ~50 percent West Eurasian. And, as many of you no doubt know, the mtDNA does not align well with the Y chromosomal lineages, which seem to indicate a stronger affinity to West Eurasia.
The 2009 paper Reconstructing Indian Population History resolved some of these confusions. In it the authors inferred that South Asians were a compound population, about 50 percent West Eurasian and about 50 percent South Eurasian, with this latter component having distant affinities to East Asians, though still closer to them than to West Eurasians. In other words, the latter component could be easily aligned with the mtDNA, while the former made sense of the Y chromosomal lineages. According to the above paper the West Eurasian component (ANI) was present at 70-80 percent fractions in Pakistan at the total genome level. This is considerably above the 50 percent for mtDNA, and made more sense of the visible affinities of Pakistanis to West Eurasians on the phenotypic dimension. But look at the rapid drop-off in the mtDNA fraction.
Here’s a table I generated combining the drop off in ANI and mtDNA across the two papers:
In case you don’t know the geography of India: the West Eurasian mtDNA fraction falls off a cliff very quickly in Northwest India. In contrast, the autosomal ANI fraction drops, but not nearly as precipitously. The ratio of mtDNA to ANI is about 2:3 in Pakistan. In Bengal it is 1:5, but it is already 1:4 in Uttar Pradesh, which is closer geographically to Pakistan than to Bengal (though it is arguably more ecologically distinct from Pakistan, the linguistic dialects of Uttar Pradesh are far closer to those of Pakistan than to those of Bengal). I will let you develop your own story in this case, as there’s obviously a lot that could be said speculatively. Rather, I simply wanted to illustrate the reality that the differences between patterns in uniparental lineages and autosomal DNA can tell you a great deal, despite their disagreements on occasion.
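For anyone who wants to check the arithmetic behind those ratios, here is a rough sketch in Python. The mtDNA percentages and the ratios are the ones quoted in this post; the "implied ANI" figures are only back-calculated from those rounded ratios, and they are my assumption for illustration, not values taken from either paper. The dictionary and function names are made up for this example.

```python
from fractions import Fraction

# West Eurasian mtDNA percentages as quoted in the post
# (Bengal uses the "around 7%" figure for West Bengal and Bangladesh).
mtdna_pct = {"Pakistan": 50, "Uttar Pradesh": 13, "Bengal": 7}

# mtDNA:ANI ratios as stated in the post (rounded by the author).
stated_ratio = {
    "Pakistan": Fraction(2, 3),
    "Uttar Pradesh": Fraction(1, 4),
    "Bengal": Fraction(1, 5),
}


def implied_ani(region):
    """Back-calculate the ANI percentage implied by a stated ratio.

    This is an illustrative inversion (ANI = mtDNA / ratio), not a
    measured value from either paper.
    """
    return float(mtdna_pct[region] / stated_ratio[region])


def ratio_range(mtdna, ani_low, ani_high):
    """Return the (lowest, highest) mtDNA/ANI ratio over an ANI range."""
    return (mtdna / ani_high, mtdna / ani_low)


if __name__ == "__main__":
    for region in mtdna_pct:
        print(f"{region}: implied ANI is roughly {implied_ani(region):.0f}%")

    # Pakistan: 50% mtDNA against the 70-80% ANI range given in the post.
    low, high = ratio_range(50, 70, 80)
    print(f"Pakistan mtDNA/ANI ratio lies between {low:.2f} and {high:.2f}")
```

The Pakistan range brackets 2:3 (about 0.67), which is why I summarize it that way. Because the Uttar Pradesh and Bengal ratios are rounded, the implied ANI values should be read as ballpark figures only.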
Finally, I want to end on a somewhat different note:
Elevated frequencies of haplogroups common in eastern Eurasia are observed in Bangladesh (17%) and Indian Kashmir (21%) and may be explained by admixture with the adjacent populations of Tibet and Myanmar (and possibly further east: from China and perhaps Thailand).
These proportions are both higher than anything seen in the autosomal DNA. My parents are both 10-15 percent Southeast Asian in ancestry, and I am willing to bet that they’re slightly on the high side even for Bangladeshis (going by geography). As for Kashmiris, these populations do often show some East Asian admixture, but generally not so high as 20%. What explains this? I have posited that rather than being intrusive to Bengal, the East Asian populations (Munda?) may have been already present when Indo-Aryan speaking agriculturalists arrived. This could explain a sex bias toward females in the assimilation of these populations. In general my rule of thumb is that later population arrivals are correlated with a male bias in ancestry.