No studies necessary: do your own replication!


In response to the post below, I received the above reply on Twitter. This is an interesting case. The link goes to a paper from 2000, Alu insertion polymorphisms in NW Africa and the Iberian Peninsula: evidence for a strong genetic boundary through the Gibraltar Straits:

An analysis of 11 Alu insertion polymorphisms (ACE, TPA25, PV92, APO, FXIIIB, D1, A25, B65, HS2.43, HS3.23, and HS4.65) has been performed in several NW African (Northern, Western, and Southeastern Moroccans; Saharawi; Algerians; Tunisians) and Iberian (Basques, Catalans, and Andalusians) populations. Genetic distances and principal component analyses show a clear differentiation of NW African and Iberian groups of samples, suggesting a strong genetic barrier matching the geographical Mediterranean Sea barrier. The restriction to gene flow may be attributed to the navigational hazards across the Straits, but cultural factors must also have played a role. Some degree of gene flow from sub-Saharan Africa can be detected in the southern part of North Africa and in Saharawi and Southeastern Moroccans, as a result of a continuous gene flow across the Sahara desert that has created a south-north cline of sub-Saharan Africa influence in North Africa. Iberian samples show a substantial degree of homogeneity and fall within the cluster of European-based genetic diversity.


There are two issues at work here. The minor one is that I don’t necessarily disagree with the above study. The dense SNP-chip analyses do indicate a major division at the Mediterranean, and the preponderance of the ancestry of modern Iberians pre-dates the Moorish period. My only contention is that something on the order of 10 percent does seem to derive from an influx of Berbers and Arabs. That is a far higher proportion than is currently fashionable among historians, who tend to hold that migrations in the post-Roman world involved very small numbers, if they had any demographic effect at all. My own suspicion is that what we’re seeing is partly reproductive variance by class: a small number of elite lineages may have a large demographic impact over the long term.

But there’s a bigger point, and that’s open science and what “counts” as evidence in a given argument. Frankly, an 11-marker study from the year 2000 holds less sway for me than analyses of hundreds of thousands of SNPs today. Just because Dienekes doesn’t publish in journals doesn’t mean that his work isn’t worth citing seriously. In fact, some of the posts that he puts out are meatier than what you might find in some journals! More to the point, I’ve seen the same patterns as Dienekes in many cases, and that’s why I believe his results have some validity. In other words, I’ve replicated some of his findings. So have Zack and David. Sometimes there is disagreement. For many analyses and inspections of population structure you don’t need to look at the academic literature. That’s why I’ve put up a simple tutorial with scripts, instructions on how to use them, and a modest data set to go along. The academics are essential for proposing more powerful analytic techniques and subtle interpretations, as well as for obtaining data sets (e.g., ancient DNA) which others would not be able to get hold of. But there are now so many data sets in the public domain that one shouldn’t be citing something from even a few years ago at this point. The History and Geography of Human Genes is a great book, but you have more ability to crunch and analyze the underlying populations today in one day than the authors of that book had in the decades before it was published!
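To give a flavor of how little is needed to inspect population structure yourself, here is a minimal sketch in Python. To be clear, this is not the tutorial or the scripts mentioned above, and it is not what Dienekes, Zack, or David run. The function names `genotype_pca` and `simulate_two_populations` are made up for illustration, the data are simulated rather than real, and the method is a plain principal component analysis on standardized allele counts computed with NumPy.

```python
import numpy as np


def genotype_pca(genotypes, n_components=2):
    """PCA on an (individuals x SNPs) matrix of 0/1/2 allele counts.

    Hypothetical helper for illustration only.
    """
    g = np.asarray(genotypes, dtype=float)
    # Sample allele frequency at each SNP.
    freqs = g.mean(axis=0) / 2.0
    # Drop SNPs with no variation, which cannot be standardized.
    keep = (freqs > 0) & (freqs < 1)
    g, freqs = g[:, keep], freqs[keep]
    # Center each SNP and scale by its expected binomial standard deviation.
    z = (g - 2.0 * freqs) / np.sqrt(2.0 * freqs * (1.0 - freqs))
    # The left singular vectors times the singular values are the PC scores.
    u, s, _ = np.linalg.svd(z, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def simulate_two_populations(n_per_pop=50, n_snps=2000, seed=0):
    """Simulate two populations whose allele frequencies have drifted apart.

    Purely synthetic data; the parameters are arbitrary assumptions.
    """
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.1, 0.9, n_snps)
    shift = rng.normal(0.0, 0.1, n_snps)
    freq_a = np.clip(base + shift, 0.05, 0.95)
    freq_b = np.clip(base - shift, 0.05, 0.95)
    # Each genotype is the number of copies (0, 1, or 2) of one allele.
    pop_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
    labels = np.array([0] * n_per_pop + [1] * n_per_pop)
    return np.vstack([pop_a, pop_b]), labels


if __name__ == "__main__":
    genos, labels = simulate_two_populations()
    pcs = genotype_pca(genos)
    for pop in (0, 1):
        print("population", pop, "mean PC1:", pcs[labels == pop, 0].mean())
```

With real data you would swap the simulated matrix for genotypes loaded from a public data set. The point is only that the first principal component is enough to pull apart two differentiated groups, and that checking a claim like that takes minutes.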

Razib Khan