Genetics of Maharashtra Deshastha Brahmin
A Maharashtra Deshastha Brahmin sent me his sample. He plots with the Maharashtra Kayastha. He’s much more like a South Indian Brahmin than a North Indian Brahmin. The Maharashtra Saraswat Brahmin seems more north shifted.
The Hui Muslims, two pulses of “western” ancestry 1,000 and 500 years ago, mostly male mediated
There are 20 million Hui people in China. These are traditionally Chinese-speaking Muslims. Though they are found in every region of China (and in the Chinese Diaspora), they are concentrated […]
Merry Christmas!
I got a sample from someone where one parent was a West Bengal Sadgop, and the other parent a Baidya with family origins in East Bengal. One hypothesis that I’ve seen is that Baidya are basically Brahmins who lost their caste. Genetically this does not seem to be the case. Bengali Brahmins shift considerably toward the … Continue reading Merry Christmas!
First AASI mtDNA genomes from Sri Lanka (2500 and 5500 BC)
The mitochondrial genomes of two Pre-historic Hunter Gatherers in Sri Lanka: Sri Lanka is an island in the Indian Ocean connected by the sea routes of the Western and Eastern worlds. Although settlements of anatomically modern humans date back to 48,000 years, to date there is no genetic information on pre-historic individuals in Sri Lanka. … Continue reading First AASI mtDNA genomes from Sri Lanka (2500 and 5500 BC)
The Todas are more like IVC people than anyone else
I noticed something interesting a few weeks ago in the supplements of the GenomeAsia 100K paper. Look at where the Toda are on the PCA. Now look at the Indus Valley samples I have…. I don’t have access to the Toda samples. But there’s a lot of evidence that this is a very unique … Continue reading The Todas are more like IVC people than anyone else
The varieties of Brahmins (and others)
Sometimes people pass me data. Turns out Rajasthani Brahmins are quite different from UP Brahmins (more northwest-shifted). In this, they are like Pandits. In contrast, Bihar Babhans are just like UP Brahmins, who don’t seem to have much structure. Gujarati Brahmins are between South Indian Brahmins and North Indian Brahmins, and closer to the latter, … Continue reading The varieties of Brahmins (and others)
Adivasis are just like everyone else…sort of…but not
My previous post on Adivasis was not totally clear. So I’m going to try in shorter fragments and outline things so I’m more clear. I am not 100% correct with the model below (we’ll know more later), but this is my best current conception. 10,000 BC, end of the Ice Age, NW quadrant of the … Continue reading Adivasis are just like everyone else…sort of…but not
Bangladesh and West Bengal genetics
I got a few more samples with provenance. The Bengali Brahmins are shifted the way you would expect. The Bangladesh Kayastha (someone from a Hindu background) is in the cluster with generic Bangladeshis from Dhaka. The West Bengali Kayastha is far less East Asian. My current model is that the Kayasthas are basically … Continue reading Bangladesh and West Bengal genetics
Genetic distances across the world
There was some discussion online about variation among South Asians. I decided to compute a few pairwise Fst statistics (a measure of between-population variation) for some South Asian, European and East Asian populations (along with Iranians). I plot them below in two graphs. I also ran Treemix. I don’t have any major conclusion, just draw your … Continue reading Genetic distances across the world
The Anglo-Saxonization of England happened through a mass migration
The Anglo-Saxon migration and the formation of the early English gene pool: The history of the British Isles and Ireland is characterized by multiple periods of major cultural change, including […]
The southern arc papers
Since David has not posted, here they are… The genetic history of the Southern Arc: A bridge between West Asia and Europe: By sequencing 727 ancient individuals from the Southern […]
The genetic future is here
In the year 2000, there was one single human genome. In 2010 there were fewer than 100 human genomes (you could look them up in a spreadsheet!). Today there are likely 1,000,000 human genomes. Good luck cataloging them all. Outside of the purview of our species, there are now efforts to sequence every animal on earth. And the sequencing revolution has not just changed our understanding of DNA, it has opened up the world of RNA to us, allowing scientists to track and trace gene expression in minute detail. Genomics is “eating biology.”
Whereas there was once a tiny data pond, today a substantial lake is swelling into a massive ocean. This is why we built GenRAIT — to help transition the burgeoning ecosystem of 21st-century genetics into the new age of genomics. Data offers the potential for insight and discovery. Data on life’s code — the genome — can potentially transform the future of human health outcomes. This makes “data” more than just a buzzword, but the key to unlocking the potential for a better world. But the influx and quantity of genome data in our new era threaten to overwhelm the capability of scientists to manage, utilize, and harness it, making that reality we wish to come into being unreachable. We want to push beyond that impasse with GenRAIT, and unlock the potential future.
But what happens when the data is finally brought under control? Data without an end is without purpose. What might the genetic future look like? Why do we at GenRAIT care so much?
One generation ago, sequencing one’s genome was “blue sky” science, whereas today it’s a consumer commodity. Companies like Nebula Genomics provide 30x high-quality medical-grade sequencing to consumers for $300 or less. With the average cost of health insurance for a family more than $1,000 a month, the cost of sequencing one’s genome is trivial. And whereas buying a car or other consumer item means acquiring a depreciating good whose value declines over time, your genome sequence becomes more valuable as more research is published on the relationship between genetics and disease.
The more data you have in the pool the more results and findings you can obtain. Thirty years ago detecting a genetic variant that might cause a disease required tracking an inbred pedigree for decades. It was a project only viable for a hospital research group. But science moves forward. Fifteen years ago geneticists began to perform “genome-wide associations” that looked for common variation — those genes commonly causing disease within the population. This is the sort of result a company like 23andMe provides.
But there is more to the genome than things that are known and common. Many illnesses are caused by variations within families and narrow local lineages. If common variants are known unknowns, these are unknown unknowns, and only whole-genome sequences can give us insight. We have the technology, but we lack execution. Every individual’s joint medical and genomic information could be powerful, but only in the context of population-wide analysis of subtle but cumulatively significant patterns. You can only perceive the trees if you can see the forest. The value of one sequence goes up by orders of magnitude when you analyze it in the context of one billion sequences.
As we go into the 21st century, genomics will help us do more than diagnose and evaluate retrospectively. It will be essential to cure, treat, and anticipate the future. An individual’s genome can give doctors a map of how to cater to an individualized healthcare plan. That same genome can be used to prescribe lifestyle changes to improve that person’s future well-being and increase their longevity, impacting morbidity and mortality. It can be used to conserve and save endangered species and help them evolve to better adapt to the present and future environments. We now can imagine a future that can be edited and revised because of new technology.
In 2012 CRISPR genetic-engineering technology took the biological world by storm (and yielded Jennifer Doudna and colleagues a Nobel Prize), making gene-editing available to the broad masses of researchers. Though recombinant DNA technology has been utilized by scientists since the 1970s, it was a form of genetic engineering that was expensive and difficult to execute. CRISPR democratized genetic engineering, opening up the possibility that gene-editing could be a bespoke process, offering up the possibility of curing millions of people with congenital illnesses. Diseases like cystic fibrosis will likely be cured in the next twenty years through gene-editing technology.
Nevertheless, to get to that stage, we need the right environment in place to allow scientists to extract valuable information, patterns, and insights out of the genomes they receive. Before one can write to the genome, one must read the genome. Before one can develop engineering applications, one must master physics. We are already in the genomic age, as sequencing costs keep crashing and new technologies are on the horizon. But the flood of data threatens to overwhelm our capacity to use it rationally, intelligently, and effectively. As the NIH states, “Our ability to sequence DNA has far outpaced our ability to decipher the information it contains.” We must do better because the well-being of hundreds of millions is on the line. We have the data necessary to usher in a better future for healthcare and precision medicine. Now we need to unlock it.
The Data Platform for the Genomic Revolution
Introducing GenRAIT to the post-genomic era. The human genetic map became reality in the first two decades of the 21st century. This was the dream of a century of genetics, laboriously tracing pedigrees across families decade after decade. But the combin…
Thank God the British are working on South Asian genomics
The sequences of 150,119 genomes in the UK Biobank: We defined two other cohorts based on ancestry: African (XAF; n = 9,633; Extended Data Fig. 4) and South Asian (XSA; n = 9,252; Extended Data Fig. 5) (Fig. 3a–c). The 37,598 UKB individuals who do not belong to XBI, XAF or XSA were assigned to the cohort OTH (others). … Continue reading Thank God the British are working on South Asian genomics
The Toda are different
A new paper on Southwest Indian genetics highlights the Toda sample from GenomeAsia. People in the comments of this weblog have asserted this small southern tribe may have the most “Indus Valley Civilization” ancestry in the subcontinent. This is perhaps an exaggeration, but, looking at the admixture plots, the Toda clearly have hardly any … Continue reading The Toda are different
Global 25 is good, but a minor issue
ArainGang has posted a pretty interesting map of various ancestry components in the subcontinent by population. It’s pretty good, especially for the south and west of the subcontinent. But there is something weird going on in the northeast: a lot of these populations have “Ancestral Indian” (Andamanese) ancestry but hardly anything else East Asian. This … Continue reading Global 25 is good, but a minor issue
Is ancient DNA a biased view?
Over at my Substack, Iberia: Ancient Europe’s Edge of the Earth (part 1) – Unpacking prehistoric Spanish and Portuguese genetics elicited a comment from Walter Bodmer questioning the representativeness of […]
Nick Patterson responds to Feldman and Riskin’s NYRB piece
Nick Patterson has responded on his Substack to the NYRB piece Why Biology is not Destiny, which itself is an attack on Kathryn Paige Harden’s book The Genetic Lottery. Patterson […]
When Surya left Olga of the Birch Forest
In the recent film The Northman the protagonist, Amleth, has a romantic relationship with a woman, “Olga of the Birch Forest.” Amleth was a Viking who raided Kievan Rus, and […]