Fisherianism in the genomic era

There are many things about R. A. Fisher that one could say. Professionally he was one of the founders of evolutionary genetics and statistics, and arguably the second greatest evolutionary biologist after Charles Darwin. With his work in the first few decades of the 20th century he reconciled the quantitative evolutionary framework of the school of biometry with mechanistic genetics, and formalized evolutionary theory in The Genetical Theory of Natural Selection.

He was also an asshole. This is clear in the major biography of him, R.A. Fisher: The Life of a Scientist. It was written by his daughter. But The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century also seems to indicate he was a dick. And W. D. Hamilton’s Narrow Roads of Gene Land portrays Fisher as rather cold and distant, despite the fact that Hamilton idolized him.

Notwithstanding his unpleasant personality, R. A. Fisher seems to have been a veritable mentat in his early years. Much of his thinking crystallized in the first few decades of the 20th century, when genetics was a new science and mathematical methods were being brought to bear on a host of topics. It would be decades until DNA was understood to be the substrate of heredity. Instead of deriving from molecular first principles which were simply not known in that day, Fisher and his colleagues constructed a theoretical formal edifice which drew upon patterns of inheritance that were evident in lineages of organisms that they could observe around them (Fisher had a mouse colony which he utilized now and then to vent his anger by crushing mice with his bare hands). Upon that observational scaffold they placed a sturdy superstructure of mathematical formality. That edifice has been surprisingly robust down to the present day.

One of Fisher’s frameworks which still gives insight is the geometric model of the distribution of fitness of mutations. If an organism is near its optimum of fitness, then large jumps in any direction will reduce its fitness. In contrast, small jumps have some probability of getting closer to the optimum of fitness. In plainer language, mutations of large effect are bad, and mutations of small effect are not as bad.
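The geometric intuition is easy to check with a toy simulation. What follows is a minimal sketch, not anything from the paper: the function name fraction_beneficial, the choice of 10 trait dimensions, and the unit distance to the optimum are all assumptions made for illustration. It places a phenotype at distance z from an optimum at the origin, throws mutations of fixed size r in random directions, and counts how often the result lands closer to the optimum.

```python
import math
import random


def fraction_beneficial(r, n=10, z=1.0, trials=20000, seed=0):
    """Monte Carlo estimate of the chance that a random mutation of size r
    moves a phenotype closer to the optimum (placed at the origin).

    n is the number of trait dimensions and z the starting distance to the
    optimum; both are illustrative values, not estimates for any species.
    """
    rng = random.Random(seed)
    start = [z] + [0.0] * (n - 1)  # phenotype sitting at distance z
    hits = 0
    for _ in range(trials):
        # Random direction: normalize a vector of independent Gaussians.
        step = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(s * s for s in step))
        mutant = [p + r * s / norm for p, s in zip(start, step)]
        # Beneficial means strictly closer to the optimum than before.
        if math.sqrt(sum(c * c for c in mutant)) < z:
            hits += 1
    return hits / trials


for size in (0.01, 0.1, 0.5, 1.0, 2.5):
    print(size, fraction_beneficial(size))
```

Under these assumptions the smallest mutations are beneficial close to half the time, the fraction falls as the mutation size grows, and any mutation larger than twice the distance to the optimum can never help, because it overshoots no matter which direction it goes.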

A new paper in PNAS loops back to this framework, Determining the factors driving selective effects of new nonsynonymous mutations:

Our study addresses two fundamental questions regarding the effect of random mutations on fitness: First, do fitness effects differ between species when controlling for demographic effects? Second, what are the responsible biological factors? We show that amino acid-changing mutations in humans are, on average, more deleterious than mutations in Drosophila. We demonstrate that the only theoretical model that is fully consistent with our results is Fisher’s geometrical model. This result indicates that species complexity, as well as distance of the population to the fitness optimum, modulated by long-term population size, are the key drivers of the fitness effects of new amino acid mutations. Other factors, like protein stability and mutational robustness, do not play a dominant role.

In the title of the paper itself is something that would have been alien to Fisher’s understanding when he formulated his geometric model: the term “nonsynonymous” to refer to mutations which change the amino acid corresponding to the triplet codon. The paper is understandably larded with terminology from the post-DNA and post-genomic era, and yet comes to the conclusion that a nearly blind statistical geneticist from about a century ago correctly adduced the nature of mutation’s effects on fitness in organisms.

The authors focused on two primary species with different histories, both well characterized in the evolutionary genomic literature: humans and Drosophila. The models they tested are as follows:

[List of tested models from the paper not reproduced here.]

Basically they checked the empirical site frequency spectra (SFS) of the nonsynonymous variants against the outcomes expected under particular demographic histories, which were inferred from synonymous variation. Drosophila have effective population sizes orders of magnitude larger than humans, so if that is not taken into account, then the results will be off. There are also a bunch of simulations in the paper to check for robustness of their results, and they also caveat the conclusion with admissions that other models besides the Fisherian one may play some role in their focal species, and more in other taxa. A lot of this strikes me as accruing through the review process. I don’t have the time to replicate all the details to confirm their results, though I hope some of the reviewers did so (I suspect the reviewers were the ones demanding some of these checks, so in my opinion they definitely should have).
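For readers who have not run into a site frequency spectrum before, here is a minimal sketch of what one is. The function name site_frequency_spectrum and the toy genotypes are made up for illustration; this is not the authors' pipeline, which fits demographic and selection models to real spectra. The SFS simply tallies, across polymorphic sites, how many sampled chromosomes carry the derived allele.

```python
def site_frequency_spectrum(sites):
    """Unfolded SFS: entry k-1 counts the sites where the derived allele
    appears on exactly k of the n sampled chromosomes (1 <= k <= n-1)."""
    n = len(sites[0])
    sfs = [0] * (n - 1)
    for site in sites:
        k = sum(site)  # derived-allele count at this site
        if 0 < k < n:  # ignore sites that are not polymorphic in the sample
            sfs[k - 1] += 1
    return sfs


# Toy data (invented): rows are sites, columns are 4 sampled chromosomes,
# 0 = ancestral allele, 1 = derived allele.
synonymous = [[1, 0, 0, 0], [1, 1, 0, 0], [0, 1, 1, 1], [0, 0, 1, 0], [1, 0, 1, 0]]
nonsynonymous = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [1, 1, 0, 0]]

print(site_frequency_spectrum(synonymous))
print(site_frequency_spectrum(nonsynonymous))
```

In the toy data the nonsynonymous spectrum is piled up in the singleton class, which is the sort of skew toward rare variants one expects when selection keeps deleterious mutations from rising in frequency. The synonymous spectrum serves as the roughly neutral baseline that carries the demographic signal.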

In the Fisherian model more complex organisms are more fine-tuned due to pleiotropy and other such dynamics. So new mutations are more likely to deviate away from the optimum. This is the major finding that they confirmed. What does “complex” mean? The Drosophila genome is less than 10% of the human genome’s size, but the migratory locust has twice as large a genome as humans, while wheat has a sequence more than five times as large. But organism to organism, it does seem that Drosophila has less complexity than humans. And they checked with other organisms besides their two focal ones…though the genomes there are not as complete presumably.
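The role of complexity can be made concrete with Fisher's classic large-dimension approximation, in which the probability that a random mutation of size r is beneficial is 1 − Φ(r√n / 2z), where n is the number of trait dimensions, z the distance to the optimum, and Φ the standard normal CDF. The sketch below just evaluates that formula. The function name p_beneficial and the values of n are arbitrary choices for illustration and are not estimates of the complexity of flies or humans.

```python
import math


def p_beneficial(r, n, z=1.0):
    """Fisher's approximation to the probability that a random mutation of
    size r is beneficial, for an organism with n trait dimensions sitting
    at distance z from the optimum: 1 - Phi(r * sqrt(n) / (2 * z))."""
    x = r * math.sqrt(n) / (2.0 * z)
    # 1 - Phi(x) written with the complementary error function.
    return 0.5 * math.erfc(x / math.sqrt(2.0))


# Same mutation size, increasing "complexity" (illustrative values of n).
for n in (2, 10, 50, 200):
    print(n, round(p_beneficial(0.1, n), 3))
```

Holding mutation size and distance to the optimum fixed, the chance of a beneficial hit shrinks as n grows, which is the sense in which a more complex organism is more finely tuned.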

As I indicated above, the authors believe they’ve checked for factors such as background selection, which may confound selection coefficients on specific mutations. The paper is interesting not least because it illustrates how powerful analytic techniques developed in a pre-DNA era still are. Some of the models above are mechanistic, and require a certain understanding of the nature of molecular processes. And yet they don’t seem as predictive as a more abstract framework!

Citation: Christian D. Huber, Bernard Y. Kim, Clare D. Marsden, and Kirk E. Lohmueller, Determining the factors driving selective effects of new nonsynonymous mutations, PNAS 2017; published ahead of print April 11, 2017, doi:10.1073/pnas.1619508114

Razib Khan