Genetics is One: Mendelism and quantitative traits


In the early 20th century there was a rather strange (in hindsight) debate between two groups of biological scientists attempting to understand the basis of inheritance and its relationship to evolutionary processes. The two factions were the biometricians and Mendelians. As indicated by their appellation, the Mendelians were partisans of the model of inheritance formulated by Gregor Mendel. Like Mendel, many of these individuals were experimentalists, with a rough & ready qualitative understanding of biological processes. William Bateson was arguably the model’s most vociferous promoter. Set against the Mendelians were more mathematically minded thinkers who viewed themselves as the true inheritors of the mantle of Charles Darwin. Though the grand old patron of the biometricians was Francis Galton, the greatest expositor of the school was Karl Pearson.* Pearson, along with the zoologist W. F. R. Weldon, defended Charles Darwin’s conception of evolution by natural selection during the darkest days of what Peter J. Bowler terms “The Eclipse of Darwinism”.** One aspect of Darwin’s theory as laid out in The Origin of Species was gradual change through the operation of natural selection upon extant genetic variation. There was a major problem with the model which Darwin proposed: he could offer no plausible mechanism for the mode of inheritance. Like many of his peers Charles Darwin implicitly assumed a blending model of inheritance, so that the offspring would be an analog constructed about the mean of the parental values. But as any old schoolboy knows, the act of blending diminishes variation! This, along with other concerns, resulted in a general tendency in the late 19th century to accept the brilliance of the idea of evolution as descent with modification, but dismiss the motive engine which Charles Darwin proposed, gradual adaptation via natural selection upon heritable variation.

Mendel’s theory of inheritance rescued Darwinism from the problem of gradual diminution of natural selection’s raw material through the process of sexual reproduction. Yet due to personal and professional rivalries many did not see in Mendelism the salvation of evolutionary theory. Pearson and the biometricians scoffed at Bateson and company’s innumeracy. They also argued that the qualitative distinctions in trait value generated by Mendel’s model could not account for the wide range of continuous traits which were the bread & butter of biometrics, and therefore natural selection itself. Some of the Mendelians also engaged in their own flights of fancy, seeing in the large effect mutations which they were generating in the laboratory an opening for the possibility of saltation, rendering Darwinian gradualism absolutely moot.

There were great passions on both sides. The details are impeccably recounted in Will Provine’s The Origins of Theoretical Population Genetics. Early on in the great debates the statistician G. U. Yule showed how Mendelism could be reconciled with biometrics. But his arguments seem to have fallen on deaf ears. Over time the controversy abated as biometricians gave way to the Mendelians through a process of attrition. Weldon’s death in 1906 was arguably the clearest turning point, but it took a young mathematician to finish the game and fuse Mendelism and biometrics together and lay the seeds for a hybrid theoretical evolutionary genetics.

That young mathematician was R. A. Fisher. Fisher’s magnum opus is The Genetical Theory of Natural Selection, and his debates with the American physiologist and geneticist Sewall Wright laid the groundwork for much of evolutionary biology in the 20th century. Along with J. B. S. Haldane they formed the three-legged population genetic stool upon which the Modern Neo-Darwinian Synthesis would come to rest. Not only was R. A. Fisher a giant within the field of evolutionary biology, but he was also one of the founders of modern statistics. But those accomplishments were of the future; first he had to reconcile Mendelism with the evolutionary biology which came down from Charles Darwin. He did so with such finality that the last embers of the debate were finally doused, and the proponents of Mendelism no longer needed to be doubters of Darwin, and the devotees of Darwin no longer needed to see in the new genetics a threat to their own theory.

One of the major issues at work in the earlier controversies was one of methodological and cognitive incomprehension. William Bateson was a well known mathematical incompetent, and he could not follow the arguments of the biometricians because of their quantitative character. But no matter, he viewed it all as sophistry meant to obscure, not illuminate, and his knowledge of concrete variation in form and the patterns of inheritance suggested that Mendelism was correct. The coterie around Karl Pearson may have slowly been withering, but the powerful tools which the biometricians had pioneered were just waiting to be integrated into a Mendelian framework by the right person. By 1911 R. A. Fisher believed he had done so, though he did not write the paper until 1916, and it was published only in 1918. Titled The Correlation Between Relatives on the Supposition of Mendelian Inheritance, it was dense, and often cryptic in the details. But the title itself is a pointer as to its aim, correlation being a statistical concept pioneered by Francis Galton, and the supposition of Mendelian inheritance being the model he wished to reconcile with classical Darwinism in the biometric tradition. And in this project Fisher had a backer with an unimpeachable pedigree: a son of Charles Darwin himself, Leonard Darwin.

You can find this seminal paper online, at the R. A. Fisher digital archive. Here is the penultimate paragraph:

In general, the hypothesis of cumulative Mendelian factors seems to fit the facts very accurately. The only marked discrepancy from existing published work lies in the correlation for first cousins. Snow, owing apparently to an error, would make this as high as an avuncular correlation; in our opinion it should differ by little from that of the great-grandparent. The values found by Miss Elderton are certainly extremely high, but until we have a record of complete cousinships measured accurately and without selection, it will not be possible to obtain satisfactory numerical evidence on this question. As with cousins, so we may hope that more extensive measurements will gradually lead to values for the other relationship correlations with smaller standard errors. Especially would more accurate determinations of the fraternal correlation make our conclusions more exact.

I have to admit that at the best of times R. A. Fisher can be a difficult prose stylist to follow. One might wish to add from a contemporary vantage point that his language has a quaint and dated feel which compounds the confusion, but the historical record is clear that contemporaries had great difficulty in teasing apart distinct elements in his argument. Much of this was due to the mathematical aspect of his thinking; most biologists were simply not equipped to follow it (as late as the 1950s biologists at Oxford were dismissing Fisher’s work as that of a misguided mathematician according to W. D. Hamilton). In the text of this paper there are the classic jumps and mysterious connections between equations along the chain of derivation which characterize much of mathematics. The problem was particularly acute with Fisher because his thoughts were rather deep and fundamental, and he could hold a great deal of complexity in his mind. Finally, there are extensive tables and computations of correlations of pedigrees from that period drawn from biometric research which seem extraneous to us today, especially if you have Mathematica handy.

But the logic behind The Correlation Between Relatives on the Supposition of Mendelian Inheritance is rather simple: in the patterns of correlations between relatives, and the nature of variance in trait value across those relatives, one could perceive the nature of Mendelian inheritance. It was Mendelian inheritance which could explain most easily the patterns of variation across continuous traits as they were passed down from parent to offspring, and as they manifested across a pedigree. Early on in the paper Fisher observes that a measured correlation between father and son in stature is 0.5. From this one can explain 1/4 of the variance in height across the set of possible sons. This biological relationship is just a specific instance of the coefficient of determination: how much of the variance in a value, Y (sons’ heights), you can predict from the variance in X (fathers’ heights). Correcting for sex one can do the same for mothers and their sons (and inversely, fathers and their daughters).*** So combining the correlations of the parents to their offspring you can explain about half of the variance in the offspring height in this example (the correlation is higher in contemporary populations, probably because of much better nutrition in the lower orders). But you need not constrain yourself to parent-child correlations. Fisher shows that correlations across many sorts of relationships (e.g., grandparent-grandchild, sibling-sibling, uncle-niece/nephew) have predictive value, though the correlation will be a function of genetic distance.
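The arithmetic in that paragraph can be sketched in a few lines. This is only an illustration, not anything from Fisher's paper: the function names are my own, and the correlation between the two parents (zero by default, i.e. no assortative mating) is an assumption I have added so that the two parental contributions simply sum.

```python
def variance_explained(r):
    """Coefficient of determination for a single predictor: r squared."""
    return r ** 2


def combined_variance_explained(r_father, r_mother, r_between_parents=0.0):
    """Multiple R^2 for predicting offspring from both parents.

    Standard two-predictor formula. r_between_parents is an assumed
    parameter (0.0 means the parents' trait values are uncorrelated).
    """
    numerator = (r_father ** 2 + r_mother ** 2
                 - 2 * r_father * r_mother * r_between_parents)
    return numerator / (1 - r_between_parents ** 2)


# A father-son correlation of 0.5 accounts for a quarter of the variance.
one_parent = variance_explained(0.5)

# Two uncorrelated parents, each correlated 0.5 with the child: one half.
both_parents = combined_variance_explained(0.5, 0.5)
```

With uncorrelated parents the two quarters add up to the "about half" quoted above; any correlation between the parents changes that sum, which is why the third parameter is exposed.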

What does correlation, a statistical value, have to do with Mendelism? Remember, Fisher argues that it is Mendelism which can explain in the details patterns of correlations on continuous traits. There were peculiarities in the data which biometricians explained with abstruse and ornate models which do not bear repeating, so implausible was the chain of conjectures. It turns out that Mendelism is not only the correct explanation for inheritance, but it is elegant and parsimonious when set next to the proposed alternatives which had equivalent explanatory power. A simple blending model could not explain the complexity of life’s variation, so more complex blending models emerged. But it turned out that a simple Mendelian model explained complexity just as well, and so the epicycles of the biometricians came crashing down. Mendelism was for evolutionary biology what the Copernican model was for planetary astronomy.

To a specific case where Mendelism is handy: in the data Fisher noted that the height of a sibling can explain 54% of the variance of height of other siblings, while the height of parents can explain only 40% of that of their offspring. Why the discrepancy? It is noted in the paper that the difference between identical twins is marginal, and other workers had suggested that the impact of environment could not explain the whole residual (what remains after the genetic component). Though later researchers observe that Fisher’s assumptions here were too strong (or at least the state of the data on human inheritance at the time misled him) the big picture is that siblings have a component of genetic correlation which they share with each other which they do not share with their parents, and that is the fraction accounted for by dominance. When dominance is included in the equation heritability is referred to as the “broad sense,” while when dominance is removed it is termed “narrow sense.”
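The effect of dominance on these two correlations can be checked by brute-force enumeration at a single locus. What follows is a minimal sketch under assumptions of my own, not Fisher's derivation: one biallelic locus, random mating, Hardy-Weinberg proportions, and no environmental variance (so it will not reproduce the 54% and 40% figures). The function names and the phenotype encoding are hypothetical.

```python
from fractions import Fraction
from itertools import product


def offspring_dist(gm, gf):
    """Child genotype distribution given parents carrying gm and gf copies of allele 'a'."""
    tm, tf = Fraction(gm, 2), Fraction(gf, 2)  # chance each parent transmits 'a'
    return {0: (1 - tm) * (1 - tf),
            1: tm * (1 - tf) + (1 - tm) * tf,
            2: tm * tf}


def relative_correlations(q, phenotype):
    """Return (parent-offspring, full-sibling) phenotypic correlations.

    q is the frequency of allele 'a'; phenotype maps the number of 'a'
    copies (0, 1, 2) to a trait value.
    """
    p = 1 - q
    hw = {0: p * p, 1: 2 * p * q, 2: q * q}  # Hardy-Weinberg proportions
    mean = sum(hw[g] * phenotype[g] for g in hw)
    var = sum(hw[g] * phenotype[g] ** 2 for g in hw) - mean ** 2
    e_parent_child = Fraction(0)
    e_sib_sib = Fraction(0)
    for gm, gf in product(hw, repeat=2):
        weight = hw[gm] * hw[gf]  # random mating
        child = offspring_dist(gm, gf)
        child_mean = sum(pr * phenotype[k] for k, pr in child.items())
        e_parent_child += weight * phenotype[gm] * child_mean
        # siblings are independent draws once the parents are fixed
        e_sib_sib += weight * child_mean ** 2
    return (e_parent_child - mean ** 2) / var, (e_sib_sib - mean ** 2) / var


q = Fraction(1, 50)
additive = relative_correlations(q, {0: 0, 1: 1, 2: 2})
recessive = relative_correlations(q, {0: 0, 1: 0, 2: 1})
```

In the purely additive case both correlations come out at exactly 1/2. For a rare fully recessive trait at q = 1/50 the parent-offspring correlation drops to 1/51 while the sibling correlation is 53/204, roughly 0.26: siblings resemble each other far more than they resemble a parent, which is the dominance component at work.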

A concept such as dominance can of course be easily explained by Mendelism, at least formally (the physiological basis of dominance was later a point of contention between Fisher and Sewall Wright). Most of you have seen a Punnett square, whereby heterozygous parents will produce offspring in ratios where 50% are heterozygous, and 25% one homozygote and 25% another. But consider a scenario where one parent is a heterozygote, and the other a homozygote for the dominant trait. Both parents will express the same trait value, as will their offspring. But there will be a decoupling of the correlation between trait value and genotype here, as the offspring will be genotypically variant. Parent-offspring correlations along the regression line become distorted by a dominance parameter, and so the correlations are reduced. In contrast, full siblings share the same dominance effects because they share the same parents and can potentially receive the same identical by descent alleles twice. Consider a rare recessively expressed allele, one for cystic fibrosis. As it is rare in a population, in almost all cases where the offspring are homozygotes for the disease-causing allele, both parents will be heterozygotes. They will not express the disease because of its recessive character. But 25% of their offspring may because of the nature of Mendelian inheritance. So there’s a major possible disjunction between trait values from the parental to offspring cohorts. On the other hand, each sibling has a 25% chance of expressing the disease, and so the correlation is much higher than that with the parents (who do not express disease). In other words siblings can resemble each other much more than they may resemble either parent! This makes intuitive sense when you consider the inheritance constraints and features of Mendelism in diploid sexual species. But obviously a simple blending model can account for this. What it cannot account for is the persistence of variation.

It is through the segregation of independent Mendelian alleles, and their discrete and independent reassortment, that one can see how variation would not only persist from generation to generation, but manifest within families as alleles across loci shake out in different combinations. A simple model of inheritance can then explain two specific phenomena which are very different from each other.
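The contrast between blending and particulate inheritance can be made concrete with exact arithmetic. This is a toy sketch under assumptions of my own: random mating, an infinite population, no selection or mutation, and a trait value equal to the number of copies of one allele at a single locus. All names here are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Starting population: trait values 0, 1, 2 in proportions 1/4, 1/2, 1/4.
START = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def variance(dist):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(p * v for v, p in dist.items())
    return sum(p * (v - mean) ** 2 for v, p in dist.items())


def blend_generation(dist):
    """Blending: every child is the exact average of two random parents."""
    new = {}
    for (x, px), (y, py) in product(dist.items(), repeat=2):
        child = Fraction(x + y, 2)
        new[child] = new.get(child, 0) + px * py
    return new


def mendel_generation(dist):
    """Mendelian: each parent passes on one of its two alleles at random."""
    new = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for (gm, pm), (gf, pf) in product(dist.items(), repeat=2):
        tm, tf = Fraction(gm, 2), Fraction(gf, 2)
        new[0] += pm * pf * (1 - tm) * (1 - tf)
        new[1] += pm * pf * (tm * (1 - tf) + (1 - tm) * tf)
        new[2] += pm * pf * tm * tf
    return new


def variance_trajectory(step, start, generations):
    """Variance at generation 0, 1, ..., generations under the given rule."""
    dist, out = dict(start), []
    for _ in range(generations + 1):
        out.append(variance(dist))
        dist = step(dist)
    return out


blending = variance_trajectory(blend_generation, START, 3)
mendelian = variance_trajectory(mendel_generation, START, 3)
```

Under blending the variance halves every generation (1/2, 1/4, 1/8, 1/16), while under Mendelian segregation it sits at 1/2 indefinitely, because the genotype proportions are regenerated intact each generation.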

There is much in Fisher’s paper which prefigures later work, and much which is rooted in somewhat shaky pedigrees and biometric research of his day. The take home is that Fisher starts from an a priori Mendelian model, and shows how it could cascade down the chain of inferences and produce the continuous quantitative characteristics we see all around us. From the Hardy-Weinberg principle he drills down through the inexorable layers of logic to generate the formalisms which we associate with heritability, thick with variance terms. The Correlation Between Relatives on the Supposition of Mendelian Inheritance was a marriage between what was biometrics and Mendelism which eventually gave rise to population genetics, and forced the truce between the seeds of that domain and what became quantitative genetics.
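How discrete factors add up to something that looks continuous can be shown with a small calculation. This is a sketch under strong simplifying assumptions that I have chosen for illustration (they are not Fisher's general model): loci of equal, purely additive effect, independent of one another, each in Hardy-Weinberg proportions with the same allele frequency, and no environmental noise. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def one_locus(p):
    """Hardy-Weinberg proportions for 0, 1 or 2 copies of the 'plus' allele."""
    q = 1 - p
    return [q * q, 2 * p * q, p * p]


def convolve(a, b):
    """Distribution of the sum of two independent integer-valued scores."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def polygenic_distribution(n_loci, p=Fraction(1, 2)):
    """Exact distribution of the total 'plus' allele count across n_loci loci."""
    dist = [Fraction(1)]  # with zero loci the score is 0 with certainty
    for _ in range(n_loci):
        dist = convolve(dist, one_locus(p))
    return dist


single = polygenic_distribution(1)   # three classes: 1/4, 1/2, 1/4
many = polygenic_distribution(10)    # 21 classes, binomial in shape
```

One locus gives the familiar three classes; ten loci already give 21 finely graded classes whose frequencies follow a binomial with 20 trials, which is hard to tell apart from a bell curve once any measurement error is added.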

As I said, the paper itself is dense, often opaque, and characterized by a prose style that lends itself to exegesis. But I find that it is often useful to see the deep logics behind evolution and genetics laid bare. Some of the issues which we grapple with today in the “post-genomic era” have their intellectual roots in this period, and in Fisher’s work, which showed that quantitative continuous traits and discrete Mendelian characters were one and the same. The “missing heritability” hinges on the fact that classical statistical techniques tell us that Mendelian inheritance is responsible for the variation of many traits, but modern statistical biology which has recourse to the latest sequencing technology has still not been able to crack that particular nut with satisfaction. Perhaps decades from now biologists will look at the “missing heritability” debate and laugh at the blindness of current researchers, when the answer was right under their noses. Alas, I suspect that we live in the age of Big Science, and a lone genius is unlikely to solve the riddle on his lonesome.

Citation: Fisher, R. A. (1918). The correlation between relatives on the supposition of Mendelian inheritance. Transactions of the Royal Society of Edinburgh.

* Though I will spare you the details, it may be that the Galtonians were by and large more Galtonian than Galton himself! It seems that Francis Galton was partial to William Bateson’s Mendelian model.

** To be fair, I believe the phrase was originally coined by Julian Huxley.

*** Just use standard deviation units.

Image Credit: Wikimedia