The power of false positives in behavior genetics

Genomes Unzipped, “Size matters, and other lessons from medical genetics”:

Smaller studies, which had no power to detect these small effects, were essentially random p-value generators. Sometimes the p-values were “significant” and sometimes not, without any correlation to whether a variant was truly associated. Additionally, since investigators were often looking at only a few variants (often just one!) in a single gene that they strongly believed to be involved in the disease, they were often able to subset the data (splitting males and females, for example) to find “significant” results in some subgroup. This, combined with a tendency to publish positive results and leave negative results in a desk drawer, resulted in a conflicted and confusing body of literature which actively retarded medical genetics progress.
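To make the subgroup point concrete, here is a minimal simulation sketch. It assumes the simplest possible case: no true association at all, a trait with known unit variance, and a simple z-test. The function names, sample sizes, and the "male/female" split are all hypothetical choices for illustration, not anything taken from the Genomes Unzipped post. Each simulated study gets three looks at the data (everyone, males only, females only) and counts as a "finding" if any look crosses the threshold.

```python
import math
import random


def two_sided_p(cases, controls):
    """Two-sided z-test p-value for a difference in means (variance assumed to be 1)."""
    diff = sum(cases) / len(cases) - sum(controls) / len(controls)
    z = diff / math.sqrt(1 / len(cases) + 1 / len(controls))
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_subgroup_fishing(n_studies=4000, n_per_arm=40, alpha=0.05, seed=1):
    """Return (rate significant in the whole sample, rate significant in any of 3 looks)."""
    rng = random.Random(seed)
    hits_whole = 0
    hits_any = 0
    half = n_per_arm // 2
    for _ in range(n_studies):
        # Assumption: cases and controls come from the same distribution (no true effect).
        cases = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        controls = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        # Hypothetical split: first half of each arm is "male", second half "female".
        p_values = [
            two_sided_p(cases, controls),
            two_sided_p(cases[:half], controls[:half]),
            two_sided_p(cases[half:], controls[half:]),
        ]
        hits_whole += p_values[0] < alpha
        hits_any += min(p_values) < alpha
    return hits_whole / n_studies, hits_any / n_studies


if __name__ == "__main__":
    whole, any_look = simulate_subgroup_fishing()
    print(f"significant in whole sample: {whole:.3f}")
    print(f"significant in whole sample or a subgroup: {any_look:.3f}")
```

Under these assumptions the whole-sample false positive rate should sit near the nominal 5%, while the "any look" rate is necessarily higher, because every extra subgroup is another draw from the same lottery.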

An easy thing to pick on is the reliance on “p-values” and thresholds of statistical significance. Just because something is statistically significant doesn’t mean that it is substantive. Statistical significance is just a number, and blindly adhering to a numerical standard in most human endeavors often results in creeping bias and “gaming” of the measurement. When there is no real effect, p-values are scattered at random between 0 and 1, so about one study in twenty will land below the 0.05 threshold by chance alone; for publication you just need to fish in that pool. It just goes to show that you can’t beat taking a step back and actually thinking about what your results mean and how you came to them.
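The "random distribution of p-values" can be sketched in a few lines. This is only an illustration under stated assumptions: every simulated study tests a true null with a one-sample z-test on 30 observations of known unit variance, and the function names are hypothetical. The sketch bins the resulting p-values into deciles and counts how many fall below 0.05, which is the pool a publication filter would be fishing in.

```python
import math
import random


def null_p_value(rng, n=30):
    """Two-sided z-test p-value for one study in which the true mean is 0."""
    sample_mean = sum(rng.gauss(0, 1) for _ in range(n)) / n
    z = sample_mean * math.sqrt(n)  # known unit variance is an assumption
    return math.erfc(abs(z) / math.sqrt(2))


def p_value_deciles(n_studies=10000, seed=7):
    """Return (share of p-values in each tenth of [0, 1], share below 0.05)."""
    rng = random.Random(seed)
    counts = [0] * 10
    below_threshold = 0
    for _ in range(n_studies):
        p = null_p_value(rng)
        counts[min(int(p * 10), 9)] += 1
        below_threshold += p < 0.05
    return [c / n_studies for c in counts], below_threshold / n_studies


if __name__ == "__main__":
    deciles, below = p_value_deciles()
    for i, share in enumerate(deciles):
        print(f"p in [{i / 10:.1f}, {(i + 1) / 10:.1f}): {share:.3f}")
    print(f"share below 0.05: {below:.3f}")
```

Each decile should hold roughly a tenth of the studies, and roughly 5% should clear the threshold. If only those are written up, the published record consists entirely of false positives even though no individual analysis did anything wrong.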

(As indicated in the post, this is a problem in many domains, probably most worryingly in medical and pharmaceutical studies.)

Razib Khan