A Step Beyond the “Black Box” in Genomic Studies
In recent genomic research, many statistical models function as “black boxes,” providing accurate predictions but lacking transparency in their internal mechanisms. These opaque processes, especially in the era of deep learning, often make it challenging for scientists to interpret how specific predictions are reached. Addressing this gap, Natália Ružičková, a physicist and PhD candidate at the Institute of Science and Technology Austria (ISTA), collaborated with ISTA graduate Michal Hledík and Professor Gašper Tkačik to introduce a model designed to analyze polygenic diseases.
Unlike monogenic diseases, polygenic diseases arise from numerous small genetic variations spread across the genome. This new model, published in the Proceedings of the National Academy of Sciences (PNAS), combines statistical rigor with biological insights to offer a clearer understanding of genetic influences on disease.
From Genome-Wide Studies to Polygenic Complexity
The Human Genome Project, completed in 2003, revolutionized the study of human genetics by providing a comprehensive map of DNA, igniting hopes of understanding diseases linked to genetic mutations. However, this achievement led to the realization that many diseases result from multiple genetic factors rather than isolated mutations. Genome-wide association studies (GWAS) emerged to identify genetic variations associated with traits and diseases.
By comparing the DNA of healthy and sick individuals, GWAS revealed patterns linking specific genetic mutations to conditions like type 2 diabetes, where hundreds of mutations interact to influence disease susceptibility. However, these interactions are complex; individually, each mutation contributes minimally, but collectively, they shape disease development. This complexity underscored the need for models that could explain these interconnected genetic influences.
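To make the idea of many small contributions concrete, the short Python sketch below adds up per-variant effects into a single score. It is a simplified additive illustration, not the method used in GWAS or in the study: the function name, the 300 variants, and the effect size of 0.01 are all hypothetical, and the sketch ignores interactions between mutations.

```python
def polygenic_score(genotype, effect_sizes):
    """Sum per-variant effects, weighted by how many copies (0, 1, or 2) are carried.

    Simplifying assumption: effects add up independently of one another.
    """
    if len(genotype) != len(effect_sizes):
        raise ValueError("genotype and effect_sizes must have the same length")
    return sum(copies * effect for copies, effect in zip(genotype, effect_sizes))


# Hypothetical numbers chosen only for illustration: 300 variants,
# each with the same tiny effect on disease susceptibility.
effect_sizes = [0.01] * 300

# One person carries a single risk variant, another carries one copy of each.
single_variant_carrier = [1] + [0] * 299
many_variant_carrier = [1] * 300

score_single = polygenic_score(single_variant_carrier, effect_sizes)
score_many = polygenic_score(many_variant_carrier, effect_sizes)

print(f"one variant:  {score_single:.2f}")
print(f"300 variants: {score_many:.2f}")
```

In this toy setup a single variant barely moves the score, while the combined effect of all 300 is three hundred times larger, which is the pattern the paragraph above describes.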
The Quantitative Omnigenic Model and Yeast Experiments
In 2017, the omnigenic model was proposed, suggesting that genes do not act in isolation but rather through a network of regulatory interactions that amplify the effects of individual mutations. Building on this, Ružičková and her colleagues introduced the “quantitative omnigenic model” (QOM), a mathematical framework that describes how a mutation in one gene can influence others through these networks. To validate this model, the team applied it to yeast (Saccharomyces cerevisiae), a well-understood single-celled organism often used in biological research. Using QOM, they predicted gene expression levels, that is, how active each gene is, and tracked how mutations propagated through yeast’s regulatory network. Their model successfully pinpointed relevant genes and highlighted specific mutations likely responsible for certain outcomes.
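The notion of a mutation spreading through a regulatory network can be sketched in a few lines of Python. This is a toy linear model written for illustration only; it does not reproduce the QOM's actual equations, and the three genes, the weights, and the function name are assumptions.

```python
def propagate(direct_effects, weights, steps=50):
    """Repeatedly apply x = direct_effects + weights * x.

    weights[i][j] is the assumed influence of gene j on gene i.
    Toy linear model; the real regulatory network is far more complex.
    """
    n = len(direct_effects)
    x = list(direct_effects)
    for _ in range(steps):
        x = [
            direct_effects[i] + sum(weights[i][j] * x[j] for j in range(n))
            for i in range(n)
        ]
    return x


genes = ["A", "B", "C"]

# Hypothetical chain: gene A regulates gene B, and gene B regulates gene C.
weights = [
    [0.0, 0.0, 0.0],  # nothing regulates A
    [0.5, 0.0, 0.0],  # A -> B
    [0.0, 0.5, 0.0],  # B -> C
]

# A mutation changes the expression of gene A only.
direct_effects = [1.0, 0.0, 0.0]

total_effects = propagate(direct_effects, weights)
for name, effect in zip(genes, total_effects):
    print(name, effect)
```

Although the mutation touches only gene A directly, the expression of B and C also shifts (by 0.5 and 0.25 in this example), because the change is passed along the regulatory links.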
Unlike standard GWAS models, which function as black boxes and rely on statistical associations, the QOM provides a causal explanation of how specific mutations might lead to disease. By revealing how mutations affect disease pathways, the model could significantly impact medicine, potentially aiding in the development of therapies for polygenic diseases. Although the model is in its early stages, Ružičková envisions future applications in human genetics, stating, “This is just the first step and proof of principle. Now that we understand what is possible, one can start thinking about applications to human genetics.”