Certain mutations in our DNA can cause proteins to be assembled differently than they should be. While many of these mutations are harmless, others can cause serious diseases. For most variants, however, the effects remain unclear. Google DeepMind has now published AlphaMissense, a tool that uses artificial intelligence to calculate the structure of the altered proteins and predict which mutations are potentially harmful. This could help reveal the causes of rare genetic diseases.
The DNA sequence of our genome provides the blueprint for our proteins. Mutations can change individual DNA building blocks. In the case of so-called missense mutations, such a change in the blueprint leads to a different amino acid being incorporated into the affected protein. “Of the more than four million missense variants observed, only an estimated 2% are clinically classified as pathogenic or benign, while the vast majority are of unknown clinical significance,” explains a team led by Jun Cheng of Google DeepMind in London. “This limits the diagnosis of rare diseases and the development or use of clinical treatments that target the underlying genetic cause.”
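What makes a mutation a missense mutation can be illustrated with a minimal Python sketch. The codon table below is only a small excerpt of the standard genetic code, and the function name `classify_substitution` is a hypothetical helper chosen for this illustration; it is not part of AlphaMissense.

```python
# Small excerpt of the standard genetic code (DNA codon -> amino acid).
# A complete table has 64 entries; these few are enough for the illustration.
CODON_TABLE = {
    "GAG": "Glu",
    "GAA": "Glu",
    "GTG": "Val",
    "GAC": "Asp",
    "TAG": "Stop",
}


def classify_substitution(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"  # same amino acid, protein sequence unchanged
    if alt_aa == "Stop":
        return "nonsense"    # premature stop signal
    return "missense"        # a different amino acid is incorporated


# One changed DNA letter (A -> T in the middle position) swaps Glu for Val.
print(classify_substitution("GAG", "GTG"))  # missense
# A change in the third position that still encodes Glu.
print(classify_substitution("GAG", "GAA"))  # synonymous
```

Only the first case changes the amino acid sequence of the protein, which is the kind of variant AlphaMissense is meant to assess.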
Further development of AlphaFold
So far, however, it has been difficult to predict the possible effects of such variants. New technologies make it possible to measure the effects of thousands of variants simultaneously using cell cultures and DNA sequencing, but results from such experiments are currently available for only a small portion of the human genome. Jun Cheng and his team therefore chose a different approach that can be applied on a much larger scale. As a basis, they used AlphaFold, the artificial intelligence program developed by DeepMind a few years ago, which predicts from a protein’s amino acid sequence how the protein will fold and which structure it will adopt.
“We adapted AlphaFold to predict the pathogenicity of missense variants,” Cheng and colleagues write. To do this, they combined AlphaFold’s structure predictions with data from clinical databases on already known missense mutations and their effects. They also took into account how frequently certain variants occur in humans. “Machine learning can be used to identify and exploit patterns in biological data to infer the effect of previously unexplored variants,” the authors explain.
Pathogenic or benign?
The researchers considered every possible single amino acid change in nearly 20,000 human proteins, a total of 216 million possible changes. This yielded predictions for 71 million missense variants. “Using AlphaMissense, we classified 32% of these missense variants as likely pathogenic and 57% as likely benign,” the team says. For the remaining 11% of variants, AlphaMissense did not provide a clear rating.
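To put these shares into absolute numbers, the following Python sketch converts the percentages into approximate variant counts. It relies only on the rounded figures quoted above, so the results are rough estimates rather than the exact counts from the study; the names used are hypothetical.

```python
# Rounded figures as quoted in the article, not exact study counts.
TOTAL_PREDICTIONS = 71_000_000
SHARES_PERCENT = {
    "likely pathogenic": 32,
    "likely benign": 57,
    "no clear rating": 11,
}


def counts_from_shares(total: int, shares_percent: dict) -> dict:
    """Convert percentage shares into absolute counts using integer arithmetic."""
    return {label: total * pct // 100 for label, pct in shares_percent.items()}


counts = counts_from_shares(TOTAL_PREDICTIONS, SHARES_PERCENT)
for label, n in counts.items():
    print(f"{label}: about {n / 1e6:.1f} million variants")
```

With these rounded inputs, the estimate comes to roughly 22.7 million likely pathogenic variants, 40.5 million likely benign variants, and 7.8 million without a clear rating.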
More detailed analyses showed that variants in proteins that have changed little over the course of evolution were more often classified as disease-causing, as were variants that affect protein stability. Comparisons with missense mutations that have already been studied experimentally showed a high degree of agreement between AlphaMissense’s predictions and the effects actually observed. “AlphaMissense predictions have the potential to accelerate our understanding of the molecular effects of variants on protein function, contribute to the discovery of disease-causing genes, and increase the diagnostic yield of rare genetic diseases,” Jun Cheng and his team write.
So far of only limited use for diagnosis
In an accompanying commentary, also published in Science, Joseph Marsh of the University of Edinburgh and Sarah Teichmann of the University of Cambridge, neither of whom was involved in the study, write that AlphaMissense marks the beginning of a new phase in variant effect prediction. However, they also point out that it remains unclear how far purely computational predictions can be relied upon when diagnosing diseases.
“Although AlphaMissense’s classifications of likely pathogenic or likely benign are undoubtedly useful in interpreting and prioritizing variants, these labels should not be confused with the very specific clinical definitions of these terms, which rely on multiple lines of evidence,” they write. It should also be noted that the effects of mutations are very complex in practice: even if a person carries a pathogenic mutation, it does not necessarily lead to disease. “But although we cannot currently rely solely on predictive models like AlphaMissense for genetic diagnosis, their utility will continue to increase in the future as both computational methods and strategies for interpreting them improve.”
Source: Jun Cheng (Google DeepMind, London, UK) et al., Science, doi: 10.1126/science.adg7492