For the past 50 years, scientists have struggled to solve what is known as the 'protein folding problem', but London-based AI lab DeepMind has finally cracked the puzzle using artificial intelligence. "In a major scientific advance, the latest version of our AI system AlphaFold has been recognised as a solution to this grand challenge by the organisers of the biennial Critical Assessment of protein Structure Prediction (CASP)," DeepMind said in a statement.
DeepMind CEO Demis Hassabis said in a tweet on November 30, 2020 that he was thrilled to announce the "first major breakthrough" in applying AI to a grand challenge in science. "AlphaFold has been validated as a solution to the 'protein folding problem' & we hope it will have a big impact on disease understanding and drug discovery," he said.
This breakthrough shows the impact AI can have on scientific discovery and its potential to dramatically accelerate progress in the most fundamental fields that shape and explain our world, says the company. Notably, a protein's shape is closely linked with its function, and the ability to predict this structure unlocks a greater understanding of what it does and how it works.
This discovery could help in developing treatments for many diseases or finding enzymes that break down industrial waste. "We have been stuck on this one problem - how do proteins fold up - for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we'd ever get there, is a very special moment," says Professor John Moult, Co-Founder and Chair of CASP, University of Maryland.
Experimental techniques such as nuclear magnetic resonance and X-ray crystallography, as well as newer methods like cryo-electron microscopy, depend on extensive trial and error, which can take years of painstaking and laborious work per structure, and require the use of multi-million dollar specialised equipment, explains the AI lab.
Meanwhile, global tech and business leaders have hailed the breakthrough, saying it could revolutionise industries and help find treatments for several diseases.
"One of biology's biggest mysteries 'largely solved' by AI - this is awesome technology that will revolutionise drug development," Biocon Executive Chairperson Kiran Mazumdar-Shaw tweeted on December 1, 2020.
Google CEO Sundar Pichai said DeepMind's discovery will help researchers tackle new and hard problems. "DeepMind's incredible AI-powered protein folding breakthrough will help us better understand one of life's fundamental building blocks + enable researchers to tackle new and hard problems, from fighting diseases to environmental sustainability," he tweeted on November 30, 2020.
What's the 'protein folding problem'?
In his acceptance speech for the 1972 Nobel Prize in Chemistry, Christian Anfinsen famously postulated that, in theory, a protein's amino acid sequence should fully determine its structure. This hypothesis sparked a five-decade quest to computationally predict a protein's 3D structure based solely on its 1D amino acid sequence, as a complementary alternative to expensive and time-consuming experimental methods.
A major challenge, however, is that the number of ways a protein could theoretically fold before settling into its final 3D structure is astronomical. In 1969, Cyrus Levinthal noted that it would take longer than the age of the known universe to enumerate all possible configurations of a typical protein by brute force calculation. Yet in nature, proteins fold spontaneously, some within milliseconds - a dichotomy sometimes referred to as Levinthal's paradox.
In 1994, Professor John Moult and Professor Krzysztof Fidelis founded CASP as a biennial blind assessment to catalyse research, monitor progress, and establish the state of the art in protein structure prediction. It is both the gold standard for assessing predictive techniques and a unique global community built on shared endeavour.
The main metric CASP uses to measure the accuracy of predictions is the Global Distance Test (GDT), which ranges from 0 to 100. In simple terms, GDT can be thought of as the percentage of amino acid residues (beads in the protein chain) that lie within a threshold distance of their correct position. According to Professor Moult, a score of around 90 GDT is informally considered to be competitive with results obtained from experimental methods.
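The idea behind GDT can be sketched in a few lines of Python. This is a simplified illustration, not CASP's scoring code: it assumes the predicted and experimental structures have already been superimposed, uses one point per residue, and averages over the 1, 2, 4 and 8 Angstrom cutoffs commonly associated with the GDT_TS variant of the score. The function names and the toy coordinates are hypothetical.

```python
import math


def percent_within(predicted, reference, cutoff):
    """Percentage of residues whose predicted position is within
    `cutoff` Angstroms of the reference position."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty structures of equal length")
    hits = sum(1 for p, r in zip(predicted, reference)
               if math.dist(p, r) <= cutoff)
    return 100.0 * hits / len(reference)


def gdt_ts_like(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Average of the per-cutoff percentages (simplified: no search
    over alternative superpositions is performed)."""
    return sum(percent_within(predicted, reference, c)
               for c in cutoffs) / len(cutoffs)


# Toy four-residue chain; coordinates are in Angstroms and made up.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.5, 0.0, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 9.0, 0.0)]

score = gdt_ts_like(predicted, reference)
print(f"GDT-like score: {score:.2f}")
```

In this toy case the four residues are off by 0.5, 1.5, 3.0 and 9.0 Angstroms, so 25%, 50%, 75% and 75% of them fall within the four cutoffs, giving a score of 56.25. A perfect prediction would score 100.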
How does DeepMind solve this complicated problem?
DeepMind first entered CASP in 2018, at CASP13, with its initial version of AlphaFold, which achieved the highest accuracy among participants. Afterwards, it published a paper on its CASP13 methods in Nature with associated code, which has gone on to inspire other work and community-developed open source implementations. It has since revised its methods for CASP14, enabling it to achieve "unparalleled levels" of accuracy.
In the results of the 14th CASP assessment, released today, DeepMind's latest AlphaFold system achieved a median score of 92.4 GDT across all targets. This means its predictions have an average error (RMSD) of approximately 1.6 Angstroms (0.16 of a nanometer), which is comparable to the width of an atom (about 0.1 of a nanometer). Even for the very hardest protein targets, those in the most challenging free-modelling category, AlphaFold achieves a median score of 87.0 GDT.
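The RMSD (root-mean-square deviation) figure quoted above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the two structures are taken to be already superimposed, one point per residue is used, and the toy coordinates are invented so that the result lands on 1.6 Angstroms. It does not reproduce how CASP or DeepMind computed the reported number.

```python
import math

ANGSTROMS_PER_NANOMETRE = 10.0  # 1 Angstrom = 0.1 nanometre


def rmsd(predicted, reference):
    """Root-mean-square deviation between matching points, in the
    same units as the input coordinates."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty structures of equal length")
    squared = [math.dist(p, r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Toy chain in Angstroms: three residues match exactly, one is off by 3.2.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 3.2, 0.0)]

error_angstroms = rmsd(predicted, reference)
error_nanometres = error_angstroms / ANGSTROMS_PER_NANOMETRE

print(f"RMSD: {error_angstroms:.2f} Angstroms ({error_nanometres:.2f} nm)")
```

The toy example also shows that RMSD is an average in the root-mean-square sense: an overall value of 1.6 Angstroms does not mean every residue is displaced by that amount.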