[3dem] AlphaFold DB - free and open access to (millions of) protein structure predictions
Gerard Kleywegt
gerard at xray.bmc.uu.se
Thu Jul 22 08:11:59 PDT 2021
Dear colleagues and friends,
I would like to share some exciting news that has been announced today.
EMBL-EBI and DeepMind have co-developed the AlphaFold Protein Structure
Database (AlphaFold DB; https://alphafold.ebi.ac.uk), a joint project to openly
and freely share millions of AlphaFold protein-structure predictions with the
scientific community. The database launched officially at 4 pm UK time on 22
July. Today’s release contains approximately 365,000 structures (covering over
20 reference proteomes), which will increase to an estimated 130 million (!) 3D
models in the coming months (covering all UniProt sequence clusters with up to
90% mutual sequence identity, i.e. UniRef90). A Nature paper describing the
predictions for the human proteome and mentioning the new AlphaFold DB resource
was made public today: https://www.nature.com/articles/s41586-021-03828-1
The AlphaFold DB resource is the product of about three months of work by
scientists, IT specialists, web designers, comms people and others
at both EMBL-EBI and DeepMind, with the PDBe-KB team (https://pdbe-kb.org/),
led by Sameer Velankar, playing a major role.
Given the accuracy demonstrated by AlphaFold models to date, this resource is
likely to have a major impact not only on structural biology but on many fields
of science and biotechnology. Soon, for the first time in history, for every
protein sequence known to science, there will be either an experimental
structure in the PDB, or a 3D model in AlphaFold DB, or a structure for a
protein within “homology-modelling distance” of a target protein. The source
code of AlphaFold has been made open as well, so predictions for completely new
and non-natural (designed) sequences can be generated by anybody who wants to.
Speaking from experience, it may take some time to wrap your head around the
sheer scale of the new resource and to ponder its potential impact on science.
A small group of leading structural biologists within EMBL have produced a
briefing document
(https://www.embl.org/news/science/alphafold-potential-impacts/) that describes
the technical achievements, the current limitations of AlphaFold and some of
the potential applications and opportunities for new research in a number of
(mainly structure-related) fields.
I for one am immensely excited and I invite you all to check out the new
resource.
Best wishes,
--Gerard
---
Gerard J. Kleywegt, EMBL-EBI, Hinxton, UK
Head of Molecular and Cellular Structure
gerard at ebi.ac.uk pdbe.org emdb-empiar.org
PA: Roisin Dunlop pdbe_admin at ebi.ac.uk