Molecule of the Month: Designed Proteins and Citizen Science

What if people with no formal experience in science could help to improve or even rewrite nature, simply by playing a game?

This article was written and illustrated by Changpeng Lu, Natalie Losada, and Nithish Selvaraj as part of a week-long boot camp on "Science Communication in Biology and Medicine" for undergraduate and graduate students hosted by the Rutgers Institute for Quantitative Biomedicine in January 2021.
Four proteins created by Foldit users (image generated with PyMOL).

De novo Design, at a Glance

The Protein Data Bank (PDB) has been the central archive of biomolecular structures since 1971. These structures of proteins, nucleic acids, and their complexes have helped us understand structure-function relationships in biology and learn the rules by which polypeptide chains fold into functional proteins. Using this knowledge, scientists have recently attempted de novo design of custom proteins to solve 21st-century problems such as breaking down plastic or efficiently building specific drug molecules. Protein folding and design programs like AlphaFold and Rosetta can predict the structures of designed proteins, but these proteins are not always stable when they are synthesized to test their function. In these cases, scientists have turned to gamers and citizen scientists for help in improving the protein designs.

A Solution: Foldit

Foldit is an interactive computer game that enables players to design protein sequences and predict their structures. Its easy-to-use interface lets players manipulate structures based on spatial intuition while following rules for stabilizing interactions, such as hydrogen bonds and hydrophobic interactions, as set by the Rosetta program. The better a structure satisfies these rules, the more points it scores in the game. To start a challenge, scientists upload protein folding puzzles with varied levels of difficulty, ranging from simple modifications of an already folded structure to structure prediction of an entire de novo designed protein. Players use their 3D problem-solving skills to explore a unique range of structures that might be missed by purely computational approaches. Afterwards, scientists synthesize these proteins and test them in laboratory experiments to confirm their intended properties and function(s), so users can see their predicted structures become real proteins.

The Impact of de novo Design

New protein folds have been a cornerstone of de novo design, for example in engineering new enzymes and other functional proteins. Scientific groups have used algorithms like Rosetta to design new protein folds from scratch, such as Top7 (1qys, not shown). These new folds are the basis for creating new functional proteins, such as proteins that help fight off viral infections (7jzl and 3r2x, not shown) or molecules that can track specific chemicals in cells. Foldit players created 56 unique proteins from scratch, and one of them even had a newly discovered protein fold! Experimental structures of four of these new proteins are shown here (PDB ID 6mrr, 6nuk, 6msp and 6mrs). Thanks to new computational design tools, there are now more possibilities for de novo designed proteins than ever before.

An engineered Diels-Alderase enzyme (left) and an enzyme modified with Foldit (right). The designed loop (pink) stabilizes the ligand (green), increasing catalytic activity.

Protein Structure Prediction and Engineering

Tools developed for de novo protein design have also been very effective for structure prediction and optimization. Three examples are listed here. Using structure optimization, scientists engineered an efficient PET depolymerase (6tht, not shown) that performs the difficult chemical task of breaking down polyester plastics. After 10 years of trying, scientists were finally able to determine the structure of the Mason-Pfizer monkey virus (M-PMV) protease (3sqf, not shown) with help from Foldit players. The structure can now be used in targeted drug discovery. Finally, Foldit players were able to improve the function of an existing protein, increasing its catalytic activity more than 10-fold. The starting point was an engineered protein that performs an unusual Diels-Alder reaction (PDB ID 3i1c). Guided by scientists, Foldit players built a “lid” for the enzyme that holds the substrate more tightly for more efficient catalysis (PDB ID 3u0s).

Exploring the Structure

How is a protein’s structure stabilized?

For proteins to adopt their stable 3D structure, many different types of interactions occur between individual amino acids. Carbon-rich amino acid side chains are clustered inside the protein, forming a “hydrophobic core,” while amino acids with charged and polar side chains are most often arrayed on the surface of the protein, where they interact with the surrounding water. Specific interactions, such as ionic interactions and hydrogen bonds, further stabilize the protein and guide the local details of the polypeptide chain fold. Click on the image for an interactive JSmol that displays many of these interactions for a Foldit-designed protein (PDB ID 6nuk).
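To get a feel for the mix of carbon-rich, polar, and charged side chains in a protein, the short Python sketch below sorts the amino acids of a sequence into three rough classes using the Kyte-Doolittle hydropathy scale. This is only an illustration and not how Foldit or Rosetta scores a structure: the cutoff of zero, the function names, and the example sequence are all assumptions made for this sketch, and it looks only at the sequence, not at which residues are actually buried in the 3D fold.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (more positive = more hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Side chains that usually carry a full charge near neutral pH.
# (Histidine is left out here; that is a simplification.)
CHARGED = set("DEKR")


def classify(residue):
    """Return a rough class for a one-letter amino acid code.

    The cutoff at zero is an assumption for illustration; with it,
    aromatic residues such as W and Y land in the 'polar' class.
    """
    residue = residue.upper()
    if residue in CHARGED:
        return "charged"
    if KYTE_DOOLITTLE[residue] > 0:
        return "hydrophobic"
    return "polar"


def composition(sequence):
    """Count how many residues fall into each rough class."""
    return Counter(classify(r) for r in sequence)


def mean_hydropathy(sequence):
    """Average Kyte-Doolittle value over the whole sequence."""
    values = [KYTE_DOOLITTLE[r] for r in sequence.upper()]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical sequence, not taken from any PDB entry.
    example = "MKLEELLKKAIELVKRG"
    print(composition(example))
    print(round(mean_hydropathy(example), 2))
```

A sequence with a higher average value has more carbon-rich side chains available to pack into a hydrophobic core, but only the 3D structure shows where each residue ends up.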

Topics for Further Discussion

  1. Try Foldit yourself at the Foldit website.
  2. Try searching for “de novo” at the main RCSB PDB site to see many designed proteins in the PDB archive.
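
The structures mentioned in this article can also be looked up directly by their four-character PDB IDs. The Python sketch below builds the address of the RCSB PDB structure page and of the downloadable PDB-format file for an ID; the function name `rcsb_links` is made up for this example, and the sketch only builds the links without contacting the site.

```python
def rcsb_links(pdb_id):
    """Build RCSB PDB links for a four-character PDB ID."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4 or not pdb_id.isalnum():
        raise ValueError("expected a four-character PDB ID, got %r" % pdb_id)
    return {
        # Structure summary page on the main RCSB PDB site.
        "page": "https://www.rcsb.org/structure/" + pdb_id,
        # Coordinate file in PDB format.
        "file": "https://files.rcsb.org/download/" + pdb_id + ".pdb",
    }


if __name__ == "__main__":
    # The four Foldit player designs shown in this article.
    for entry in ["6mrr", "6nuk", "6msp", "6mrs"]:
        print(entry, rcsb_links(entry)["page"])
```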

References

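  1. Senior, A.W., Evans, R., Jumper, J., Kirkpatrick, J., Sifre, L., Green, T., Qin, C., Zidek, A., Nelson, A.W.R., Bridgland, A., Penedones, H., Petersen, S., Simonyan, K., Crossan, S., Kohli, P., Jones, D.T., Silver, D., Kavukcuoglu, K., Hassabis, D. (2020) Improved protein structure prediction using potentials from deep learning. Nature 577: 706–710.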
  2. Tournier, V., Topham, C.M., Gilles, A., David, B., Folgoas, C., Moya-Leclair, E., Kamionka, E., Desrousseaux, M.L., Texier, H., Gavalda, S., Cot, M., Guémard, E., Dalibey, M., Nomme, J., Cioci, G., Barbe, S., Chateau, M., André, I., Duquesne, S., Marty, A. (2020) An engineered PET depolymerase to break down and recycle plastic bottles. Nature 580: 216–219.
  3. 6mrs, 6mrr, 6msp, 6nuk: Koepnick, B., Flatten, J., Husain, T., Ford, A., Silva, D., Bick, M., Bauer, A., Liu, G., Ishida, Y., Boykov, A., Estep, R., Kleinfelter, S., Nørgård-Solano, T., Wei, L., Foldit Players, Montelione, G. T., DiMaio, F., Popović, Z., Khatib, F., Cooper, S., Baker, D. (2019) De novo protein design by citizen scientists. Nature 570: 390–394.
  4. Feng, J., Wester, B.W., Tinberg, C.E., Mandell, D.J., Antunes, M.S., Chari, R., Morey, K.J., Rios, X., Medford, J.I., Church, G.M., Fields, S., Baker, D. (2015) A general strategy to construct small molecule biosensors in eukaryotes. eLife 4.
  5. 3u0s: Eiben, C.B., Siegel, J.B., Bale, J. B., Cooper, S., Khatib, F., Shen, B.W., Foldit Players, Stoddard, B.L., Popovic, Z., Baker, D. (2012) Increased Diels-Alderase activity through backbone remodeling guided by Foldit players. Nature Biotechnology 30(2): 190-192.
  6. Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., Popovic, Z., Foldit players. (2010) Predicting protein structures with a multiplayer online game. Nature 466: 756–760.
  7. 3i1c: Siegel, J.B., Zanghellini, A., Lovick, H.M., Kiss, G., Lambert, A.R., St Clair, J.L., Gallaher, J.L., Hilvert, D., Gelb, M.H., Stoddard, B.L., Houk, K.N., Michael, F.E., Baker, D. (2010) Computational design of an enzyme catalyst for a stereoselective bimolecular Diels-Alder reaction. Science 329: 309-313.
  8. Simons, K.T., Bonneau, R., Ruczinski, I., Baker, D. (1999) Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins 37: 171-176.

July 2021, Changpeng Lu, Natalie Losada, Nithish Selvaraj, David S. Goodsell, Shuchismita Dutta

http://doi.org/10.2210/rcsb_pdb/mom_2021_7
About Molecule of the Month
The RCSB PDB Molecule of the Month by David S. Goodsell (The Scripps Research Institute and the RCSB PDB) presents short accounts on selected molecules from the Protein Data Bank. Each installment includes an introduction to the structure and function of the molecule, a discussion of the relevance of the molecule to human health and welfare, and suggestions for how visitors might view these structures and access further details.