Molecule of the Month: Fifty Years of Open Access to PDB Structures

The Protein Data Bank is celebrating its golden anniversary!

By the end of the 1970s, structures were available in the PDB archive for proteins, nucleic acids, and polysaccharides. Hemoglobin (red, PDB ID 2dhb), transfer RNA (blue, PDB ID 6tna), and agarose (green, PDB ID 1aga) are shown here.
The Protein Data Bank began as a community-driven effort by crystallographers who wanted to share their work, and has since grown into a global archive, managed by the Worldwide PDB partnership, that ensures free access to curated biomolecular structures from laboratories around the world. In the past 50 years, the PDB archive has spurred growth and innovation in numerous fields, including discovery of effective pharmaceuticals to improve our health and prediction of the folding of proteins into functional structures. The PDB also provides a profound atomic-level understanding of the fundamental mechanisms of biology and evolution, for use in research and education. To help celebrate this milestone of 50 years of service, I have chosen PDB entries from each decade that exemplify the many, many fascinating structures that are available in the PDB archive. I invite you to spend some time exploring the archive to find some of your own favorites.

Structural Biology Begins

The first decade of the PDB was in the golden era of structural biology. The first protein structures had been determined in the 1960s, and the field rapidly expanded based on the experimental techniques developed for those landmark structures. The PDB was launched in 1971 with pioneering protein structures, giving the first glimpses of how polypeptide chains fold into defined, hierarchical 3D shapes. By the end of the first decade, structures were available in the PDB archive for all three common types of biopolymers: proteins, nucleic acids, and polysaccharides. With these structures, scientists laid the groundwork for atomic understanding of biomolecular structure, protein folding, enzyme catalysis, and genetic information transfer.

The structure of tomato bushy stunt virus (PDB ID 2tbv) showed how 180 copies of a single type of protein could assemble into a spherical capsid with icosahedral symmetry. The proteins adopt slightly different local structures depending on where they are: those around the five-fold axes are in yellow and those around the three-fold axes are in red and orange.

Terrible Beauty

The second decade showed continued expansion of tools for X-ray crystallography and some of the first structures determined by solution NMR, leading to an explosion of new structures. We also got our first views of the atomic structures of viruses, building on advances in technology and taking advantage of their intrinsic symmetry. Capsid structures of icosahedral viruses like tomato bushy stunt virus (shown here) and the human pathogens poliovirus and rhinovirus revealed the atomic basis of the theory of quasisymmetry and provided important insights into the molecular basis of vaccine action. Near the end of the decade, structures of HIV protease helped change HIV infection from a deadly danger into a manageable chronic disease. Today, structural biology remains an essential tool to fight existing and emerging viral pathogens.
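
The figure of 180 protein copies in the tomato bushy stunt virus capsid can be checked with a little arithmetic. The sketch below assumes the standard Caspar-Klug description of quasisymmetry, in which an icosahedral capsid is built from 60 × T subunits and the triangulation number is T = h² + hk + k² for non-negative integers h and k. The function names are illustrative only and do not come from any PDB tool.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2."""
    if h < 0 or k < 0 or (h == 0 and k == 0):
        raise ValueError("h and k must be non-negative and not both zero")
    return h * h + h * k + k * k


def capsid_subunits(h: int, k: int) -> int:
    """Number of protein subunits in an icosahedral capsid: 60 * T."""
    return 60 * triangulation_number(h, k)


if __name__ == "__main__":
    # h = 1, k = 1 gives T = 3, the lattice consistent with the
    # 180 coat protein copies described for PDB ID 2tbv.
    print(capsid_subunits(1, 1))
```

With T = 3, each of the 60 icosahedral asymmetric units holds three copies of the same protein in slightly different environments, which is why the chains near the five-fold and three-fold axes adopt slightly different local structures.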

The structure of bacteriorhodopsin (magenta, PDB ID 2brd) was determined using electron crystallography from arrays of the protein in membranes. Lipids are shown in blue.

Elusive Membrane Channels

The third decade of the PDB began with the first structure of a membrane-spanning protein, the light-driven proton pump bacteriorhodopsin. Membrane proteins are notoriously fickle, and the structure was the result of decades of work that yielded increasingly detailed views of the protein arrayed in membranes. Building on this success, many additional structures of membrane-spanning proteins soon followed and were made available in the PDB archive. Today, scientists have an extensive toolbox for determining structures of these challenging proteins, for example, isolating them in nanodiscs and determining structures using cryoelectron microscopy (cryoEM).

Two structures are combined to give a view of an entire bacterial ribosome: a crystallographic structure of the large (blue) and small (green) subunits (PDB ID 4v4q) and an NMR structure of the flexible protein stalk that is involved with gathering translation factors and tRNA (PDB ID 1rqv).

Ribosomes Revealed

In the fourth decade of the PDB, a culmination of experimental advances allowed determination of the long-sought structure of ribosomes. Ground-breaking structures of the large and small subunits opened the door, and structures of ribosomes with translation factors, transfer RNA, messenger RNA, and other translational machinery soon followed, filling out the atomic details of the process of protein synthesis. In addition, numerous structures revealed the action of antibiotic drugs that target bacterial ribosomes. The story is far from over, however, and a continuing tidal wave of structures is revealing the many fascinating aspects of ribosome action, assembly and evolution. For example, recent structures of the expressome show how one bacterial mega-machine performs the entire process of transcription and translation.

CryoEM structure of portions of a bacterial flagellar motor (PDB ID 7cgo). The MS ring (orange) transmits torque to the rod and hook (blue) from the force-generating portion of the motor (not included in the structure), and the LP ring (green and yellow) acts as a bushing.

New Ways of Seeing Molecules

Many new experimental techniques have provided new views of the biomolecular world in the most recent decade of the PDB. Techniques like XFEL capture movies of molecular machines, for example revealing the nanoscale motions of riboswitches and watching chromophores as they absorb light. Solid state NMR reveals elusive targets like amyloid fibers. Integrative techniques give views of assemblies of unprecedented size and complexity, such as the nuclear pore complex. But arguably the most impactful advance has been the recent resolution revolution in cryoEM, which has put all manner of hitherto intractable molecular assemblies within reach. It's impossible to pick one structure to exemplify this booming field, but my current favorite is the structure shown here of the rotor and bushing portions of a bacterial flagellar motor. It provides a tantalizing glimpse of this showpiece of molecular evolution, and given the burgeoning pace of cryoEM research, an atomic-scale view of the entire assembly is likely not far behind.

Exploring the Structure

Experimental and Predicted Structures of Myoglobin

As I write this article, the structural biology community is being transformed by the recent successes of AlphaFold2 and RoseTTAFold, which show a quantum leap in success rates for protein structure prediction. The example shown here is an easy one: the structure of human myoglobin predicted by AlphaFold2 (blue) is almost identical to the historic structure of whale myoglobin from Kendrew's laboratory (green, PDB ID 1mbn) and later structures of human myoglobin. Of course, there's only one reason that this structure is easy to predict: all predictive methods build on decades of structures available in the PDB archive. These predictive methods are a triumph of clever computing. They are also a triumph for the tens of thousands of researchers who have contributed to the PDB archive. The breadth of structural knowledge that is encompassed by their entries, and their willingness to make their structures freely available in the archive, made all this possible.

The same is true for the fields of drug discovery and development, vaccine development, enzyme engineering, bionanotechnology, and dozens more--all build on this goldmine of structural data to understand the basic principles of biomolecules, then apply them for new, breakthrough goals. Today, building on an explosion of new structure-determination techniques, the archive continues to grow rapidly. Who knows what will be possible? Exciting times are certainly ahead for the next 50 years of the PDB!

Topics for Further Discussion

  1. If you want to learn more about the history of the PDB, take a look at this Timeline of PDB History.
  2. Many publications and educational materials celebrating the PDB50 anniversary are available on the RCSB PDB website.

References

  1. Berman, H.M. (2021) Synergies between the Protein Data Bank and the community. Nat Struct Mol Biol 28: 400-401
  2. 7cgo: Tan, J., Zhang, X., Wang, X., Xu, C., Chang, S., Wu, H., Wang, T., Liang, H., Gao, H., Zhou, Y., Zhu, Y. (2021) Structural basis of assembly and torque transmission of the bacterial flagellar motor. Cell 184: 2665-2679.e19
  3. wwPDB consortium (2019) Protein Data Bank: the single global archive for 3D macromolecular structure data. Nucl Acids Res 47: D520-D528
  4. Goodsell, D.S., Burley, S.K., Berman, H.M. (2013) Revealing structural views of biology. Biopolymers 99: 817-824
  5. 4v4q: Schuwirth, B.S., Borovinskaya, M.A., Hau, C.W., Zhang, W., Vila-Sanjurjo, A., Holton, J.M., Cate, J.H. (2005) Structures of the bacterial ribosome at 3.5 A resolution. Science 310: 827-834
  6. 1rqv: Bocharov, E.V., Sobol, A.G., Pavlov, K.V., Korzhnev, D.M., Jaravine, V.A., Gudkov, A.T., Arseniev, A.S. (2004) From structure and dynamics of protein L7/L12 to molecular switching in ribosome. J Biol Chem 279: 17697-17706
  7. 2brd: Grigorieff, N., Ceska, T.A., Downing, K.H., Baldwin, J.M., Henderson, R. (1996) Electron-crystallographic refinement of the structure of bacteriorhodopsin. J Mol Biol 259: 393-421
  8. 2tbv: Hopper, P., Harrison, S.C., Sauer, R.T. (1984) Structure of tomato bushy stunt virus. V. Coat protein sequence determination and its structural implications. J Mol Biol 177: 701-713
  9. 6tna: Sussman, J.L., Holbrook, S.R., Warrant, R.W., Church, G.M., Kim, S.H. (1978) Crystal structure of yeast phenylalanine transfer RNA. I. Crystallographic refinement. J Mol Biol 123: 607-630
  10. 1aga: Arnott, S., Fulmer, A., Scott, W.E., Dea, I.C., Moorhouse, R., Rees, D.A. (1974) The agarose double helix and its function in agarose gel structure. J Mol Biol 90: 269-284
  11. Crystallography: Protein Data Bank. (1971) Nature New Biology 233: 223
  12. 2dhb: Bolton, W., Perutz, M.F. (1970) Three dimensional fourier synthesis of horse deoxyhaemoglobin at 2.8 Angstrom units resolution. Nature 228: 551-552
  13. 1mbn: Watson, H.C. (1969) The stereochemistry of the protein myoglobin. Prog Stereochem 4: 299

October 2021, David Goodsell

http://doi.org/10.2210/rcsb_pdb/mom_2021_10
About Molecule of the Month
The RCSB PDB Molecule of the Month by David S. Goodsell (The Scripps Research Institute and the RCSB PDB) presents short accounts on selected molecules from the Protein Data Bank. Each installment includes an introduction to the structure and function of the molecule, a discussion of the relevance of the molecule to human health and welfare, and suggestions for how visitors might view these structures and access further details.