Exploring the Structural Biology of Viruses

Structures of viral proteins help us discover effective ways to fight infection.

Viruses are a major threat to global health. Historically, pandemics of influenza, polio, smallpox, and many other viral diseases have spread through populations numerous times, killing millions of people. Today, with our continually growing understanding of virus structure and biology, we have many tools to fight viral infection. Antiviral drugs block key viral proteins, preventing viruses from replicating and spreading, and vaccines prime our immune system to make us ready for future exposure to common viruses.

This page explores some of the insights provided by structural biology about viruses and how these insights are used to develop new defenses against viral infection. Topics include:

  1. Viruses infect cells and force them to produce new viruses

  2. Viral genomes encode a limited set of proteins

  3. Viruses often build unusual polymerases to replicate their genomes

  4. Most viruses protect and deliver their genomes in symmetrical capsids

  5. Viruses rapidly evolve to infect new hosts and to become resistant to drug therapy

  6. Structural biology is essential for discovery of new antiviral drugs

  7. Vaccines help the immune system fight viral infection



1. Viruses infect cells and force them to produce new viruses

Figure 1. Artistic conception of bacteriophage T4 (red) infecting a bacterial cell. The bacteriophage attaches to the surface of the cell and pierces the cell membrane, injecting its DNA genome (white strand) into the cell. Once the genome is inside, the cell’s polymerases and ribosomes build many new copies of the bacteriophage.

Viruses typically have two types of life cycles. “Lytic” viruses inject their genome into the cell, then make many new viruses using the cell’s resources, and finally burst the cell, releasing the viruses to infect neighboring cells. Lytic viruses typically are composed of a protein coat surrounding the genome, which can be composed of DNA or RNA. Examples of lytic viruses include bacteriophage T4 (shown here), poliovirus, rhinovirus, and adenovirus.

“Lysogenic” viruses, on the other hand, fuse with cell membranes and release their genome into the cell, and then new viruses bud from the surface of the cell. During this budding process, the new viruses capture a coating of the cell membrane. So, lysogenic viruses often have many layers: an outer membrane “envelope,” a protein capsid and other interior proteins, and the genome. Examples of lysogenic viruses include HIV, coronavirus, influenza virus, and ebolavirus.

PDB-101 includes artistic conceptions of the bacteriophage T4 life cycle and HIV budding from the surface of an infected cell. Note: the free, infectious form of a virus is often termed a “virion,” but here, we will use the term “virus” to encompass all stages of the viral life cycle.


2. Viral genomes encode a limited set of proteins

Figure 2. The genome of porcine circovirus 2 includes two genes that encode a capsid protein (top) and a multifunctional replicase that includes a helicase domain (bottom) and a DNA-nicking domain. Amazingly, part of the gene for the replicase is also read backwards to create a third small protein that stimulates apoptosis in the infected cell. PDB ID 3r03 and 7lar.

Most viruses are much smaller than living cells and can only hold a small amount of genetic material. For example, circoviruses get by with only two genes: one that encodes the capsid protein that will protect and deliver the genome in the infectious virus, and one that encodes a replicase protein that hijacks cellular polymerases to create new copies of the viral single-stranded DNA genome. Because space is so limited, viruses employ many economical strategies to maximize the use of their genetic information. Often, viral genes encode proteins with several functions, or genes for different proteins overlap with one another. As presented below, viruses also employ structural symmetry to build large structures from small, identical building blocks. In addition, viruses rely on cellular proteins to do most of the work of creating new viruses, and need only encode proteins that hijack this machinery and shut down the normal functions of the cell.
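The idea of overlapping genes read in opposite directions can be illustrated with a short Python sketch. The sequence here is invented for illustration and is not taken from the circovirus genome; the sketch only shows that the two strands of the same stretch of DNA spell out different sets of codons.

```python
# Toy illustration: one stretch of double-stranded DNA read in two directions.
# The sequence is invented for this example, not taken from any viral genome.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(dna):
    """Return the opposite strand, written in its own 5'-to-3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))

def codons(dna):
    """Split a sequence into consecutive three-letter codons."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]

forward = "ATGGCTCATTAA"
backward = reverse_complement(forward)

print("forward strand codons: ", codons(forward))
print("opposite strand codons:", codons(backward))
```

The same twelve base pairs give two unrelated codon lists, which is how a single region of a genome can carry information for more than one protein.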

PDB-101 includes presentations of the 15 proteins encoded in the HIV-1 genome, the 7 proteins in the ebolavirus genome, 4 proteins encoded by simian virus 40, and the ~26 proteins in the SARS-CoV-2 genome.








3. Viruses often build unusual polymerases to replicate their genomes

Figure 3. Poliovirus uses a single protein to replicate its RNA genome (bottom, with RNA template strand in orange and new RNA strand in yellow). SARS-CoV-2 encodes a multipart complex, including a polymerase (turquoise) and helper proteins (darker blue), a helicase (purple), and a proofreading enzyme (green), which together replicate the RNA genome and add a characteristic cap group to the end of some copies to make a messenger RNA. PDB ID 3ol8 and 7egq.

Viruses break every rule and use whatever mechanisms they need to hijack cells. Most notably, they often don’t use the traditional flow of information transfer, from DNA to RNA to protein. Instead, viral genomes can be carried in RNA or DNA, can be single- or double-stranded, and can hold either the coding strand or its complement. In order to use these different types of genomes, viruses often encode exotic polymerases that perform non-traditional replication and transcription tasks. For example, HIV delivers its genome as a single strand of RNA, and encodes a reverse transcriptase that builds a double-stranded DNA from it. The infected cell can then replicate and transcribe this DNA copy just as it does its own DNA. Other viruses don’t bother with DNA at all, doing everything with RNA. The viral genome is encoded in a strand of RNA, which includes instructions for building a polymerase that makes new RNA strands using RNA as the template. As shown here, this strategy is used by poliovirus and SARS-CoV-2.
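The two copying strategies described above can be sketched in a few lines of Python. This is a simplified model that only tracks base pairing; the sequence is invented, and the real enzymes involve many additional steps (priming, strand transfer, removal of the RNA template) that are not represented here.

```python
# Simplified model of template-directed copying by base pairing.
# The sequence is invented for illustration.

RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}  # reverse transcription
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}  # DNA-templated DNA synthesis
RNA_TO_RNA = {"A": "U", "U": "A", "G": "C", "C": "G"}  # RNA-templated RNA synthesis

def copy_strand(template, pairing):
    """Build the complementary strand, written in its own 5'-to-3' direction."""
    return "".join(pairing[base] for base in reversed(template))

rna_genome = "AUGGCUCAU"

# Retrovirus-style route: RNA genome -> first DNA strand -> second DNA strand.
first_dna = copy_strand(rna_genome, RNA_TO_DNA)
second_dna = copy_strand(first_dna, DNA_TO_DNA)

# RNA-only route: genome -> complementary RNA -> new copy of the genome.
complementary_rna = copy_strand(rna_genome, RNA_TO_RNA)
new_genome = copy_strand(complementary_rna, RNA_TO_RNA)

print("double-stranded DNA copy:", first_dna, "/", second_dna)
print("RNA copy of the genome:  ", new_genome)
```

In both routes, two rounds of complementary copying regenerate the original message, either as DNA (with T in place of U) or as RNA.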

4. Most viruses protect and deliver their genomes in symmetrical capsids

Figure 4. Bacteriophage P68 builds a capsid with a storage container that holds 18,227 base pairs of DNA (enough to encode 22 proteins), and machinery for recognition of bacterial cell surfaces and injection of DNA, using a symmetrical assembly of 9 types of proteins. PDB ID 6q3g and EMD4459.
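As a rough check on the numbers in the caption, the following Python sketch estimates the average size of the encoded proteins. It assumes three bases per amino acid, and makes the simplifying assumptions that the whole genome is coding and that genes do not overlap, so the result is an upper bound rather than a measured value.

```python
# Back-of-the-envelope estimate of average protein size for bacteriophage P68.

GENOME_BASE_PAIRS = 18_227   # from the figure caption
ENCODED_PROTEINS = 22        # from the figure caption
BASES_PER_CODON = 3          # three bases specify one amino acid

def average_protein_length(base_pairs, n_proteins):
    """Upper-bound average protein length, in amino acids."""
    total_codons = base_pairs // BASES_PER_CODON
    return total_codons / n_proteins

avg = average_protein_length(GENOME_BASE_PAIRS, ENCODED_PROTEINS)
print(f"about {avg:.0f} amino acids per protein on average")
```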


As mentioned above, viral genomes are typically very small and can encode a limited number of proteins. However, viral capsids need to be large enough to enclose the entire genome. Viruses solve this problem by employing symmetry to build huge assemblies using a limited number of building blocks. Many virus capsids have icosahedral symmetry, building a hollow sphere to enclose the genome. Quasisymmetry, where one type of subunit is used in many slightly different structural contexts, is used to make the icosahedral shells even larger. Viruses like tobacco mosaic virus and ebolavirus use helical symmetry to enclose their genomes in a long tube of protein. Other viruses break from these regular symmetries to create more exotic structures, such as the cone-shaped capsid of HIV and the amazing structures of tailed bacteriophages like the one shown here, which still only requires encoding of 9 proteins in the genome.
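The sizes available to quasisymmetric icosahedral shells can be worked out with a simple formula. The sketch below uses the triangulation number from Caspar-Klug theory, which is not described in the text above: T = h² + hk + k² for non-negative integers h and k, with an idealized capsid containing 60T subunits. Real capsids do not always follow this rule exactly, so treat the output as the idealized case.

```python
# Idealized capsid sizes from Caspar-Klug triangulation numbers.
# Function names are chosen for this example.

def triangulation_number(h, k):
    """T = h^2 + h*k + k^2 for non-negative integers h and k."""
    return h * h + h * k + k * k

def capsid_subunits(h, k):
    """Number of protein subunits in an idealized icosahedral shell."""
    return 60 * triangulation_number(h, k)

for h, k in [(1, 0), (1, 1), (2, 0), (2, 1)]:
    t = triangulation_number(h, k)
    print(f"h={h}, k={k}: T={t}, {capsid_subunits(h, k)} subunits")
```

A single capsid gene can thus specify a shell of 60, 180, 240, or 420 identical subunits, depending on how the subunits are arranged.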









5. Viruses rapidly evolve to infect new hosts and to become resistant to drug therapy

Figure 5. Researchers watched HIV evolve in the laboratory in response to pressure from an antiviral drug. Viruses were grown in cell culture and subjected to increasing amounts of the drug. Mutant viruses quickly emerged, and after a few weeks, a highly resistant strain with six mutations dominated the population of viruses. PDB ID 2az8, 2az9, 2azb, 2azc.

During an infection, viruses reproduce in a matter of days to create a huge population of viruses. For example, in an individual infected with HIV, 10 billion new viruses are created every day, and the whole life cycle takes only 2 or 3 days. These are perfect conditions for rapid evolution, which leads to several important consequences. Firstly, viruses are often very specific for particular hosts. Their mechanisms for recognizing cells are often tailored for particular proteins found on the cell surface. For example, influenza virus uses hemagglutinins that recognize specific cell surface glycosylation. Mutation and evolution of these proteins, however, can allow viruses to infect new types of hosts. This is a common occurrence with influenza, where viruses in animal populations, such as birds or pigs, acquire the ability to infect humans.
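To get a feel for how much variation this creates, the following Python sketch estimates how many of the new viruses made each day carry at least one mutation. The daily production figure comes from the text; the genome length and the copying error rate are illustrative assumptions, and mutations are assumed to occur independently (a Poisson model).

```python
import math

NEW_VIRUSES_PER_DAY = 10_000_000_000  # from the text
GENOME_LENGTH = 9_700                 # assumption: approximate HIV genome length, in nucleotides
ERROR_RATE = 3e-5                     # assumption: errors per nucleotide per replication cycle

# Expected number of mutations in each newly copied genome.
mutations_per_genome = GENOME_LENGTH * ERROR_RATE

# Poisson model: probability that a genome carries at least one mutation.
fraction_mutated = 1 - math.exp(-mutations_per_genome)

mutated_per_day = NEW_VIRUSES_PER_DAY * fraction_mutated

print(f"expected mutations per genome: {mutations_per_genome:.2f}")
print(f"fraction with at least one mutation: {fraction_mutated:.0%}")
print(f"mutated viruses per day: {mutated_per_day:.1e}")
```

Under these assumptions, billions of mutated genomes appear every day, so even rare beneficial changes are likely to be sampled quickly.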

Secondly, evolution of viral proteins allows them to become resistant to antiviral therapies. For example, resistant strains of HIV emerged very rapidly after the first anti-HIV drugs were deployed in the clinic. As shown in the figure, the molecular basis of this resistance could be observed within a matter of days in the laboratory setting. Today, HIV-infected individuals are treated with a cocktail of different drugs, making it much less probable that the viral population will be able to mutate to avoid all of them simultaneously. Similarly, variants of SARS-CoV-2 emerged rapidly in the global population as people became immune to earlier variants.
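The advantage of combination therapy can be seen with a little arithmetic. In the Python sketch below, the daily production of viruses comes from the text, while the chance that a new virus is resistant to any one drug is a hypothetical value chosen for illustration. The sketch also assumes that resistance to each drug arises independently, so the probabilities multiply.

```python
import math

NEW_VIRUSES_PER_DAY = 10_000_000_000  # from the text
P_RESISTANT_ONE_DRUG = 1e-5           # hypothetical: chance a new virus resists one drug

def expected_resistant_per_day(n_drugs):
    """Expected new viruses per day resistant to all n_drugs at once,
    assuming resistance to each drug arises independently."""
    return NEW_VIRUSES_PER_DAY * P_RESISTANT_ONE_DRUG ** n_drugs

for n in (1, 2, 3):
    print(f"{n} drug(s): {expected_resistant_per_day(n):.0e} resistant viruses per day")
```

With these numbers, resistance to a single drug arises many times every day, while simultaneous resistance to three drugs is expected far less than once per day.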


6. Structural biology is essential for discovery of new antiviral drugs

Figure 6. PF-00835231 is a first-generation inhibitor designed and tested to block the major protease of SARS-CoV. To fight the COVID-19 pandemic, this inhibitor was optimized to create the antiviral drug nirmatrelvir, which blocks the similar major protease in SARS-CoV-2. Both inhibitors form a covalent linkage with a cysteine amino acid (yellow). PDB ID 6xhl and 7rfw.

By understanding the underlying molecular mechanisms of viral biology, we can find ways to block them and fight viral infection. Structural biology has provided a unique window on viruses, revealing their weak points. Fortunately, many viruses employ novel proteins that are quite different from our cellular proteins, so they are attractive targets for drug therapy. Many of these targets are viral enzymes that play key roles in the viral life cycle. For example, effective drugs that block HIV reverse transcriptase, HIV protease, and HIV integrase, all developed through design efforts guided by knowledge of atomic structures, have turned HIV infection into a manageable disease. Similarly, the structure of the major protease from SARS-CoV-2 guided design of a targeted drug to help fight the COVID-19 pandemic.

Therapeutics are also being designed to block the mechanisms of viral attachment and entry. These often employ therapeutic antibodies that bind to viral surface glycoproteins, using the same defenses that our immune system uses to fight viral infection.









7. Vaccines help the immune system fight viral infection

Figure 7. Vaccines for SARS-CoV-2 encode the viral spike protein stabilized in the prefusion state, changing two amino acids to proline (red) to stabilize central alpha helices (white and magenta), and removing a furin cleavage site that is important for viral maturation (the loop is disordered in the structure, indicated here in yellow). PDB ID 7jji.

Vaccination is our most powerful tool for protecting us against the ravages of viruses. Vaccination leverages our natural defenses, priming our immune system so it will be ready when we are challenged with a virus. The concept is simple and revolutionary: we introduce a weakened (possibly inert) version of the virus into our system, not strong enough to cause disease but similar enough to the real virus to stimulate creation of antibodies against it. The first vaccines used the viruses themselves, inactivated by chemicals, or used a less dangerous virus similar to the pathogenic one. Today, modern vaccines include only the most antigenic portions of the virus, typically the viral surface glycoproteins. The proteins themselves may be used as the vaccine or, more recently, mRNA vaccines can stimulate some of our own cells to produce the antigenic protein. Structures of these glycoproteins have recently been used to optimize them for maximal effectiveness, making small changes that stabilize them in the shape they adopt on the surface of an infectious virus. These “prefusion stabilized” forms were developed using the respiratory syncytial virus fusion glycoprotein, and were later used in the mRNA vaccines protecting us from SARS-CoV-2 infection.