To close out the year, Baker Lab scientists published a new report describing the creation of proteins that mimic DNA. We believe this breakthrough will aid the creation of bioactive nanomachines.
DNA is a widely used building material at the nanoscale because it is simple and predictable: A pairs with T and C pairs with G. Because of this, DNA strands can be programmed to click together into precise and increasingly complex structures. But DNA has drawbacks. It is not as bioactive as RNA, and not nearly as active as proteins. Bioactive protein assemblies run cells (kinetochores, polymerases, proteasomes, etc.). What if designing them were as easy as clicking together DNA?
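The pairing rule described above is simple enough to write down in a few lines. The following is a minimal sketch in Python; the function names `reverse_complement` and `can_pair` are illustrative and do not come from the paper. It only checks perfect Watson-Crick complementarity between two antiparallel strands and ignores mismatches, thermodynamics, and everything else that real DNA nanotechnology tools account for.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the sequence that pairs with `strand`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(strand.upper()))


def can_pair(strand_a: str, strand_b: str) -> bool:
    """True if the two strands are perfectly complementary (antiparallel)."""
    return strand_b.upper() == reverse_complement(strand_a)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(can_pair("ATGC", "GCAT"))
```

Because the rule is a fixed lookup table, deciding which strands will "click together" reduces to string comparison. That predictability is what the designed protein pairs aim to reproduce.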
Using computational design, we created heterodimeric proteins that form double helices with hydrogen-bond mediated specificity. When a pool of these new protein zippers gets melted and then allowed to refold, only the proper pairings form. They are all-against-all orthogonal. With these new tools in hand, we can now begin constructing large protein-based machines that self-assemble in predictable ways.
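One way to picture "all-against-all orthogonal" is as a pairwise binding matrix in which only the intended partners show a signal. The sketch below is a toy illustration in Python, not the analysis used in the study. The function `is_orthogonal`, the 0.5 threshold, and the numbers in both matrices are hypothetical values made up for the example.

```python
def is_orthogonal(binding, partner_of, threshold=0.5):
    """Check that every protein binds its designed partner and nothing else.

    binding[i][j]  -- hypothetical normalized binding signal between i and j
    partner_of[i]  -- index of the designed partner of protein i
    """
    n = len(binding)
    for i in range(n):
        for j in range(n):
            should_bind = partner_of[i] == j
            does_bind = binding[i][j] >= threshold
            if should_bind != does_bind:
                return False
    return True


# Toy pool of four proteins forming two designed pairs: (0, 1) and (2, 3).
partner_of = [1, 0, 3, 2]

# Made-up signals: strong only for the designed pairs.
clean_pool = [
    [0.02, 0.95, 0.03, 0.01],
    [0.95, 0.04, 0.02, 0.05],
    [0.03, 0.02, 0.01, 0.90],
    [0.01, 0.05, 0.90, 0.03],
]

# Same pool with one made-up off-target interaction between 0 and 3.
crosstalk_pool = [row[:] for row in clean_pool]
crosstalk_pool[0][3] = crosstalk_pool[3][0] = 0.70

if __name__ == "__main__":
    print(is_orthogonal(clean_pool, partner_of))
    print(is_orthogonal(crosstalk_pool, partner_of))
```

A pool passes only if every designed pair is above the threshold and every other combination is below it, which is the property the melt-and-refold experiment tests.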
This project was led by graduate student Zibo Chen and was done in collaboration with the Wysocki Lab at Ohio State University and the Sgourakis Lab at UC Santa Cruz. The work was supported by the SIBYLS program (SAXS) and ALS resources at LBNL, as well as the Argonne Leadership Computing Facility at ANL.
The basic parts of proteins — helices, strands and loops — can be combined in countless ways. But certain combinations are trickier than others. This week scientists from the IPD, along with collaborators in Brno and Santa Cruz, published the first-ever example of designed non-local beta-strand interactions.
Beta-sheet proteins carry out critical functions in biology, and hence are attractive scaffolds for computational protein design, but the de novo design of all-beta-sheet proteins from first principles has lagged far behind the design of all-alpha or mixed-alpha-beta domains.
Local beta-strand interactions occur when residues near one another hydrogen-bond to form compact sheets. To get similar interactions from stretches of residues that are not close in primary sequence, a protein backbone must fold into a complex interwoven shape. The successful design of non-local beta-strand interactions demonstrates a significant advance in our ability to control both fine features (such as precise hydrogen bonding) and global features (such as complex topology) in proteins and opens the door to the design of a broad range of non-local beta-sheet structures.
By studying loops that connect unpaired beta-strands (beta-arches), the team identified a series of structural relationships between loop geometry, side chain directionality and beta-strand length that arise from hydrogen bonding and packing constraints on regular beta-sheet structures. They used these rules to de novo design jellyroll structures with double-stranded beta-helices formed by eight antiparallel β-strands. NMR of a hyperthermostable design closely matched the computational model, demonstrating accurate control over the beta-sheet structure and loop geometry.
In the summer of 1961, Osamu Shimomura drove across the country in a cramped station wagon to scoop jellyfish from the docks of Friday Harbor. He wanted to discover what made them glow.
It took Shimomura and other biochemists more than 30 years to find a full answer. By then, recombinant DNA technology allowed researchers to clone and characterize the two proteins responsible: aequorin and GFP. The latter would earn Shimomura his share of the 2008 Nobel Prize.
GFP, a 238-residue beta-barrel with a covalently linked chromophore, transformed how scientists study cells and the molecules in them. As a genetic tag, GFP has illuminated the inner workings of human brain cells, bacteria, fungi and more.
This week, scientists from the IPD report in Nature the design of a completely artificial fluorescent beta-barrel protein.
Many natural proteins evolved to bind small molecules. Reengineering such proteins is rarely straightforward, limiting how they can be applied. The new findings demonstrate that proteins unlike any found in nature can be rationally designed to bind to and act on specific small molecules with high precision and affinity.
The lead authors of the paper are Jiayi Dou, Ph.D., and Anastassia A. Vorobieva, Ph.D., then both senior fellows in the Baker lab.
Anastassia Vorobieva with her son Alexandre (left) and Jiayi Dou (right).
To make the fluorescent protein, the researchers had to achieve another first: creating beta-barrels from scratch. The fold was ideal because one end of its cylindrical shape could be used to stabilize the protein while the other could be used to create a cavity that would serve as the binding site for the target molecule, DFHBI. In nature, beta-barrel proteins bind a wide range of small molecules.
Rosetta was used to design the scaffold de novo. To create the cavity, the team used a new docking algorithm called the “Rotamer Interaction Field” or RIF, developed by William Sheffler, Ph.D., a senior research scientist in the Baker lab, that rapidly identifies all potential structures of cavities that fulfill the prerequisites for binding specific molecules.
The designed protein absorbs blue and emits cyan light. It is stable up to 75°C.
“It worked in bacterial, yeast and mammalian cells,” said Dou, “and being half the size of green fluorescent protein should be very useful to researchers.”
“Equally important,” Baker added, “it greatly advances our understanding of the determinants of protein folding and binding beyond what we have learned from describing existing protein structures.”
The computational design of transmembrane proteins with more than one membrane-spanning region remains a major challenge. We report the design of transmembrane monomers, homodimers, trimers, and tetramers with 76 to 215 residue subunits containing two to four membrane-spanning regions and up to 860 total residues that adopt the target oligomerization state in detergent solution. The designed proteins localize to the plasma membrane in bacteria and in mammalian cells, and magnetic tweezer unfolding experiments in the membrane indicate that they are very stable. Crystal structures of the designed dimer and tetramer—a rocket-shaped structure with a wide cytoplasmic base that funnels into eight transmembrane helices—are very close to the design models. Our results pave the way for the design of multispan membrane proteins with new functions.
Today marks another major step forward for peptide-based drug discovery. IPD researchers report in Science the computational design of a new world of small cyclic peptides, “macrocycles”, increasing the number of known kinds of these molecules severalfold. The conceptual art image below, “Illuminating the energy landscape”, shows the power of computational design to explore and illuminate structured peptides across the vast energy landscape.
Small peptides have the benefits of small molecule drugs, like aspirin, and large antibody therapies, like rituximab, with fewer drawbacks. They are stable like small molecules and potent and selective like antibodies.
The abstract reads as follows.
Mixed-chirality peptide macrocycles such as cyclosporine are among the most potent therapeutics identified to date, but there is currently no way to systematically search the structural space spanned by such compounds. Natural proteins do not provide a useful guide: Peptide macrocycles lack regular secondary structures and hydrophobic cores, and can contain local structures not accessible with L-amino acids. Here, we enumerate the stable structures that can be adopted by macrocyclic peptides composed of L- and D-amino acids by near-exhaustive backbone sampling followed by sequence design and energy landscape calculations. We identify more than 200 designs predicted to fold into single stable structures, many times more than the number of currently available unbound peptide macrocycle structures. Nuclear magnetic resonance structures of 9 of 12 designed 7- to 10-residue macrocycles, and three 11- to 14-residue bicyclic designs, are close to the computational models. Our results provide a nearly complete coverage of the rich space of structures possible for short peptide macrocycles and vastly increase the available starting scaffolds for both rational drug design and library selection methods.
Published today in Nature, IPD researchers describe the first synthetic protein assemblies — dubbed synthetic nucleocapsids — that encapsulate their own genome and evolve in complex environments.
Synthetic nucleocapsids are built to resemble viral capsids and could be used in the future to deliver therapeutics to specific cells and tissues. These icosahedral protein assemblies are based on previously reported results from the Institute for Protein Design.
The image above visualizes the de novo creation of synthetic nucleocapsids from computationally designed proteins and their evolution to acquire properties that could be useful for drug delivery and other biomedical applications. The narrative was designed as a futuristic hologram projection realized through spiraling DNA composed of binary zeros and ones. The projection and computational imagery evoke futuristic technology and design, while calling out natural evolution through the DNA spiral “time-scale” motif. The heads-up display iconography showing a blood bag, mouse, and RNase A conveys the challenges we used to evolve the synthetic nucleocapsids. The single net impression of this image is engaging and enlightening, and shows that we are entering the next epoch of synthetic biology, in which biological systems can be designed and created from scratch.
The challenges of evolution in a complex biochemical environment, coupling genotype to phenotype and protecting the genetic material, are solved elegantly in biological systems by the encapsulation of nucleic acids. In the simplest examples, viruses use capsids to surround their genomes. Although these naturally occurring systems have been modified to change their tropism and to display proteins or peptides, billions of years of evolution have favoured efficiency at the expense of modularity, making viral capsids difficult to engineer. Synthetic systems composed of non-viral proteins could provide a ‘blank slate’ to evolve desired properties for drug delivery and other biomedical applications, while avoiding the safety risks and engineering challenges associated with viruses. Here we create synthetic nucleocapsids, which are computationally designed icosahedral protein assemblies with positively charged inner surfaces that can package their own full-length mRNA genomes. We explore the ability of these nucleocapsids to evolve virus-like properties by generating diversified populations using Escherichia coli as an expression host. Several generations of evolution resulted in markedly improved genome packaging (more than 133-fold), stability in blood (from less than 3.7% to 71% of packaged RNA protected after 6 hours of treatment), and in vivo circulation time (from less than 5 minutes to approximately 4.5 hours). The resulting synthetic nucleocapsids package one full-length RNA genome for every 11 icosahedral assemblies, similar to the best recombinant adeno-associated virus vectors. Our results show that there are simple evolutionary paths through which protein assemblies can acquire virus-like genome packaging and protection. 
Considerable effort has been directed at ‘top-down’ modification of viruses to be safe and effective for drug delivery and vaccine applications; the ability to design synthetic nanomaterials computationally and to optimize them through evolution now enables a complementary ‘bottom-up’ approach with considerable advantages in programmability and control.
Mark your calendars! September 27, 2017 is the day the doors opened to a whole new world of targeted therapeutics. The Baker lab and numerous talented collaborators published in Nature that it is now possible to conduct “Massively parallel de novo protein design for targeted therapeutics”. Three factors make this possible: Rosetta molecular modeling algorithms for computational protein design, economical computing power, and inexpensive gene write-and-read technology. Designer therapeutic mini-proteins have arrived!
The group designed and tested 22,660 mini-proteins of 37–43 residues that target influenza haemagglutinin and botulinum neurotoxin B, along with 6,286 control sequences to probe contributions to folding and binding, and identified 2,618 high-affinity mini-binders. Comparison of the binding and non-binding design sets, which are two orders of magnitude larger than any previously investigated, enabled the evaluation and improvement of the computational model. Biophysical characterization of a subset of the binder designs showed that they are extremely stable and, unlike antibodies, do not lose activity after exposure to high temperatures. The designs elicit little or no immune response and provide potent prophylactic and therapeutic protection against influenza, even after extensive repeated dosing. This design capability opens the door to a whole new future of genetically encoded, tailor-made protein therapeutics. It's a bright new day.
The Matrix movie (1999) depicts a future in which the reality perceived by most humans is actually a computer simulated reality called “the Matrix”. Published today in Science, the Baker lab and collaborators report on a new kind of Matrix – a new reality for large scale computational protein design which can achieve massive data driven improvements in our ability to design highly stable, small proteins from scratch.
Following the White Rabbit, postdoctoral fellow Dr. Gabe Rocklin led a group of scientists to design and test over 15,000 new mini-proteins (which do not exist in nature) to see whether they form stable folded structures. Even major protein design studies in the past few years have generally examined only 50 to 100 designs. Synthetic DNA technology and high-throughput screening permitted the group to conduct large-scale testing of the structural stability of multitudes of computationally designed proteins. In turn, this allowed them to perform a “Global analysis of protein folding using massively parallel design, synthesis and testing”.
Through iterative improvements in the design process, the group arrived at 2,788 stable mini-protein structures, which is at least 50-fold more proteins than have ever been characterized from natural sources for similar sized proteins. Their small size and stability may be advantageous for treating diseases when the drug needs to avoid the immune system and reach the inside of a cell.
The publication abstract is a step into the Matrix, as Morpheus explains:
Proteins fold into unique native structures stabilized by thousands of weak interactions that collectively overcome the entropic cost of folding. Though these forces are “encoded” in the thousands of known protein structures, “decoding” them is challenging due to the complexity of natural proteins that have evolved for function, not stability. Here we combine computational protein design, next-generation gene synthesis, and a high-throughput protease susceptibility assay to measure folding and stability for over 15,000 de novo designed miniproteins, 1,000 natural proteins, 10,000 point-mutants, and 30,000 negative control sequences, identifying over 2,500 new stable designed proteins in four basic folds. This scale — three orders of magnitude greater than that of previous studies of design or folding—enabled us to systematically examine how sequence determines folding and stability in uncharted protein space. Iteration between design and experiment increased the design success rate from 6% to 47%, produced stable proteins unlike those found in nature for topologies where design was initially unsuccessful, and revealed subtle contributions to stability as designs became increasingly optimized. Our approach achieves the long-standing goal of a tight feedback cycle between computation and experiment, and promises to transform computational protein design into a data-driven science.
Researchers in the Baker lab at the Institute for Protein Design, working in collaboration with the Joint Genome Institute, published in Science the solved folds and structures for hundreds of protein families. This “big data” approach to large-scale protein structure determination was made possible by a team effort that analyzed billions of gene sequences read out from soil, ocean, and air samples collected around the globe.
The research has been recognized by numerous opinion leaders and media outlets as an unprecedented breakthrough for protein structure prediction. See articles in The Atlantic, The Economist, Science, GeekWire, and GEN.
How does it work?
As illustrated in Figure 1, the sequencing of DNA from environmental samples produces billions of new protein amino acid sequences. Computer algorithms are used to align the sequences according to their evolutionary history. This allows the discovery of pairs of amino acids that co-evolve. If a change occurs in one amino acid, then a compensatory change is typically observed in another amino acid in the sequence. Co-evolving pairs of amino acids are almost always in close proximity to each other (green and yellow lines) within the final 3D structure of the protein (white backbone).
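The idea of detecting co-evolving positions can be illustrated with a small example. The sketch below is in Python and scores every pair of columns in a toy alignment by mutual information, a simple textbook measure of covariation. This is a stand-in chosen for brevity, not the statistical model used in the study, and the alignment, function names, and parameter `k` are hypothetical. Real pipelines also correct for phylogenetic bias and indirect couplings, which this sketch does not.

```python
import math
from collections import Counter
from itertools import combinations


def column(alignment, i):
    """Return the residues found at position i across all aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def top_coevolving_pairs(alignment, k=1):
    """Return the k column pairs with the highest mutual information."""
    length = len(alignment[0])
    scores = [
        (mutual_information(column(alignment, i), column(alignment, j)), i, j)
        for i, j in combinations(range(length), 2)
    ]
    scores.sort(reverse=True)
    return [(i, j) for _, i, j in scores[:k]]


# Hypothetical toy alignment: positions 1 and 3 always change together
# (K with E, R with D), while position 2 varies independently.
toy_alignment = [
    "AKLE",
    "ARLD",
    "AKLE",
    "ARLD",
    "AKIE",
    "ARID",
]

if __name__ == "__main__":
    print(top_coevolving_pairs(toy_alignment, k=1))
```

In the toy alignment, the pair of positions that always change together gets the highest score. In a real protein family, such pairs are candidates for residues that sit close together in the folded structure and can be used as distance restraints during modeling.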
Why is it important?
With this approach, the team produced reliable models for 622 protein families, and discovered more than 100 new protein folds. In addition to resolving the folded structure of a protein, co-evolution data can also provide information on the dynamic nature of protein structure, as shown in Figure 2, including transient contacts, protein-protein contacts, and contacts with ligands. Over time, as more environmental DNA sequence data becomes available, we expect to greatly increase our understanding of protein structure, assembly, and function. In turn, we expect this information to enable the design of new proteins with new functions.
The Institute for Protein Design believes in sharing its insights with the rest of the world and we have made publicly available the database of protein structures resolved by these methods.
The latest paper from the IPD was published today on the Science website. It’s titled “Principles for designing proteins with cavities formed by curved β sheets”, with co-first authors Enrique Marcos and Benjamin Basanta, a former and current IPD member, respectively. Other IPD members on the paper include Tamuka Chidyausiku, Gustav Oberdorfer, Daniel-Adriano Silva, Jiayi Dou, and David Baker. Dr. Baker wrote a summary about the publication:
Some of the key functions of the proteins in our bodies and in all living things are to catalyze chemical reactions (speeding up their rates by many orders of magnitude) and to sense and respond to small molecules in the body and in the environment. New proteins that catalyze chemical reactions and/or sense and respond to compounds not found in nature would have wide use in medicine and industry.
Computational protein design can in principle be used to generate such new catalysts and receptors, but a major challenge to accomplishing this has been the inability to design proteins with cavities within which the catalysis or small molecule binding can take place. This paper describes a general approach for designing proteins with cavities with tunable size and shape. The method opens the door to design of new catalysts and binding proteins [by generating proteins with appropriately sized and shaped cavities to hold the small molecule and lining the cavity with amino acid functional groups to carry out catalysis and/or binding].
Published today by the Science Philanthropy Alliance, David Baker, Director of the Institute for Protein Design, describes how the opportunities for computational protein design are endless, with new research frontiers and a huge variety of practical applications to be explored, from medicine to energy to technology.
This is an exciting time as we are undergoing a technological revolution in protein design—rather than simply tweaking proteins that have come through the evolutionary process, we are becoming able to design new proteins from scratch to solve current challenges.
Small constrained peptides combine the stability of small molecule drugs with the selectivity and potency of antibody-based therapeutics. However, peptide-based therapeutics have largely remained underexplored due to the limited diversity of naturally occurring peptide scaffolds, and a lack of methods to design them rationally. New computational design and wet lab methods developed at the Institute for Protein Design have now opened the door to rational design of a whole new world of hyper-stable drug-like peptide structures.
In an article published in Nature this week, Baker lab / IPD scientists and their collaborators describe the development of computational methods for de novo design of constrained peptides with exceptional stabilities. They used these computational methods to design 18-47 residue constrained peptides with diverse shapes and sizes. The designed peptides presented in the paper cover three broad categories:
2) synthetic disulfide cross-linked peptides with non-canonical sequences, and
3) cyclic peptides with non-canonical backbones and sequences.
Experimentally determined structures for these peptides are nearly identical to their design models.
By including D-amino acids (mirror images of the L-amino acids), and thus expanding the palette of building blocks, Baker lab scientists designed peptides in a sequence and structure space sampled rarely by Nature. Indeed, the article describes the successful design of a cyclic 2-helix peptide of mixed chirality that represents a shape beyond natural secondary and tertiary structure.
These designed peptides also exhibit exceptional stability to thermal and chemical denaturation, and thus could serve as attractive scaffolds for design of novel peptide-based therapeutics. More broadly, development of this new computational toolkit to precisely design constrained peptides opens the door for “on-demand” development of a new generation of peptide-based therapeutics. See In the Pipeline.
The National Institutes of Health provided partial support for this work through grants P50 AG005136, T32-H600035, GM094597, GM090205, and HHSN272201200025C. Additional funding was provided by The Three Dreamers.
Earlier this month, Baker lab researchers reported the computational design of a hyperstable 60-subunit protein icosahedron in Nature (Hsia et al); icosahedral protein structures are commonly observed in natural biological systems for packaging and transport (e.g. viral capsids). The described design was composed of 60 protein subunits, organized as trimeric building blocks, that self-assembled into a nanocage.
In new work published today, Baker lab scientists and collaborators have taken this work to an exciting new level by engineering 120-subunit icosahedral nanocages that self-assemble from not one, but two distinct protein components. The new designed proteins are described in the latest issue of Science in a paper entitled “Accurate design of megadalton-scale multi-component icosahedral protein complexes”.
In this paper, former Baker lab graduate student Jacob Bale, Ph.D., and collaborators describe the computational design and experimental characterization of ten two-component protein complexes that self-assemble into nanocages with atomic-level accuracy. These nanocages are the largest designed proteins to date, with molecular weights of 1.8-2.8 megadaltons and diameters comparable to small viral capsids. The structures have been confirmed by X-ray crystallography (see figure). The advantage of a multi-component protein complex is the ability to control assembly by mixing individually prepared subunits. The authors show that in vitro mixing of the designed subunits occurs rapidly and enables controlled packaging of negatively charged GFP by introducing positive charges on the interior surfaces of the two components.
The ability to design, with atomic-level precision, these large protein nanostructures that can encapsulate biologically relevant cargo and that can be genetically modified with various functionalities opens up exciting new opportunities for targeted drug delivery and vaccine design. A link to the paper and additional information is below:
Nature provides many examples of self- and co-assembling protein-based molecular machines, including icosahedral protein cages that serve as scaffolds, enzymes, and compartments for essential biochemical reactions and icosahedral virus capsids, which encapsidate and protect viral genomes and mediate entry into host cells. Inspired by these natural materials, we report the computational design and experimental characterization of co-assembling two-component 120-subunit icosahedral protein nanostructures with molecular weights (1.8-2.8 MDa) and dimensions (24-40 nm diameter) comparable to small viral capsids. Electron microscopy, SAXS, and X-ray crystallography show that ten designs spanning three distinct icosahedral architectures form materials closely matching the design models. In vitro assembly of independently purified components reveals rapid assembly rates comparable to viral capsids and enables controlled packaging of molecular cargo via charge complementarity. The ability to design megadalton-scale materials with atomic-level accuracy and controllable assembly opens the door to a new generation of genetically programmable protein-based molecular machines.
Today at 11am, the paper titled “A Computationally Designed Hemagglutinin Stem-Binding Protein Provides In Vivo Protection from Influenza Independent of a Host Immune Response” was published on the PLOS Pathogens website. Several IPD members contributed to this paper, including Aaron Chevalier, Jorgen Nelson, Lance Stewart, Lauren Carter, and David Baker. The research was performed in collaboration with colleagues in Deborah Fuller’s lab at UW’s Department of Microbiology. You can find the paper here. Scroll down for the official press release.
Fighting Flu with Designer Drugs: A New Compound Given Before or After Exposure Fends Off Different Influenza Strains
A study published on February 4th in PLOS Pathogens reports that a new antiviral drug protects mice against a range of influenza virus strains. The compound appears to outperform Oseltamivir (Tamiflu) and to act independently of the host immune response.
Influenza viruses under the microscope look a bit like balls covered with spikes. The spikes are actually two different proteins, hemagglutinin (HA) and neuraminidase (NA). Both proteins consist of an inner stem region (which doesn’t differ much between flu strains), and a highly variable outer blob. The individual variants fall into designated groups, and this is how flu strains are categorized (for example as H1N1, or H3N5).
Ongoing mutations that change the HA and NA blobs are the reason why flu vaccines differ from season to season; they are based on researchers’ best guesses of what next year’s prominent strains will look like. And dangerous pandemic strains often have radically new blobs against which existing immunity is limited.
In the search for drugs that act broadly against different influenza strains, researchers had previously shown that antibodies against the HA stem region can prevent influenza infection. Such antibodies are protective, at least in part, because they activate the host immune response which then destroys the whole HA/antibody complex. The approach, then, depends on a fully functional immune system—which is not present in infants, the elderly, or immune-compromised individuals.
Inspired by the earlier work, Deborah Fuller from the University of Washington in Seattle, USA, who is interested in developing influenza drugs and vaccines, teamed up with David Baker, also at the University of Washington, who is an expert in computational protein design. Together with colleagues, they set out to design small molecules that—like the protective antibodies—bind to the HA stem, and to test whether these small molecules can protect against influenza infection. Designed to mimic antibodies, the small molecules bind the virus in a similar manner. However, because they don’t engage the immune system the way antibodies do, and because of questions of stability and potency, it was not clear whether they would be able to prevent infection in animals, or eventually, in humans.
Before testing their molecules in animals, the researchers optimized their favorite small molecule candidate by systematically generating thousands of versions and testing how tightly they bound HA stems from seven different influenza strains. As they predicted, the resulting molecule, called HB36.6, protected cells against influenza virus infection in vitro (i.e., in test tubes).
The researchers next tested HB36.6 in “challenge experiments” in mice. They gave mice a single intranasal dose of the drug and 2 hours, 24 hours, or 48 hours later injected them with a normally lethal dose of influenza virus. This one-time HB36.6 treatment, when given up to 48 hours before the challenge, conveyed complete protection: All of the treated mice survived and had little weight loss, whereas all untreated control mice died after losing a third of their body weight or more. Intranasal HB36.6 was also able to protect mice after they had been exposed to flu virus, when administered either as a single dose within a day after exposure, or when it was given daily for four days starting 24 hours after exposure.
This protection does not depend on an intact host immune response. When the researchers repeated the challenge experiments in two different immune-deficient mouse strains, they found that HB36.6 can protect these mice as well.
Comparing HB36.6 with Oseltamivir, the researchers found that a single dose of HB36.6 provided better protection than 10 doses (twice daily for 5 days) of Oseltamivir. Furthermore, when they gave a low dose of HB36.6 post-infection (which by itself was not able to afford full protection) together with twice-daily doses of Oseltamivir, all the mice survived, indicating a synergistic effect when the two antiviral drugs are combined.
Their results, the researchers conclude, “show that computationally designed proteins have potent anti-viral efficacy in vivo and suggests promise for development of a new class of HA stem-targeted antivirals for both therapeutic and prophylactic protection against seasonal and emerging strains of influenza”.
Custom design with atomic level accuracy enables researchers to craft a whole new world of proteins
Naturally occurring proteins are the nanoscale machines that carry out essentially all of the critical functions in living things.
While it has been known for over 40 years that the sequence of amino acids completely determines the shape of the protein, it has been very challenging to predict from the amino acid sequence of the protein its three-dimensional structure, and conversely, to come up with brand new amino acid sequences which fold up into hitherto unseen structures.
Over the past months, scientists at the Institute for Protein Design at the University of Washington and the Fred Hutch, along with colleagues at other institutions, have reported advances in two long-standing problem areas related to the construction of new proteins from scratch.
“It has been a watershed year for protein structure prediction and design,” said UW Medicine researcher David Baker, a University of Washington professor of biochemistry, Howard Hughes Medical Institute investigator and head of the Institute for Protein Design.
The protein structure problem is about figuring out how a protein’s chemical makeup predetermines its molecular structure, and in turn, its biological role. UW researchers have developed powerful new algorithms using co-evolution data from DNA sequences to make unprecedented highly accurate blind ‘ab initio’ structure predictions of large proteins (>200 amino acids in length). This has opened the door to accurate prediction of the structures for hundreds of thousands of newly discovered proteins in the ocean, soil, and gut microbiome.
Equally difficult is the second problem, which is designing amino acid sequences that will fold into brand new protein structures. Breakthroughs demonstrate that it is now possible to make brand new amino acid sequences with exacting precision for folds inspired by the natural world; and more importantly to make amino acid sequences from scratch for totally novel unknown folds, far surpassing what is predicted to occur in natural proteins.
The new proteins are designed with the help of volunteers around the world participating in the Rosetta@Home distributed computing project. The designed amino acid sequences are encoded in synthetic genes, the proteins are produced in the laboratory, and their structures determined with X-ray crystallography. The computer models in almost all cases match the experimentally determined crystal structures with near atomic level accuracy.
Researchers report new protein designs for barrels, sheets, rings, and screws, all with near atomic level accuracy. This builds on previous reports of designed protein cubes and spheres, providing proof that it is possible to make a totally new class of protein materials.
With these advances in both protein structure prediction and molecular design, Institute for Protein Design researchers hope to build a new world of proteins with exact specifications for performing critically needed tasks in medical, environmental and industrial arenas.
Examples of their goals are nanoscale tools that:
boost the immune response against HIV and other recalcitrant viruses
block the flu virus so that it can’t infect cells
deliver drugs to cancer cells with precision and reduced side effects
stop allergens from causing symptoms
neutralize deposits, called amyloids, thought to damage vital tissues in Alzheimer’s disease
mop up medications in the body as an antidote
fulfill other diagnostic, therapeutic, and clean energy needs
Just as the manufacturing industry was revolutionized by creating interchangeable parts designed to precise specifications, custom-designed protein modules with the right twists, turns, and connections for modular assembly represent a bold new direction for biotechnology.
Results providing proof of this possible future have been reported in recent months by researchers at the UW Institute for Protein Design in collaboration with researchers at the Fred Hutch, Max Planck Institute for Developmental Biology, Janelia Research Campus, and the Institute for Molecular Science in Japan.
Evolution offers clues to shaping proteins: The function of many proteins tends to stay the same across species, even after their amino acid sequences have changed over billions of years of evolution. Locating co-evolved pairs of amino acids helps calculate their proximity when the molecule folds. UW graduate student Sergey Ovchinnikov applied this co-evolution DNA sequence analysis in an eLife paper published on September 3, 2015, entitled “Large-scale determination of previously unsolved protein structures using evolutionary information.” The paper illuminated for the first time the structures of 58 families of proteins containing hundreds of thousands of additional structurally related family members.
“This achievement was a grand slam home run in the history of protein structure prediction,” said Baker.
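To make the co-evolution idea concrete, here is a minimal sketch that ranks pairs of alignment columns by how strongly they vary together. It uses plain mutual information, which is a simplification chosen for illustration and is not the statistical method used in the eLife paper. The function names and the four-sequence toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        p_a = freq_a[a] / n
        p_b = freq_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


def top_coevolving_pairs(alignment, top_n=3):
    """Rank column pairs of an alignment by mutual information, highest first."""
    columns = list(zip(*alignment))  # transpose rows (sequences) into columns
    scores = []
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            scores.append(((i, j), mutual_information(columns[i], columns[j])))
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical toy alignment (not real data): columns 0 and 3 change together,
# column 1 is fully conserved, and column 2 varies independently of the others.
TOY_ALIGNMENT = [
    "KAGE",
    "KALE",
    "EAGK",
    "EALK",
]

if __name__ == "__main__":
    for (i, j), score in top_coevolving_pairs(TOY_ALIGNMENT):
        print(f"columns {i} and {j}: {score:.2f} bits")
```

In this toy case the highest-scoring pair is columns 0 and 3, the two positions that swap together. In a real analysis, a high-scoring pair would be treated as a hint that the two residues sit close together in the folded protein, and far larger alignments and more careful statistics would be needed.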
Modular construction of proteins with repeating motifs: Proteins composed of repeated modules, similar to interlocking Lego® blocks, are common in the natural world. Two papers published in the December 16 issue of Nature entitled “Exploring the repeat protein universe through computational protein design” and “Rational design of alpha-helical tandem repeat proteins with closed architectures” show that existing repeat proteins occupy only a small fraction of the available space, and that it is possible to design totally new proteins with precisely specified geometries that go far beyond what nature has achieved. The work was led by postdoctoral fellows TJ Brunette, Fabio Parmeggiani and Po-Ssu Huang in the lab of David Baker at the University of Washington Institute for Protein Design and Lindsey Doyle and Phil Bradley at the Fred Hutchinson Cancer Research Center in Seattle.
Barrel-fold design: Baker lab postdoctoral fellow Po-Ssu Huang, together with Birte Höcker at the Max Planck Institute for Developmental Biology (Tübingen, Germany), discovered the critical but elusive design principles for a barrel-shaped fold underpinning many natural enzyme molecules. The custom-designed barrel folds were built at the Institute for Protein Design and reported on November 23, 2015 in the Nature Chemical Biology paper, “De novo design of a four-fold symmetric TIM-barrel protein with atomic-level accuracy.” This breakthrough has opened the door for bioengineers to generate totally new enzymes that speed up chemical reactions by positioning smaller molecules in custom barrel compartments.
Self-assembling apparatus: Naturally occurring ordered protein arrays along a flat plane are found in bacteria, the heart, and other muscles. Overcoming protein interaction complexities, researchers at the UW Institute for Protein Design and the Janelia Research Campus of the Howard Hughes Medical Institute succeeded in programming proteins to self-assemble into novel symmetric, 2-dimensional sheets of protein lattice patterns. UW graduate student Shane Gonen in the Baker lab together with his brother Tamir Gonen at Janelia described their work in the June 19, 2015 issue of Science, “Design of ordered two-dimensional arrays mediated by non-covalent protein-protein interfaces.” This research has application in the design of self-assembling protein nanomaterials, especially those that could serve as efficient sensors or light harvesters.
Precision sculpting: Protein designers are continuously refining the principles for fashioning ideal protein structures. The latest paper in the October 6, 2015 Proceedings of the National Academy of Sciences, “Control over overall shape and size in de novo designed proteins” further explains methods for systematically varying protein architecture inspired by nature. Such finesse is needed in optimizing designed proteins to take on exact shapes to perform specified functions. This work has been led by Baker lab graduate student Yu-Ru Lin in collaboration with Nobuyasu Koga at the Institute for Molecular Science in Japan.
The Institute for Protein Design has been funded by several federal agencies, including the National Institutes of Health, U.S. Department of Energy, National Science Foundation, U.S. Defense Threat Reduction Agency, and U.S. Air Force Office of Scientific Research, as well as by the Washington Research Foundation, the Life Sciences Discovery Fund, and private support.
The Institute also depends on a cadre of citizen scientists around the world who volunteer their personal and computer time for protein folding prediction studies through Rosetta@home and the multi-player on-line protein folding game Foldit.
A similar story was also published in UW Health Science’s Newsbeat. Read it here.
In the early 1990s, researchers in the field of protein structure prediction were challenged by the problem of how to impartially judge the accuracy of prediction algorithms. This realization led the protein structure prediction community to start the Critical Assessment of protein Structure Prediction (CASP), a community-wide, worldwide experiment for protein structure prediction that has taken place every two years since 1994. In each CASP, an independent scientific advisory board solicits other researchers to submit experimentally verified, but unpublished, 3D protein structures to CASP. The linear amino acid sequences of these proteins are then provided to structure prediction researchers, who each have an equal and limited amount of time to submit final structure predictions to the CASP advisory board. The submitted structure predictions are then compared to the experimentally verified structures using the same metrics for all CASP contributors. Even though the primary goal of CASP is to help advance methods for identifying a protein’s 3D structure given only its linear amino acid sequence, many view the experiment more as a “world championship” in protein structure prediction.
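As a rough illustration of what comparing a predicted structure to an experimental one with a fixed metric can look like, the sketch below computes a distance RMSD (dRMSD): the root-mean-square difference between all pairwise residue distances in the two structures. This is a simplified stand-in chosen because it needs no structural superposition; CASP's own assessment relies on other measures such as GDT_TS. The function name and the three-residue coordinates are hypothetical.

```python
import math


def distance_rmsd(model, reference):
    """Distance RMSD between two equal-length lists of (x, y, z) coordinates.

    Compares every pairwise distance inside the model with the matching
    distance inside the reference, so no superposition step is required.
    """
    if len(model) != len(reference):
        raise ValueError("structures must have the same number of residues")
    n = len(model)
    if n < 2:
        raise ValueError("need at least two residues")
    total = 0.0
    pairs = 0
    for i in range(n):
        for j in range(i + 1, n):
            d_model = math.dist(model[i], model[j])
            d_ref = math.dist(reference[i], reference[j])
            total += (d_model - d_ref) ** 2
            pairs += 1
    return math.sqrt(total / pairs)


# Hypothetical coordinates (in angstroms) for a three-residue chain.
REFERENCE = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
# Same shape as the reference, only translated: scores as a perfect match.
SHIFTED_MODEL = [(1.0, 2.0, 3.0), (4.8, 2.0, 3.0), (8.6, 2.0, 3.0)]
# Last residue bent out of line: scores worse than the shifted model.
BENT_MODEL = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]

if __name__ == "__main__":
    print("shifted:", distance_rmsd(SHIFTED_MODEL, REFERENCE))
    print("bent:   ", distance_rmsd(BENT_MODEL, REFERENCE))
```

Because every submission is scored by the same function against the same reference, the ranking does not depend on who made the prediction. One known limitation of this particular measure is that it cannot tell a structure from its mirror image, which is one reason assessments use superposition-based scores as well.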
Over a 16-year period (CASP3-11), the Baker lab has consistently achieved top performance in the hardest category of structure prediction: the “Twilight Zone,” where the linear amino acid sequence of the protein shares no discernible relation to any publicly available 3D structure. In 2014 this culminated in our highly accurate blind structure predictions of two large proteins, each >200 amino acids in length. Our methods involve using DNA sequence information to help us predict the 3D structures of proteins.
We recently published these results in eLife, and the results are getting significant attention.
Following the groundbreaking 2014 Nature paper describing the development of a computational method to design multi-component co-assembling protein nanoparticles comes a publication in Protein Science from Baker lab graduate student Jacob Bale and collaborators. Titled “Structure of a designed tetrahedral protein assembly variant engineered to have improved soluble expression,” the paper reports a variant of a previously low-yielding designed tetrahedral material for which structure determination was difficult. The new variant described in the paper had a much improved yield after redesign, and the structure obtained agreed with the computational model with atomic-level accuracy. The methods used here to improve soluble protein yield will be generally applicable to improving the yield of many designed protein nanomaterials.
Congratulations to newly minted PhDs and graduates of the Baker lab Dr. Shawn Yu and Dr. Ray Wang! Both defended their dissertations this month. Dr. Yu gave a talk on “Computational design of interleukin-2 mimetics” and Dr. Wang spoke about “Protein structure determination from cryoEM density maps”. We wish them the best of luck in their next steps!
The annual RosettaCON meeting was held July 29-August 1 at the beautiful Sleeping Lady Mountain Resort in Leavenworth, WA. Many IPD scientists attended the conference, heard talks from researchers in Rosetta labs across the country, presented posters on their own research, and socialized with the larger Rosetta community.
We describe a general approach to designing two-dimensional (2D) protein arrays mediated by noncovalent protein-protein interfaces. Protein homo-oligomers are placed into one of the seventeen 2D layer groups, the degrees of freedom of the lattice are sampled to identify configurations with shape-complementary interacting surfaces, and the interaction energy is minimized using sequence design calculations. We used the method to design proteins that self-assemble into layer groups P 3 2 1, P 4 2(1) 2, and P 6. Projection maps of micrometer-scale arrays, assembled both in vitro and in vivo, are consistent with the design models and display the target layer group symmetry. Such programmable 2D protein lattices should enable new approaches to structure determination, sensing, and nanomaterial engineering.
What if scientists could design a completely new protein that is precision-tuned to bind and inhibit cancer-causing proteins in the body? Collaborating scientists at the UW Institute for Protein Design (IPD) and Molecular Engineering and Sciences Institute (MolES) have made this idea a reality with the designed protein BINDI. BINDI (BHRF1-INhibiting Design acting Intracellularly) is a completely novel protein, based on a new protein scaffold not found in nature, and designed to bind BHRF1, a protein encoded by the Epstein-Barr virus (EBV) that is responsible for dysregulating cell growth towards a cancerous state. Learn more here.
A new paper is out in the June 5 issue of Nature entitled Accurate design of co-assembling multi-component protein nanomaterials. Scientists at the Institute for Protein Design (IPD), in collaboration with researchers at UCLA and HHMI, have built upon their previous work constructing single-component protein nanocages and can now design and build self-assembling protein nanomaterials made up of multiple components with near atomic-level accuracy. Learn more about this innovative work at this link.
What if scientists could design proteins to capture specific metals from our environment? The utility for cleaning up metals from waste water, soils, and our bodies could be tremendous. Dr. Jeremy Mills and collaborators in Dr. Baker’s group at the University of Washington’s Institute for Protein Design (IPD) address this challenge in the first reported use of computational protein design software, Rosetta, to engineer a new metal binding protein (“MB-07”) which incorporates an “unnatural amino acid” (UAA) to achieve very high affinity binding to metal cations. Learn more at this link.
In a widely cited Nature paper entitled Proof of principle for epitope-focused vaccine design, IPD researchers and collaborators invented a new method to design novel proteins for use as candidate vaccines to protect against respiratory syncytial virus (RSV), a significant cause of infant mortality. Learn more at this link.
Purification of antibody IgG from crude serum or culture medium is required for virtually all research, diagnostic, and therapeutic antibody applications. Researchers at the Institute for Protein Design (IPD) have used computational methods to design a new protein (called “Fc-Binder”) that is programmed to bind to the constant portion of IgG (aka “Fc” region) at basic pH (8.0) but to release the IgG at slightly acidic pH (5.5). Published on-line at PNAS (Dec. 31, 2013), the paper is entitled Computational design of a pH-sensitive IgG binding protein, co-authored by Strauch, E.-M., Fleishman, S. J., & Baker, D. Learn more at this link.
V-type nerve agents are among the most toxic compounds known, and are chemically related to pesticides widespread in the environment. Using an integrated approach, described in an ACS Chemical Biology paper entitled Engineering V-type nerve agents detoxifying enzymes using computationally focused libraries, Dr. Izhack Cherny, Dr. Per Greisen, and collaborators increased the rate of nerve agent detoxification by the enzyme phosphotriesterase (PTE) by 5000-fold by redesigning the active site. Learn more at this link.
Researchers in the Baker group describe an improved method for comparative modeling, RosettaCM, which optimizes a physically realistic all-atom energy function over the conformational space defined by homologous structures. Learn more at this link.
In a Journal of Molecular Biology publication entitled Computational design of a protein-based enzyme inhibitor, Dr. Erik Procko and collaborators describe the computational design of a protein-based enzyme inhibitor that binds the polar active site of hen egg lysozyme (HEL). Computational design of a protein that binds polar surfaces has not been previously accomplished. Learn more at this link.
IPD researchers in the Baker group have published new computational protocols for preparing protein scaffold libraries for functional site design. Their paper entitled “A Pareto-optimal refinement method for protein design scaffolds” improves the search for amino acids with the lowest energy subject to a set of constraints specifying function. Learn more at this link.
Dr. David Baker, Director of the IPD, delivered the Centenary Award and Frederick Gowland Hopkins Memorial Lecture at the MRC Laboratory of Molecular Biology, Cambridge, UK, on December 13, 2012. Baker’s lecture, entitled “Protein folding, structure prediction and design,” can be read at this published link.
A team from David Baker’s laboratory at the University of Washington in Seattle has described a set of “rules” for the design of proteins from scratch, and has demonstrated the successful design of five new proteins that fold reliably into predicted conformations. Their work was published in Nature. Learn more at this link.
As reported in Nature Biotechnology, David Baker and scientists at the IPD published exciting new methods to improve the potency and breadth of computer-designed protein inhibitors of influenza. Learn more at this link.
IPD researchers in the Baker group have published in Science a paper entitled “Computational design of self-assembling protein nanomaterials with atomic level accuracy.” They describe a general computational method for designing proteins that self-assemble to a desired symmetric architecture. Protein building blocks are docked together symmetrically to identify complementary packing arrangements, and low-energy protein-protein interfaces are then designed between the building blocks in order to drive self-assembly. Read more at this link.