We develop advanced tools for protein design and use them to create custom molecules. By studying millions of designed proteins in the lab, we’re always discovering new ways to improve our design methodology.
These tools — which can be used to make medicines, enzymes, and more — are freely available to the scientific community.
A generative model for protein design
RoseTTAFold Diffusion (RFdiffusion) is a guided diffusion model that can be used to generate protein structures in seconds. As reported in Nature, it excels at producing protein monomers, oligomers, binders, and more.
With prior methods, thousands of designs may have to be tested in the lab before finding a single one that performs as intended. Using RFdiffusion, we find that as few as one protein per design challenge needs to be tested.
To create RFdiffusion, our scientists drew inspiration from AI image generation tools like DALL-E. It was trained to iteratively remove noise from clouds of disconnected atoms, rearranging them into novel protein backbones.
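The denoising idea can be illustrated with a toy sketch. This is not the RFdiffusion code: `toy_denoiser` is a hypothetical stand-in for the trained network, and the step count, noise schedule, and 3.8 Å spacing between consecutive C-alpha atoms are assumptions chosen for illustration. The loop starts from a random cloud of points, asks the denoiser for a cleaner guess, moves part of the way toward that guess, and adds a shrinking amount of fresh noise.

```python
import math
import random

CA_SPACING = 3.8  # assumed distance (angstroms) between consecutive C-alpha atoms


def toy_denoiser(coords):
    """Hypothetical stand-in for a trained network.

    Returns a "clean" chain guess: each bond keeps its direction but is
    rescaled to CA_SPACING, walking the chain from the first point.
    """
    clean = [list(coords[0])]
    for prev, nxt in zip(coords, coords[1:]):
        direction = [b - a for a, b in zip(prev, nxt)]
        length = math.sqrt(sum(d * d for d in direction))
        if length == 0.0:
            # Arbitrary fallback direction for coincident points.
            direction, length = [1.0, 0.0, 0.0], 1.0
        clean.append(
            [c + CA_SPACING * d / length for c, d in zip(clean[-1], direction)]
        )
    return clean


def generate_backbone(n_residues, n_steps=50, seed=0):
    """Turn a random point cloud into a connected chain by iterative denoising."""
    rng = random.Random(seed)
    # Start from pure noise: a cloud of disconnected points.
    coords = [[rng.gauss(0.0, 10.0) for _ in range(3)] for _ in range(n_residues)]
    for t in range(n_steps, 0, -1):
        predicted = toy_denoiser(coords)
        blend = 1.0 / t                 # move fully onto the guess at the last step
        noise_scale = (t - 1) / n_steps  # no noise is added at the last step
        coords = [
            [x + blend * (p - x) + rng.gauss(0.0, noise_scale)
             for x, p in zip(point, guess)]
            for point, guess in zip(coords, predicted)
        ]
    return coords


if __name__ == "__main__":
    chain = generate_backbone(20)
    print(len(chain), "points generated")
```

A real model would replace `toy_denoiser` with a trained neural network; only the shape of the loop (predict, step, re-noise) is meant to carry over.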
ProteinMPNN is a powerful tool for protein sequence design. As reported in Science, it takes a protein structure as input and quickly identifies new amino acid sequences that are likely to fold into that backbone. ProteinMPNN runs in about one second, which is more than 200 times faster than our previous best software. Its results are superior to those of prior tools, and it requires no expert customization to run. Combined with structure prediction tools such as RoseTTAFold or AlphaFold, ProteinMPNN can be used to create stable proteins with novel structures and sequences.
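One plausible way to combine a sequence design tool with a structure predictor is a self-consistency check: design several sequences for a backbone, predict the structure of each, and keep the sequences whose prediction stays close to the intended backbone. The sketch below assumes this workflow; the text above does not specify it. `design_fn` and `predict_fn` are hypothetical callables standing in for tools such as ProteinMPNN and RoseTTAFold, not their real interfaces, and the 2.0 Å cutoff is an arbitrary illustrative value. The RMSD here assumes the two coordinate sets are already superimposed.

```python
import math
from typing import Callable, List, Tuple

Coords = List[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """Root-mean-square deviation between two equally sized, pre-aligned point sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and the same length")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def select_designs(
    backbone: Coords,
    design_fn: Callable[[Coords, int], List[str]],  # hypothetical sequence designer
    predict_fn: Callable[[str], Coords],            # hypothetical structure predictor
    n_sequences: int = 8,
    rmsd_cutoff: float = 2.0,                       # assumed threshold in angstroms
) -> List[Tuple[str, float]]:
    """Keep designed sequences whose predicted structure matches the backbone."""
    kept = []
    for sequence in design_fn(backbone, n_sequences):
        score = rmsd(backbone, predict_fn(sequence))
        if score <= rmsd_cutoff:
            kept.append((sequence, score))
    # Best-matching designs first.
    return sorted(kept, key=lambda item: item[1])
```

Passing the tools in as callables keeps the filter independent of any particular designer or predictor, so either one can be swapped without touching the selection logic.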
Predict protein structures from amino acid sequences
RoseTTAFold uses deep learning to quickly and accurately predict protein structures based on amino acid sequences alone. Without the aid of such software, it can take years of laboratory work to determine the structure of just one protein. With RoseTTAFold, a protein structure can be computed in as little as ten minutes on a single gaming computer.
RoseTTAFold is a three-track neural network, meaning it simultaneously considers patterns in protein sequences, how a protein’s amino acids interact with one another, and a protein’s possible three-dimensional structure. In this architecture, one-, two-, and three-dimensional information flows back and forth, allowing the network to collectively reason about the relationship between a protein’s chemical parts and its folded structure.
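The back-and-forth flow between tracks can be pictured with a toy sketch. This is not RoseTTAFold's architecture or code: each track is reduced to plain numbers, and the update rules (products, distances, averages, and a softmax-weighted pull) are assumptions chosen only to show one-, two-, and three-dimensional information feeding into each other within a single round.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def three_track_round(seq, pair, coords, step=0.1):
    """One toy round of information exchange between the three tracks.

    seq:    one number per residue (1D track)
    pair:   n x n matrix of residue-pair features (2D track)
    coords: one [x, y, z] position per residue (3D track)
    """
    n = len(seq)
    # 1D -> 2D: each pair entry picks up the product of its two residue features.
    pair = [[pair[i][j] + seq[i] * seq[j] for j in range(n)] for i in range(n)]
    # 3D -> 2D: pairs that are far apart in space get a lower score.
    pair = [
        [pair[i][j] - math.dist(coords[i], coords[j]) for j in range(n)]
        for i in range(n)
    ]
    # 2D -> 1D: each residue absorbs the average of its row in the pair track.
    seq = [seq[i] + sum(pair[i]) / n for i in range(n)]
    # 2D -> 3D: each residue moves slightly toward a pair-weighted centroid.
    new_coords = []
    for i in range(n):
        weights = softmax(pair[i])
        centroid = [sum(weights[j] * coords[j][k] for j in range(n)) for k in range(3)]
        new_coords.append(
            [coords[i][k] + step * (centroid[k] - coords[i][k]) for k in range(3)]
        )
    return seq, pair, new_coords


if __name__ == "__main__":
    seq = [0.5, -0.2, 0.1]
    pair = [[0.0] * 3 for _ in range(3)]
    coords = [[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [3.8, 3.8, 0.0]]
    for _ in range(3):  # repeated rounds let information keep flowing between tracks
        seq, pair, coords = three_track_round(seq, pair, coords)
    print(seq)
```

In a real network each of these hand-written rules would be a learned layer; the point of the sketch is only that every track is updated using the other two.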
As reported in Science, our team has used RoseTTAFold to compute hundreds of new protein structures, including many poorly understood proteins from the human genome. We also generated structures directly relevant to human health, including for proteins associated with problematic lipid metabolism, inflammation disorders, and cancer cell growth. And we have shown that RoseTTAFold can be used to build models of complex biological assemblies in a fraction of the time previously required.
The Rosetta software suite includes algorithms for computational modeling and analysis of protein structures. Rosetta has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.
Rosetta began in the laboratory of David Baker at the University of Washington as a tool for predicting protein structures. It has since been adapted to solve many different computational macromolecular problems. Development of Rosetta now happens among the members of RosettaCommons, which include government laboratories, institutes, research centers, and partner corporations.
Rosetta is available to all non-commercial users for free and to commercial users for a fee. Visit rosettacommons.org to get started.
A free video game
Computers are smart, but they sometimes miss important things. The same goes for researchers. This is where Foldit comes in: by playing our free desktop game, anyone can help craft new proteins using their unique creativity and ingenuity.