RoseTTAFold Diffusion (RFdiffusion) is a guided diffusion model for generating new protein structures. As reported in Nature, it excels at a broad range of backbone design challenges, including monomer design, oligomer design, binder design, and more.
With prior methods, tens of thousands of computer-generated proteins may have to be tested in the lab before a single one is found that performs as intended. Using RFdiffusion, we find that as few as one protein per design challenge needs to be tested.
Drawing inspiration from AI image tools like DALL-E, which use guided diffusion models to craft never-before-seen images, RFdiffusion was trained to iteratively remove noise from clouds of disconnected atoms, rearranging them into defined and novel protein structures.
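The iterative noise-removal idea can be illustrated with a toy sketch. This is not RFdiffusion itself: the trained network is replaced by a hypothetical stand-in, `toy_denoiser`, that always predicts an idealized helix, and the step sizes and noise schedule are arbitrary choices made for illustration. It shows only the shape of the loop: start from a random cloud of points, repeatedly ask a denoiser for a cleaner structure, and move toward that prediction while adding progressively less noise.

```python
import math
import random


def ideal_helix(n_residues):
    """Toy target: one point per residue along an idealized alpha helix."""
    rise, radius, turn = 1.5, 2.3, math.radians(100.0)
    return [
        (radius * math.cos(i * turn), radius * math.sin(i * turn), rise * i)
        for i in range(n_residues)
    ]


def toy_denoiser(noisy_coords, step):
    """Hypothetical stand-in for a trained network.

    A real model would predict a structure from the noisy input;
    this stub ignores its input and always returns the helix.
    """
    return ideal_helix(len(noisy_coords))


def rmsd(a, b):
    """Root-mean-square deviation between two equal-length point lists."""
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def reverse_diffusion(n_residues=20, n_steps=50, seed=0):
    """Denoise a random point cloud step by step; return coords and RMSD trace."""
    rng = random.Random(seed)
    # Start from pure noise: a cloud of disconnected points.
    coords = [
        tuple(rng.gauss(0.0, 10.0) for _ in range(3)) for _ in range(n_residues)
    ]
    target = ideal_helix(n_residues)
    history = [rmsd(coords, target)]

    for step in range(n_steps, 0, -1):
        predicted = toy_denoiser(coords, step)
        blend = 1.0 / step  # fraction of the way to move; 1.0 on the last step
        noise_scale = 0.5 * (step - 1) / n_steps  # shrinks to 0 on the last step
        coords = [
            tuple(
                c + blend * (p - c) + rng.gauss(0.0, noise_scale)
                for c, p in zip(point, pred_point)
            )
            for point, pred_point in zip(coords, predicted)
        ]
        history.append(rmsd(coords, target))

    return coords, history


if __name__ == "__main__":
    final_coords, trace = reverse_diffusion()
    print(f"start RMSD: {trace[0]:.2f}, final RMSD: {trace[-1]:.6f}")
```

Because the stand-in denoiser always returns the same helix, the loop converges to that helix by construction. In a real diffusion model the prediction changes at every step and depends on the noisy input, which is where novel structures come from.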
ProteinMPNN is a powerful tool for protein sequence design. As reported in Science, it takes a protein structure as input and quickly identifies new amino acid sequences that are likely to fold into that backbone. ProteinMPNN runs in about one second, which is more than 200 times faster than our previous best software. Its results are superior to prior tools, and it requires no expert customization to run. Combined with structure prediction tools such as RoseTTAFold or AlphaFold, ProteinMPNN can be used to create stable proteins with novel structures and sequences.
Predict protein structures from amino acid sequences
RoseTTAFold uses deep learning to quickly and accurately predict protein structures based on amino acid sequences alone. Without the aid of such software, it can take years of laboratory work to determine the structure of just one protein. With RoseTTAFold, a protein structure can be computed in as little as ten minutes on a single gaming computer.
RoseTTAFold is a three-track neural network, meaning it simultaneously considers patterns in protein sequences, how a protein’s amino acids interact with one another, and a protein’s possible three-dimensional structure. In this architecture, one-, two-, and three-dimensional information flows back and forth, allowing the network to collectively reason about the relationship between a protein’s chemical parts and its folded structure.
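To make the three kinds of information concrete, here is a toy sketch of the data shapes and the direction of flow between tracks. It is not RoseTTAFold's architecture: the real network uses learned layers, whereas every update rule below (the 0.5 blending weights, the 0.01 shift, the function names) is a made-up placeholder chosen only so the example runs. The point is that a one-dimensional track holds one value per residue, a two-dimensional track holds one value per residue pair, a three-dimensional track holds coordinates, and each update reads from another track.

```python
import math


def update_pair_from_coords(pair, coords):
    """3D -> 2D: blend current inter-residue distances into the pair features."""
    n = len(coords)
    return [
        [0.5 * pair[i][j] + 0.5 * math.dist(coords[i], coords[j]) for j in range(n)]
        for i in range(n)
    ]


def update_seq_from_pair(seq, pair):
    """2D -> 1D: each residue blends in the average of its row of pair features."""
    n = len(seq)
    return [0.5 * seq[i] + 0.5 * sum(pair[i]) / n for i in range(n)]


def update_coords_from_seq(coords, seq):
    """1D -> 3D: shift each residue along z by an amount set by its 1D feature."""
    return [(x, y, z + 0.01 * s) for (x, y, z), s in zip(coords, seq)]


def three_track_block(seq, pair, coords):
    """One round of exchange: 3D informs 2D, 2D informs 1D, 1D informs 3D."""
    pair = update_pair_from_coords(pair, coords)
    seq = update_seq_from_pair(seq, pair)
    coords = update_coords_from_seq(coords, seq)
    return seq, pair, coords


if __name__ == "__main__":
    n = 4
    seq = [0.0] * n                             # 1D: one feature per residue
    pair = [[0.0] * n for _ in range(n)]        # 2D: one feature per residue pair
    coords = [(3.8 * i, 0.0, 0.0) for i in range(n)]  # 3D: one point per residue
    for _ in range(3):
        seq, pair, coords = three_track_block(seq, pair, coords)
    print(seq)
```

Stacking several such rounds is what lets information pass back and forth: a change in the coordinates alters the pair features on the next round, which in turn alters the per-residue features, and so on.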
As reported in Science, our team has used RoseTTAFold to compute hundreds of new protein structures, including many poorly understood proteins from the human genome. We also generated structures directly relevant to human health, including for proteins associated with problematic lipid metabolism, inflammation disorders, and cancer cell growth. And we have shown that RoseTTAFold can be used to build models of complex biological assemblies in a fraction of the time previously required.
The Rosetta software suite includes algorithms for computational modeling and analysis of protein structures. Rosetta has enabled notable scientific advances in computational biology, including de novo protein design, enzyme design, ligand docking, and structure prediction of biological macromolecules and macromolecular complexes.
Rosetta began in the laboratory of Dr. David Baker at the University of Washington as a structure prediction tool and has since been adapted to solve a range of common problems in computational macromolecular modeling. Rosetta is now developed by the members of RosettaCommons, which include government laboratories, institutes, research centers, and partner corporations.
Rosetta is free for all non-commercial users and available to commercial users for a fee. Visit rosettacommons.org to get started.
A free video game
Computers are smart, but they sometimes miss important things. The same goes for researchers. This is where Foldit comes in: by playing this free desktop game, anyone can help craft new proteins using their unique creativity and ingenuity.