Thursday, December 11, 2025

A trio of AI methods tackles enzyme design



Naturally occurring enzymes, while powerful, catalyze only a fraction of the reactions chemists care about. That’s why scientists are eager to design new-to-nature versions that could manufacture drugs more efficiently, break down pollutants, capture carbon, or carry out entirely new forms of chemistry that biology never evolved. But doing this requires placing catalytic residues with great precision, something that has been extremely hard to achieve computationally.

Three new papers now report improvements to enzyme design using diffusion models, a class of generative artificial intelligence algorithms that learn to create new samples by adding noise to data and then removing it. The three models, RFdiffusion2, RFdiffusion3, and Riff-Diff, address different barriers in enzyme design but share the same outcome: computationally generated proteins that successfully carry out reactions.
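To make the "add and then subtract noise" idea concrete, here is a minimal sketch in Python. It is a toy illustration of the standard diffusion noising formula applied to a few made-up 3D coordinates, not code from RFdiffusion2, RFdiffusion3, or Riff-Diff. The function names, the coordinates, and the `alpha_bar` value are all assumptions chosen for illustration, and the true noise stands in for what a trained network would have to predict.

```python
import math
import random

def add_noise(coords, alpha_bar, rng):
    """Forward step: blend each coordinate with Gaussian noise.

    alpha_bar close to 1 keeps most of the signal; close to 0 is mostly noise.
    """
    noise = [[rng.gauss(0.0, 1.0) for _ in point] for point in coords]
    noisy = [
        [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
         for x, e in zip(point, eps)]
        for point, eps in zip(coords, noise)
    ]
    return noisy, noise

def remove_noise(noisy, predicted_noise, alpha_bar):
    """Reverse estimate: subtract the predicted noise and rescale."""
    return [
        [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
         for x, e in zip(point, eps)]
        for point, eps in zip(noisy, predicted_noise)
    ]

rng = random.Random(0)
# Hypothetical atom positions, not taken from any real protein.
atoms = [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [1.5, 1.5, 0.0]]

noisy, noise = add_noise(atoms, alpha_bar=0.5, rng=rng)
# A real model would predict the noise; here the true noise is used as a stand-in.
recovered = remove_noise(noisy, noise, alpha_bar=0.5)
```

Because the exact noise is handed back in, `recovered` matches `atoms` up to rounding. In an actual diffusion model, the learning problem is predicting that noise from the noisy input alone.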

“The challenge in enzyme design is placing the groups that are going to carry out the catalysis really precisely in 3-dimensional space,” says David Baker, a protein designer at the University of Washington.

Earlier methods required researchers to manually specify both the identity and the sequence position of catalytic residues before the rest of the protein could be built. This manual step restricted how broadly designers could search for scaffold solutions, limiting both efficiency and creativity in the final designs, says Seth Woodbury, a graduate student in Baker’s laboratory group who worked on RFdiffusion2.

To avoid this problem, the team instead starts with a cluster of atoms arranged in the ideal shape needed for the reaction. From that, RFdiffusion2 figures out where the catalytic residue should be placed in the protein sequence, where surrounding amino acids should go, and how the backbone should bend around them.

“The more freedom that you give these networks, the more that you let them address the design problem in its purest form, the most creative and viable solutions it can come up with,” Woodbury says.

The researchers tested their approach by designing metallohydrolases, enzymes that use a metal ion, often zinc, to help break chemical bonds. These enzymes work only if the metal is held in the right place by nearby residues, which are arranged with very precise spacing. When the designed proteins were made and tested in the lab, some of the computer-generated enzymes exhibited activity, although the turnover numbers remained lower than those of their natural counterparts.
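The spacing requirement described above can be pictured as a simple geometric check. The sketch below is illustrative only and is not the Baker lab's evaluation procedure: the function name, the coordinates, the 2.1 Å target distance, and the 0.3 Å tolerance are all assumptions chosen for the example.

```python
import math

def coordination_ok(metal, ligand_atoms, target=2.1, tolerance=0.3):
    """Check that every coordinating atom sits near the target distance.

    Distances are in angstroms. The default target and tolerance are
    assumed values for illustration, not figures from the paper.
    """
    distances = [math.dist(metal, atom) for atom in ligand_atoms]
    within = all(abs(d - target) <= tolerance for d in distances)
    return within, distances

# Hypothetical zinc site with three coordinating atoms.
zinc = (0.0, 0.0, 0.0)
good_site = [(2.1, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 2.2)]
bad_site = [(2.1, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.5)]

good_ok, good_distances = coordination_ok(zinc, good_site)
bad_ok, bad_distances = coordination_ok(zinc, bad_site)
```

In this toy case the first site passes and the second fails because one atom sits too far from the metal. A real design pipeline would also have to consider angles, side-chain conformations, and the substrate.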

“We were really just shocked, because on the first order that we placed and tested, we found some extremely active enzymes, some that were many orders of magnitude better than previous designs,” Woodbury says.

But the Baker lab didn’t stop with RFdiffusion2. In a second, not-yet-peer-reviewed paper, the researchers modify the method to design proteins alongside the molecules they interact with (bioRxiv 2025, DOI: 10.1101/2025.09.18.676967). The result is RFdiffusion3, which builds a protein, and any molecules it interacts with, at the atomic level rather than at the residue level. Baker says this helps avoid issues that can arise when the protein is designed before its binding partner, such as misfit pockets, wrong orientations, or unrealistic chemistry.

Several RFdiffusion3-designed proteins have behaved as intended: some bind their ligands with the expected geometry, others recognize specific DNA shapes, and a few show catalytic activity.

“Everything is becoming more automatic, and the scope of designs goes beyond proteins and now includes RNA, DNA and the binding of small molecules,” Kendall Houk, an organic chemist at the University of California, Los Angeles, who was not involved in the Baker lab’s work, says in an email to C&EN. While the success rate of the RFdiffusion2- and RFdiffusion3-designed enzymes was low, the work is an important advance toward protein and enzyme design, he says.

The Baker lab isn’t the only team releasing new enzyme design models this month. Gustav Oberdorfer’s group at the Graz University of Technology has launched Riff-Diff, which pairs diffusion models with engineered catalytic motifs (Nature 2025, DOI: 10.1038/s41586-025-09747-9). These motifs are small structural fragments where the key catalytic residues are already arranged in the exact geometry needed for the reaction. Instead of borrowing motifs from natural enzymes, the team builds artificial versions embedded in α-helical segments, where the substrate would bind.

By temporarily placing a helix in the binding site, Oberdorfer says, Riff-Diff is forced to construct a deeper, more structured pocket. The α-helix support is then removed after enzyme generation and replaced with the intended substrate models.

“What Riff-Diff does is it breaks down the enzyme problem to a motif-scaffolding problem, because what we’ve noticed over the last couple of years is that what you really need is a pretty much perfectly preorganized active site,” Oberdorfer says.

The Graz team used Riff-Diff to generate dozens of enzyme designs for a retro-aldol reaction and for the Morita-Baylis-Hillman reaction, which requires a coordinated series of nucleophilic and proton-transfer steps. When the researchers tested these proteins in vitro, a large fraction produced detectable amounts of product and worked faster than other generated enzymes.

But none of the models is perfect, and both teams are looking to make improvements that would increase the speed and selectivity of computer-generated enzymes.

“Overall, the major limitation to all of these approaches, including Riff-Diff, is our fundamental understanding of what is truly important in the catalytic step,” Oberdorfer says.

