Summary of "Improving Inverse Folding for Peptide Design with Diversity-regularized Direct Preference Optimization," by Ryan Park et al.


Improving Inverse Folding for Peptide Design with Diversity-regularized Direct Preference Optimization

by Ryan Park, Darren J. Hsu, C. Brian Roland, Maria Korshunova, Chen Tessler, Shie Mannor, Olivia Viessmann, Bruno Trentini

First submitted to arXiv on: 25 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The authors propose an approach for predicting amino acid sequences that fold into desired reference structures. They fine-tune ProteinMPNN, a message-passing encoder-decoder model, with Direct Preference Optimization (DPO) so that it generates diverse yet structurally consistent peptide sequences, and they introduce two enhancements: online diversity regularization and domain-specific priors. With these improvements, the model achieves state-of-the-art structural similarity scores when conditioned on OpenFold-generated structures, outperforming the base ProteinMPNN by at least 8%. The work also yields new insight into improving diversity in decoder models. (A rough sketch of what a diversity-regularized DPO objective might look like is given after these summaries.)

Low Difficulty Summary (written by GrooveSquid.com, original content)
Inverse folding models help design proteins by predicting amino acid sequences that fold into desired shapes. The ProteinMPNN model makes good predictions for full-length proteins, but it struggles to produce diverse and correct peptide sequences. To fix this, the researchers improved the model with a technique called Direct Preference Optimization (DPO), adding two new tools: one keeps the generated sequences diverse, and the other makes the sequences specific to certain protein types. The improved approach works better than the original ProteinMPNN and can generate peptides that fold into the desired shapes.
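
The summaries above do not give the paper's exact training objective, so the following is only a rough, hypothetical PyTorch sketch of what "DPO with an online diversity regularizer" could look like for a sequence decoder such as ProteinMPNN. The function names, the Hamming-based diversity term, and the beta / lambda_div hyperparameters are illustrative assumptions, not the authors' formulation.

import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    # Standard DPO objective: push the policy to prefer "chosen" sequences
    # over "rejected" ones relative to a frozen reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

def pairwise_hamming_diversity(sequences):
    # Mean pairwise fraction of differing residues over a batch of
    # integer-encoded sequences with shape (batch, length).
    diffs = (sequences.unsqueeze(0) != sequences.unsqueeze(1)).float().mean(-1)
    n = sequences.shape[0]
    return diffs.sum() / (n * (n - 1))  # exclude the zero diagonal

def diversity_regularized_dpo_loss(policy_chosen_logps, policy_rejected_logps,
                                   ref_chosen_logps, ref_rejected_logps,
                                   sampled_sequences, beta=0.1, lambda_div=0.1):
    # Hypothetical combined objective: the DPO loss minus a bonus for
    # diversity among sequences sampled online from the current policy.
    # Note: the Hamming term is not differentiable through sampling, so a
    # real implementation would need e.g. a score-function estimator or a
    # probability-based surrogate for the diversity bonus.
    base = dpo_loss(policy_chosen_logps, policy_rejected_logps,
                    ref_chosen_logps, ref_rejected_logps, beta)
    diversity = pairwise_hamming_diversity(sampled_sequences)
    return base - lambda_div * diversity

Minimizing this combined loss would trade off preference alignment against sequence diversity via lambda_div; how the paper actually balances these terms, and how it injects domain-specific priors, is not described in the summaries above.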

Keywords

» Artificial intelligence  » Decoder  » Encoder decoder  » Optimization  » Regularization