Improving Inverse Folding for Peptide Design with Diversity-regularized Direct Preference Optimization
by Ryan Park, Darren J. Hsu, C. Brian Roland, Maria Korshunova, Chen Tessler, Shie Mannor, Olivia Viessmann, Bruno Trentini
First submitted to arXiv on: 25 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper fine-tunes ProteinMPNN, a message-passing encoder-decoder model for inverse folding, with Direct Preference Optimization (DPO) to generate diverse, structurally consistent peptide sequences. Two enhancements to DPO are introduced: online diversity regularization and domain-specific priors. With these improvements, the fine-tuned model achieves state-of-the-art structural similarity scores when conditioned on OpenFold-generated structures, outperforming the base ProteinMPNN by at least 8%. The work also offers new insight into improving diversity in decoder models (see the sketch after this table). |
Low | GrooveSquid.com (original content) | Inverse folding models help design proteins by predicting amino acid sequences that fold into desired shapes. ProteinMPNN makes good predictions for full-length proteins, but it has trouble producing peptide sequences that are both diverse and correct. To address this, the researchers fine-tuned the model with a technique called Direct Preference Optimization (DPO) and added two extensions: one keeps the generated sequences diverse, while the other tailors them to specific protein domains. The improved model outperforms the original ProteinMPNN and can generate peptides that fold into the desired shapes. |
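
For readers who want a more concrete picture, the sketch below shows one way a DPO-style objective can be combined with a diversity penalty over sequences sampled for the same target structure. This is a minimal illustration, not the paper's implementation: the DPO loss follows the standard preference-pair formulation, while the `diversity_penalty` function, its pairwise sequence-identity form, and the 0.1 weight are assumptions made here for illustration. The paper's actual online diversity regularization and domain-specific priors are not reproduced.

```python
# Illustrative sketch only: a standard DPO loss over preferred/rejected
# sequence pairs, plus a simple diversity penalty on sequences sampled for
# the same target structure. Names and the penalty form are assumptions,
# not the paper's implementation.
import torch
import torch.nn.functional as F


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss from per-sequence log-probabilities under the
    fine-tuned policy and the frozen reference model."""
    chosen_ratio = policy_logp_chosen - ref_logp_chosen
    rejected_ratio = policy_logp_rejected - ref_logp_rejected
    return -F.logsigmoid(beta * (chosen_ratio - rejected_ratio)).mean()


def diversity_penalty(sampled_sequences):
    """One possible (assumed) regularizer: mean pairwise sequence identity
    among sequences sampled for the same target; higher identity means
    less diverse samples and a larger penalty."""
    n = len(sampled_sequences)
    if n < 2:
        return torch.tensor(0.0)
    identities = []
    for i in range(n):
        for j in range(i + 1, n):
            a, b = sampled_sequences[i], sampled_sequences[j]
            matches = sum(x == y for x, y in zip(a, b))
            identities.append(matches / min(len(a), len(b)))
    return torch.tensor(sum(identities) / len(identities))


# Example usage with dummy log-probabilities and toy peptide sequences.
policy_chosen = torch.tensor([-10.2, -9.8])
policy_rejected = torch.tensor([-12.5, -11.9])
ref_chosen = torch.tensor([-10.5, -10.1])
ref_rejected = torch.tensor([-12.0, -11.5])
samples = ["MKTAYIAK", "MKSAYLAK", "GQTAYIAR"]

total = dpo_loss(policy_chosen, policy_rejected,
                 ref_chosen, ref_rejected) + 0.1 * diversity_penalty(samples)
print(total)
```

Note that the discrete pairwise-identity penalty shown here is not differentiable with respect to model parameters; in practice an online diversity regularizer would have to act through the sampling and scoring of candidate sequences during training, which this sketch does not capture.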
Keywords
» Artificial intelligence » Decoder » Encoder decoder » Optimization » Regularization