Summary of Chimera: Accurate Retrosynthesis Prediction by Ensembling Models with Diverse Inductive Biases, by Krzysztof Maziarz et al.


Chimera: Accurate retrosynthesis prediction by ensembling models with diverse inductive biases

by Krzysztof Maziarz, Guoqing Liu, Hubert Misztela, Aleksei Kornev, Piotr Gaiński, Holger Hoefling, Mike Fortunato, Rishi Gupta, Marwin Segler

First submitted to arXiv on: 6 Dec 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI); Quantitative Methods (q-bio.QM)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes Chimera, a framework for building highly accurate reaction models for chemical synthesis planning. The authors point out that existing machine learning (ML) tools for multi-step synthesis planning can only use reactions suggested by their underlying model, so they are constrained by the accuracy of retrosynthesis prediction. Inspired by the different strategies chemists use to ideate reactions, Chimera combines predictions from diverse sources with complementary inductive biases using a learning-based ensembling strategy. In experiments across multiple data scales and time-splits, the framework outperforms major existing models, achieving state-of-the-art performance. Moreover, PhD-level organic chemists prefer Chimera’s predictions over those of baselines in terms of quality. Finally, the authors transfer their largest-scale checkpoint to an internal dataset from a major pharmaceutical company and show robust generalization under distribution shift.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper tackles a big problem in making new medicines. Chemists need to work out which simpler starting molecules can be combined to make a target molecule, and planning this is slow and difficult. The authors created a new tool called Chimera that combines the predictions of several different models to predict reactions more accurately. This could help chemists come up with ideas for making new molecules more quickly and reliably. In the paper’s experiments, the tool beats existing methods and works well on datasets of very different sizes.
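
To make the ensembling idea from the medium difficulty summary more concrete, here is a minimal Python sketch of one simple way to merge ranked predictions from several models. It is an illustration only and not the method from the paper: it uses weighted reciprocal rank fusion with hand-set weights, whereas Chimera learns its ensembling strategy from data. The function name `ensemble_rankings`, the model names, and the placeholder candidate labels are all hypothetical.

```python
from collections import defaultdict


def ensemble_rankings(rankings, weights, top_k=3):
    """Merge ranked candidate lists using weighted reciprocal rank fusion.

    rankings: dict mapping a model name to its candidates, best first.
    weights:  dict mapping a model name to a hand-set weight (an assumption;
              a learning-based ensemble would fit these from data instead).
    """
    scores = defaultdict(float)
    for model_name, candidates in rankings.items():
        weight = weights.get(model_name, 1.0)
        for rank, candidate in enumerate(candidates, start=1):
            # A candidate ranked higher by a model contributes a larger score.
            scores[candidate] += weight / rank
    # Sort by descending score; break ties by label so the result is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [candidate for candidate, _ in ordered[:top_k]]


# Hypothetical outputs of two models for the same target molecule.
# The labels stand in for predicted sets of precursor molecules.
rankings = {
    "model_1": ["precursors_A", "precursors_B", "precursors_C"],
    "model_2": ["precursors_B", "precursors_D", "precursors_A"],
}
weights = {"model_1": 1.0, "model_2": 1.0}

merged = ensemble_rankings(rankings, weights)
print(merged)
```

In this toy example, a candidate that both models rank highly ends up ahead of one that only a single model suggests, which is the basic intuition behind combining models that make different kinds of mistakes.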

Keywords

* Artificial intelligence
* Generalization
* Machine learning