SHAP zero Explains Genomic Models with Near-zero Marginal Cost for Future Queried Sequences
by Darin Tsui, Aryan Musharaf, Yigit Efe Erginbas, Justin Singh Kang, Amirali Aghazadeh
First submitted to arXiv on: 25 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computational Engineering, Finance, and Science (cs.CE); Genomics (q-bio.GN); Computation (stat.CO)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | A novel approach to explaining machine learning models in genomics is presented, focusing on Shapley values. While these values provide local explanations for individual input sequences, producing global explanations across thousands of sequences demands significant computational resources and carries a substantial carbon footprint. To address this challenge, the authors introduce SHAP zero, a method that estimates Shapley values and interactions with near-zero marginal cost for future queried sequences after an initial model sketching step. This is achieved by connecting Shapley values to the Fourier transform of the model. The authors demonstrate SHAP zero on two genomic models, achieving an orders-of-magnitude reduction in amortized computational cost compared to state-of-the-art algorithms and revealing previously inaccessible predictive motifs (see the illustrative sketch after this table). |
Low | GrooveSquid.com (original content) | Machine learning models are helping scientists understand how genes work. These models make predictions about what will happen when certain genetic instructions are followed. To understand why a model makes the predictions it does, we need a way to explain it. One way is by using Shapley values, which tell us how much each part of the input contributes to the model's prediction. But this method has a problem: it is very slow and uses up a lot of computer power and energy. To solve this problem, scientists have developed a new method called SHAP zero, which lets us quickly explain why a model is making certain predictions without using up too much computer power or energy. |
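The following is a minimal, hypothetical Python sketch, not the authors' SHAP zero implementation, illustrating the amortization idea the medium summary alludes to: once a model's (sparse) interaction coefficients have been recovered in a one-time sketching step, Shapley values for any new query reduce to a cheap weighted sum over those coefficients. Here the classical Möbius (Harsanyi dividend) identity φ_i = Σ_{T∋i} a_T / |T| stands in for the paper's Fourier-transform connection, and all function names are illustrative.

```python
# Illustrative sketch only; not the authors' SHAP zero code.
# Idea: if a set function v(S) has the Mobius (Harsanyi dividend) expansion
#     v(S) = sum_{T subseteq S} a_T,
# then the Shapley value of feature i is phi_i = sum_{T containing i} a_T / |T|.
# A one-time "sketching" step would recover a sparse set of coefficients a_T;
# afterwards, each new query costs only a pass over those coefficients.

from itertools import chain, combinations
from typing import Callable, Dict, FrozenSet, List


def mobius_coefficients(v: Callable[[FrozenSet[int]], float],
                        n: int) -> Dict[FrozenSet[int], float]:
    """Exact Mobius transform of a set function on n features (exponential here;
    stands in for the sparse, one-time sketching step described in the paper)."""
    coeffs: Dict[FrozenSet[int], float] = {}
    subsets = chain.from_iterable(combinations(range(n), k) for k in range(n + 1))
    for s in subsets:
        s = frozenset(s)
        coeffs[s] = v(s) - sum(coeffs[t] for t in coeffs if t < s)
    return coeffs


def shapley_from_coefficients(coeffs: Dict[FrozenSet[int], float],
                              n: int) -> List[float]:
    """Near-zero marginal cost step: Shapley values as a weighted sum over the
    (sparse) interaction coefficients, phi_i = sum over T containing i of a_T / |T|."""
    phi = [0.0] * n
    for T, a in coeffs.items():
        if not T:
            continue
        share = a / len(T)
        for i in T:
            phi[i] += share
    return phi


if __name__ == "__main__":
    # Toy value function with a pairwise interaction between features 0 and 1.
    def v(S: FrozenSet[int]) -> float:
        return 2.0 * (0 in S) + 1.0 * (1 in S) + 3.0 * (0 in S and 1 in S)

    coeffs = mobius_coefficients(v, n=3)
    print(shapley_from_coefficients(coeffs, n=3))  # [3.5, 2.5, 0.0]
```

In this toy example the expensive step (recovering the coefficients) is paid once, while every additional query only requires summing over the recovered coefficients, which mirrors the near-zero marginal cost claim in the paper at a conceptual level.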
Keywords
* Artificial intelligence
* Machine learning