Keep Guessing? When Considering Inference Scaling, Mind the Baselines
by Gal Yona, Or Honovich, Omer Levy, Roee Aharoni
First submitted to arXiv on: 20 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper asks whether spending more inference-time computation on repeated sampling actually improves the problem-solving ability of large language models (LLMs). By analyzing the answer distributions of standard evaluation benchmarks, the authors find that they are skewed toward a small set of common answers, which partly explains why coverage grows as the number of samples increases. The study proposes a simple baseline that enumerates answers in order of their prevalence in the training set, and compares it against repeated model sampling and a mixture of the two strategies. Across two domains, mathematical reasoning and factual knowledge, the baseline outperforms repeated sampling for some LLMs and matches their coverage for others. The work suggests that gains attributed to inference scaling should be measured against such baselines. |
| Low | GrooveSquid.com (original content) | Large language models (LLMs) are computer programs that can answer many kinds of questions. Researchers wanted to know whether these models solve more problems when they are allowed to make many guesses instead of just one. They found that a small set of common answers covers a surprisingly large share of questions, which is part of why guessing more times helps. So they tried a simple trick: guess the answers that appear most often in the training data, starting with the most common one. For some models this trick worked as well as, or even better than, letting the model keep guessing on its own. This study helps us understand when extra guessing really makes LLMs better. |
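To make the comparison in the summaries above concrete, here is a minimal sketch of the prevalence baseline: for every question, guess the k most frequent answers from a training set, and measure coverage (the fraction of questions where any guess is correct). The data, variable names, and the value of k are hypothetical illustrations, not the paper's actual benchmarks or implementation.

```python
from collections import Counter

def coverage(guess_sets, gold_answers):
    """Fraction of questions where at least one guess equals the gold answer."""
    hits = sum(gold in guesses for guesses, gold in zip(guess_sets, gold_answers))
    return hits / len(gold_answers)

# Toy data (hypothetical): answers observed in a training set,
# and gold answers for a small evaluation set.
train_answers = ["2", "0", "1", "2", "2", "0", "3", "1", "2", "0"]
eval_gold = ["2", "0", "7", "1"]

# Prevalence baseline: every question gets the same k guesses,
# the k most common answers in the training data.
k = 2
top_k = [answer for answer, _ in Counter(train_answers).most_common(k)]
baseline_guesses = [top_k for _ in eval_gold]

print(coverage(baseline_guesses, eval_gold))  # → 0.5 ("2" and "0" are covered)
```

Repeated model sampling would fill `guess_sets` with k samples drawn from an LLM per question instead; the paper's observation is that when benchmark answers concentrate on a few common values, this question-independent baseline can be a surprisingly strong point of comparison.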
Keywords
» Artificial intelligence » Inference