Summary of Smoothie: Label Free Language Model Routing, by Neel Guha et al.
Smoothie: Label Free Language Model Routing
by Neel Guha, Mayee F. Chen, Trevor Chow, Ishan S. Khare, Christopher Ré
First submitted to arXiv on: 6 Dec 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Large language models (LLMs) are increasingly used in applications where inputs can span many different tasks. Research has shown that the choice of LLM is crucial, as different models perform well on different input samples. Existing approaches select a model for each sample by training auxiliary models on human-annotated data. This work explores unsupervised routing and proposes Smoothie, a weak supervision-inspired method that doesn’t require labeled data. Given outputs from multiple LLMs, Smoothie constructs a graphical model over the embedding representations of the observed outputs and the unknown true output. This makes it possible to estimate sample-dependent quality scores for each LLM and to route each sample to the highest-scoring model. The results show that Smoothie’s quality scores correlate with ground-truth model quality (correctly identifying the optimal model on 9/14 tasks) and that Smoothie outperforms baselines by up to 10 points of accuracy. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This research is about choosing which language model to use for a given input. These models are used for many things, but no single model is the best choice every time. The problem is that it’s hard to decide which model to use without labeled examples that show what each one does well. The authors came up with a new method called Smoothie. It doesn’t need the labeled training data that other methods rely on. Instead, it compares the outputs that different models produce for each input and picks the model it estimates to be best. In tests it worked well, identifying the best model on 9 of 14 tasks. |
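
The medium summary describes estimating a quality score for each LLM from output embeddings alone and routing to the highest-scoring model. The sketch below illustrates that idea in a simplified form and is not the paper's exact estimator. It assumes each model's output embedding equals an unknown true embedding plus independent, zero-mean noise. Under that assumption the expected squared distance between the outputs of models i and j equals v_i + v_j, where v_i is model i's error variance, so three models are enough to solve for each v_i without labels. The function names (`estimate_error_variances`, `route`) and the synthetic data are hypothetical, and the sketch computes one score per model over a whole batch rather than a sample-dependent score.

```python
import random
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def estimate_error_variances(embeddings):
    """embeddings[m][s] is the embedding of model m's output on sample s.

    Assumes at least three models whose errors are independent and zero-mean.
    Returns one estimated error variance per model (lower means better).
    """
    n_models = len(embeddings)
    if n_models < 3:
        raise ValueError("the triplet identity needs at least three models")
    n_samples = len(embeddings[0])

    # d[i][j]: mean squared distance between the outputs of models i and j.
    d = [[0.0] * n_models for _ in range(n_models)]
    for i, j in combinations(range(n_models), 2):
        total = sum(
            sq_dist(embeddings[i][s], embeddings[j][s]) for s in range(n_samples)
        )
        d[i][j] = d[j][i] = total / n_samples

    variances = []
    for i in range(n_models):
        others = [m for m in range(n_models) if m != i]
        # Under the stated assumption, d[i][j] + d[i][k] - d[j][k] = 2 * v_i.
        estimates = [
            (d[i][j] + d[i][k] - d[j][k]) / 2 for j, k in combinations(others, 2)
        ]
        variances.append(sum(estimates) / len(estimates))
    return variances


def route(embeddings):
    """Return the index of the model with the lowest estimated error variance."""
    variances = estimate_error_variances(embeddings)
    return min(range(len(variances)), key=lambda m: variances[m])


# Synthetic demo: three hypothetical models with increasing noise levels.
rng = random.Random(0)
dim, n_samples = 8, 200
noise_scales = [0.1, 0.5, 1.0]
truth = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_samples)]
embeddings = [
    [[t + rng.gauss(0, scale) for t in y] for y in truth]
    for scale in noise_scales
]
best = route(embeddings)
print("routed to model", best)
```

The true embeddings are used only to generate the synthetic outputs; the estimator never sees them. A sample-dependent variant would restrict the averages to inputs similar to the one being routed, which this sketch does not attempt.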
Keywords
» Artificial intelligence » Embedding » Unsupervised