Summary of A General Framework For Clustering and Distribution Matching with Bandit Feedback, by Recep Can Yavas et al.
A General Framework for Clustering and Distribution Matching with Bandit Feedback
by Recep Can Yavas, Yuqi Huang, Vincent Y. F. Tan, Jonathan Scarlett
First submitted to arXiv on: 8 Sep 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Information Theory (cs.IT); Machine Learning (stat.ML)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | We present a general framework for clustering and distribution matching problems with bandit feedback. In our K-armed bandit model, some subset of the arms is partitioned into M groups, and the arms within each group share the same distribution on a finite alphabet. The decision maker pulls arms based on past outcomes, without knowing the arm distributions or the underlying partition. We design an online algorithm based on the Track-and-Stop method and the Frank-Wolfe algorithm that minimizes the average number of arm pulls while keeping the error probability below a prescribed value δ. Our framework encompasses existing problems such as finding M pairs of arms, odd arm identification, and N-ary clustering. We derive a non-asymptotic lower bound on the average number of arm pulls for any algorithm whose error probability does not exceed δ, and our algorithm asymptotically matches this bound. Our refined analysis also reveals a novel convergence bound as δ vanishes. (A toy simulation of this setup is sketched after the table.) |
| Low | GrooveSquid.com (original content) | We’re developing a new way to solve problems where we don’t know the rules or patterns, but can learn from trial and error. Imagine you have K choices, but some of them are grouped together in certain ways, and each group has its own rules for what happens when you make a choice. You want to figure out which groups the choices belong to without knowing the rules beforehand. Our method uses a combination of existing techniques to solve this problem efficiently and accurately. We’ve shown that our approach is better than other methods in terms of how many attempts it takes to get the right answer, and we’re excited about the possibilities for applying this technique to real-world problems. |
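To make the setup concrete, here is a minimal toy sketch in Python of the clustering-with-bandit-feedback problem described above. It simulates K finite-alphabet arms in which pairs of arms share a distribution, samples them with a simple uniform (round-robin) allocation, and recovers the partition by greedily grouping arms with close empirical distributions. The `true_dists`, the `cluster` helper, and the `tol` threshold are illustrative assumptions; this is not the Track-and-Stop / Frank-Wolfe procedure analyzed in the paper, and it has no δ-dependent stopping rule.

```python
# Toy sketch (not the paper's algorithm): K arms over a finite alphabet,
# where groups of arms share the same distribution. We pull arms round-robin,
# maintain empirical distributions, and greedily cluster arms whose empirical
# distributions are close in total-variation distance.
import numpy as np

rng = np.random.default_rng(0)

K, alphabet = 6, 3                       # 6 arms, outcomes in {0, 1, 2}
true_dists = np.array([                  # hypothetical ground-truth partition
    [0.7, 0.2, 0.1],                     # group A
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],                     # group B
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],                     # group C
    [0.2, 0.2, 0.6],
])

counts = np.zeros((K, alphabet))
pulls = np.zeros(K, dtype=int)


def cluster(emp, tol=0.1):
    """Greedy clustering: an arm joins the first cluster whose representative
    empirical distribution is within total-variation distance `tol`."""
    groups = []
    for arm, p in enumerate(emp):
        for g in groups:
            if 0.5 * np.abs(p - emp[g[0]]).sum() < tol:
                g.append(arm)
                break
        else:
            groups.append([arm])
    return groups


for t in range(5000):                    # uniform (round-robin) allocation
    arm = t % K
    outcome = rng.choice(alphabet, p=true_dists[arm])
    counts[arm, outcome] += 1
    pulls[arm] += 1

emp = counts / pulls[:, None]            # empirical distribution of each arm
print("recovered partition:", cluster(emp))
```

The paper's algorithm replaces the round-robin allocation with an adaptive one (tracking an optimal pull proportion computed via Frank-Wolfe) and stops as soon as a confidence condition tied to the target error probability δ is met, which is what yields the matching upper and lower bounds on the average number of pulls.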
Keywords
» Artificial intelligence » Clustering » Probability