Summary of Feel-Good Thompson Sampling for Contextual Dueling Bandits, by Xuheng Li et al.


Feel-Good Thompson Sampling for Contextual Dueling Bandits

by Xuheng Li, Heyang Zhao, Quanquan Gu

First submitted to arXiv on: 9 Apr 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Optimization and Control (math.OC); Machine Learning (stat.ML)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper proposes a Thompson sampling algorithm for linear contextual dueling bandits, a setting that extends classic dueling bandits to incorporate contextual information. The proposed algorithm, named FGTS.CDB, leverages the independence of the two selected arms and incorporates a new exploration term tailored for dueling bandits. The algorithm achieves nearly minimax-optimal regret, with a bound of $\tilde{O}(d\sqrt{T})$, where d is the model dimension and T is the time horizon. Experimental results on synthetic data show that FGTS.CDB outperforms existing algorithms by a significant margin.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper introduces a new way to make decisions in contextual dueling bandits, a setting where two options are compared at a time using information about the current context. The researchers developed an algorithm that chooses between these options in a smart way, taking into account what has been learned so far. They tested their algorithm with synthetic data and found that it works much better than previous methods.
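
To make the medium difficulty description more concrete, here is a minimal Python sketch of Thompson sampling in a linear contextual dueling bandit. It is not FGTS.CDB itself: it omits the new exploration term and replaces the paper's posterior with a Gaussian approximation built from ridge regression on feature differences. The Bradley-Terry (logistic) preference model, the average-regret convention, the function name simulate_dueling_ts, and all parameter values are assumptions made for illustration. The one idea it keeps from the summary is that the two arms are chosen from two independent posterior samples.

```python
import numpy as np


def simulate_dueling_ts(d=5, num_arms=10, horizon=200, reg=1.0, seed=0):
    """Simplified Thompson sampling for a linear contextual dueling bandit.

    Illustrative assumptions (not taken from the paper):
    - preferences follow a Bradley-Terry (logistic) model,
    - the posterior is approximated by a Gaussian from ridge regression
      of +/-1 duel outcomes on feature differences,
    - there is no extra exploration term, so this is NOT FGTS.CDB.
    Returns the cumulative average regret after each round.
    """
    rng = np.random.default_rng(seed)

    # Hidden unit-norm parameter that defines each arm's reward x @ theta_star
    theta_star = rng.normal(size=d)
    theta_star /= np.linalg.norm(theta_star)

    A = reg * np.eye(d)  # regularized Gram matrix of feature differences
    b = np.zeros(d)      # sum of outcome-weighted feature differences
    cumulative_regret = np.zeros(horizon)
    total = 0.0

    for t in range(horizon):
        # New context each round: one unit-norm feature vector per arm
        X = rng.normal(size=(num_arms, d))
        X /= np.linalg.norm(X, axis=1, keepdims=True)

        # Gaussian approximate posterior N(mean, cov)
        cov = np.linalg.inv(A)
        mean = cov @ b
        chol = np.linalg.cholesky(cov)

        # Two independent posterior samples, one for each selected arm
        theta_1 = mean + chol @ rng.standard_normal(d)
        theta_2 = mean + chol @ rng.standard_normal(d)
        i = int(np.argmax(X @ theta_1))
        j = int(np.argmax(X @ theta_2))

        # Duel feedback: y = +1 if arm i is preferred, -1 otherwise
        z = X[i] - X[j]
        p_i_wins = 1.0 / (1.0 + np.exp(-(z @ theta_star)))
        y = 1.0 if rng.random() < p_i_wins else -1.0

        # Ridge-regression update on the feature difference
        A += np.outer(z, z)
        b += y * z

        # Average regret of the pair against the best arm in this context
        rewards = X @ theta_star
        total += rewards.max() - 0.5 * (rewards[i] + rewards[j])
        cumulative_regret[t] = total

    return cumulative_regret
```

Calling simulate_dueling_ts(horizon=500) returns an array with one cumulative regret value per round. Plotting that array for a few values of d is a simple way to explore how regret grows with the dimension and the horizon in this simplified setting.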

Keywords

  • Artificial intelligence
  • Synthetic data