Summary of Online Policy Learning and Inference by Matrix Completion, by Congyuan Duan et al.


Online Policy Learning and Inference by Matrix Completion

by Congyuan Duan, Jingyang Li, Dong Xia

First submitted to arXiv on: 26 Apr 2024

Categories

  • Main: Machine Learning (stat.ML)
  • Secondary: Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper presents a collaborative-filtering approach to decision-making when personalized covariates are unavailable. The authors propose a matrix completion bandit framework that assumes low-dimensional latent features. They introduce a policy learning procedure that combines an ε-greedy policy with an online gradient descent algorithm, using a novel two-phase design to balance policy learning accuracy against regret performance. The paper also develops an online debiasing method based on inverse propensity weighting, and shows that the resulting estimator is asymptotically normal. The proposed methods are applied to data from the San Francisco parking pricing project, revealing intriguing findings, with the learned policies outperforming the benchmark policies.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper shows how computers can make good choices when they don’t know people’s individual preferences. The authors develop a new way of doing this using a type of machine learning called collaborative filtering. They test their approach on real-world data from San Francisco and find that it works well. This could be useful in many situations where people want to make collective decisions but don’t have all the information they need.
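
To make the Medium Difficulty Summary a little more concrete, here is a minimal Python sketch of the ingredients it names: an ε-greedy policy, online gradient descent on a low-rank factorization, and a textbook inverse propensity weighting (IPW) average. It is a toy illustration, not the authors' algorithm. The function names, the one-low-rank-matrix-per-action setup, the uniform arrival of matrix entries, the constant ε and step size, and the simple IPW mean are all assumptions made for this example; the paper's two-phase design and its debiasing procedure are not reproduced here.

```python
import numpy as np


def run_epsilon_greedy_ogd(true_mats, n_rounds, rank, eps=0.2, step=0.05,
                           noise=0.1, seed=0):
    """Toy matrix completion bandit: epsilon-greedy actions + online gradient descent.

    true_mats has shape (n_actions, d1, d2): one reward matrix per action.
    It is a hypothetical simulation input that the learner never sees directly.
    Returns the learned factors U, V and the cumulative regret.
    """
    rng = np.random.default_rng(seed)
    n_actions, d1, d2 = true_mats.shape
    # Small random initialisation of the low-rank factors for each action.
    U = 0.1 * rng.standard_normal((n_actions, d1, rank))
    V = 0.1 * rng.standard_normal((n_actions, d2, rank))
    regret = 0.0
    for _ in range(n_rounds):
        # Assumption: a (row, column) pair arrives uniformly at random.
        i, j = rng.integers(d1), rng.integers(d2)
        est = np.array([U[a, i] @ V[a, j] for a in range(n_actions)])
        greedy = int(np.argmax(est))
        # Epsilon-greedy: explore uniformly with probability eps, else exploit.
        probs = np.full(n_actions, eps / n_actions)
        probs[greedy] += 1.0 - eps
        a = int(rng.choice(n_actions, p=probs))
        reward = true_mats[a, i, j] + noise * rng.standard_normal()
        regret += true_mats[:, i, j].max() - true_mats[a, i, j]
        # One gradient step on the squared error 0.5 * (u_i . v_j - reward)^2.
        resid = U[a, i] @ V[a, j] - reward
        grad_u = resid * V[a, j]
        grad_v = resid * U[a, i]
        U[a, i] -= step * grad_u
        V[a, j] -= step * grad_v
    return U, V, regret


def ipw_mean(rewards, actions, propensities, target_action):
    """Textbook IPW estimate of the mean reward under target_action.

    propensities[t] is the probability with which the logged action
    at round t was chosen.
    """
    rewards = np.asarray(rewards, dtype=float)
    actions = np.asarray(actions)
    propensities = np.asarray(propensities, dtype=float)
    chosen = actions == target_action
    # Rounds where another action was taken contribute zero to the sum.
    return float(np.sum(rewards[chosen] / propensities[chosen]) / len(rewards))
```

As a usage example, `ipw_mean([1, 2, 3, 4], [0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5], 0)` returns 2.0: the two rounds where action 0 was taken are each up-weighted by 1/0.5, and the total is divided by the four logged rounds. Dividing by the propensity is what corrects for the fact that an ε-greedy policy samples some actions far more often than others.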

Keywords

» Artificial intelligence  » Gradient descent  » Machine learning