Summary of Online Policy Learning and Inference by Matrix Completion, by Congyuan Duan et al.

Online Policy Learning and Inference by Matrix Completion

by Congyuan Duan, Jingyang Li, Dong Xia

First submitted to arXiv on: 26 Apr 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary This paper presents a collaborative-filtering approach to decision-making in situations where personalized covariates are unavailable. The authors propose a matrix completion bandit framework that assumes low-dimensional latent features. They introduce a policy learning procedure that combines an ε-greedy policy with an online gradient descent algorithm, along with a novel two-phase design that balances policy learning accuracy against regret performance. The paper also develops an online debiasing method based on inverse propensity weighting, whose estimators are shown to be asymptotically normal. The proposed methods are applied to data from the San Francisco parking pricing project, yielding intriguing discoveries and outperforming benchmark policies.
Low	GrooveSquid.com (original content)	Low Difficulty Summary This paper shows how computers can make good choices when they don’t know people’s individual preferences. The authors develop a new way of doing this using a type of machine learning called collaborative filtering. They test their approach on real-world data from San Francisco and find that it works well. This could be useful in many situations where people want to make collective decisions, but don’t have all the information they need.
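
To make the ingredients named in the medium summary more concrete, here is a minimal sketch of an ε-greedy bandit that learns one low-rank reward matrix per arm by online gradient descent and records inverse-propensity-weighted (IPW) reward averages. It is an illustration under stated assumptions, not the authors' algorithm: the two-arm setup, the synthetic reward matrices, the uniform sampling of entries, and every name and parameter (`epsilon_greedy`, `sgd_step`, `simulate`, `step`, `noise`) are hypothetical, and the paper's two-phase design and debiased estimator are not reproduced.

```python
import random


def epsilon_greedy(est_rewards, epsilon, rng):
    """Pick an arm; return (arm, probability with which that arm was picked)."""
    k = len(est_rewards)
    best = max(range(k), key=lambda a: est_rewards[a])
    arm = rng.randrange(k) if rng.random() < epsilon else best
    # The greedy arm is chosen with prob. 1 - eps + eps/k, any other arm with eps/k.
    prob = epsilon / k + (1.0 - epsilon if arm == best else 0.0)
    return arm, prob


def predict(U, V, i, j):
    """Entry (i, j) of the low-rank estimate U V^T."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def sgd_step(U, V, i, j, reward, step):
    """One gradient step on (prediction - reward)^2 / 2 for the observed entry."""
    err = predict(U, V, i, j) - reward
    u_old = list(U[i])
    U[i] = [u - step * err * v for u, v in zip(U[i], V[j])]
    V[j] = [v - step * err * u for u, v in zip(u_old, V[j])]


def simulate(rounds=5000, d=4, rank=1, epsilon=0.2, step=0.05, noise=0.1, seed=0):
    rng = random.Random(seed)
    scale = [0.4 + 0.2 * i for i in range(d)]
    # Synthetic ground truth (an assumption for this sketch): two rank-one matrices.
    truth = [
        [[0.5] * d for _ in range(d)],
        [[scale[i] * scale[j] for j in range(d)] for i in range(d)],
    ]
    # One randomly initialised factor pair (U, V) per arm.
    factors = [
        ([[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(d)],
         [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(d)])
        for _ in range(2)
    ]
    ipw_sum = [0.0, 0.0]
    for _ in range(rounds):
        i, j = rng.randrange(d), rng.randrange(d)  # entry that arrives this round
        est = [predict(U, V, i, j) for U, V in factors]
        arm, prob = epsilon_greedy(est, epsilon, rng)
        reward = truth[arm][i][j] + rng.gauss(0.0, noise)
        U, V = factors[arm]
        sgd_step(U, V, i, j, reward, step)  # only the pulled arm is updated
        ipw_sum[arm] += reward / prob       # generic IPW correction for selection
    return factors, [s / rounds for s in ipw_sum]


if __name__ == "__main__":
    factors, ipw_means = simulate()
    print("IPW estimates of each arm's average reward:", ipw_means)
```

Dividing each observed reward by the probability of the arm that produced it is the textbook IPW correction for adaptive data collection: the greedy arm is pulled far more often, and the weights compensate for that imbalance. Smaller values of `epsilon` reduce exploration but make the weights for the non-greedy arm larger and noisier, which is one simple view of the accuracy-versus-regret trade-off the summary mentions.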

Keywords

* Artificial intelligence
* Gradient descent
* Machine learning
