Summary of Online Poisoning Attack Against Reinforcement Learning Under Black-box Environments, by Jianhui Li et al.
Online Poisoning Attack Against Reinforcement Learning under Black-box Environments
by Jianhui Li, Bokang Zhang, Junfeng Wu
First submitted to arXiv on: 1 Dec 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Cryptography and Security (cs.CR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the original abstract here
Medium | GrooveSquid.com (original content) | This paper proposes an online environment-poisoning algorithm against reinforcement learning agents trained in black-box settings, where an adversary manipulates the training data to steer the agent toward a malicious policy. Targeting settings with unknown environment dynamics and flexible choices of learning algorithm, the paper first introduces an attack scheme that poisons both reward signals and state transitions, then formalizes the attack as a constrained optimization problem in the framework of [Ma et al., 2019]. Because transition probabilities are unknown in a black-box environment, exact gradients are unavailable, so the optimization is solved with stochastic gradient descent using sample-based gradient estimates. The attack's effectiveness is validated through experiments in a maze environment.
Low | GrooveSquid.com (original content) | This paper shows how an attacker can manipulate training data to make a reinforcement learning agent do the wrong thing. Here the agent is trying to learn to navigate a maze, while the attacker tries to trick it into taking the wrong path. The researchers propose a new way to mount this attack and show that it works in a simulated maze.
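The key numerical idea in the medium-difficulty summary is optimizing a poisoning objective when exact gradients are unavailable, using stochastic gradient descent on sample-based gradient estimates. The sketch below is a hedged illustration of that general idea, not the paper's algorithm: it uses a hypothetical quadratic attack loss `noisy_objective` (standing in for the attacker's true objective over reward/transition perturbations) and a standard two-point zeroth-order gradient estimator built purely from noisy function evaluations.

```python
import numpy as np

# Illustrative sketch only: SGD with sample-based gradient estimates, as one
# would use when the environment is a black box and exact gradients of the
# attack objective cannot be computed. The objective, target perturbation,
# and step sizes below are all hypothetical placeholders.

rng = np.random.default_rng(0)
target = np.array([1.0, -2.0])  # hypothetical optimal poisoning perturbation

def noisy_objective(delta):
    """Black-box, noisy evaluation of a toy attack loss ||delta - target||^2."""
    return np.sum((delta - target) ** 2) + 0.01 * rng.standard_normal()

def two_point_gradient(delta, mu=1e-2):
    """Zeroth-order (two-point) gradient estimate from samples alone:
    g = [J(delta + mu*u) - J(delta - mu*u)] / (2*mu) * u, with u ~ N(0, I)."""
    u = rng.standard_normal(delta.shape)
    diff = noisy_objective(delta + mu * u) - noisy_objective(delta - mu * u)
    return diff / (2 * mu) * u

# Plain SGD loop driven only by sampled objective values.
delta = np.zeros(2)
for _ in range(5000):
    delta -= 0.01 * two_point_gradient(delta)

print(np.round(delta, 1))
```

Under these toy assumptions the iterate drifts toward the target perturbation even though no analytic gradient is ever formed, which is the mechanism the summary describes for handling unknown transition probabilities.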
Keywords
» Artificial intelligence » Machine learning » Optimization » Reinforcement learning » Stochastic gradient descent