Summary of A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective, by Yunpeng Qing et al.
A2PO: Towards Effective Offline Reinforcement Learning from an Advantage-aware Perspective by Yunpeng Qing, Shunyu Liu, Jingyuan…