Summary of Entropy Controllable Direct Preference Optimization, by Motoki Omura et al.
Entropy Controllable Direct Preference Optimization, by Motoki Omura, Yasuhiro Fujita, Toshiki Kataoka. First submitted to arXiv on:…
Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling, by Maria Zampella, Urtzi Otamendi, Xabier Belaunzaran,…
Efficient Adaptive Optimization via Subset-Norm and Subspace-Momentum: Fast, Memory-Reduced Training with Convergence Guarantees, by Thien Hang…
WassFFed: Wasserstein Fair Federated Learning, by Zhongxuan Han, Li Zhang, Chaochao Chen, Xiaolin Zheng, Fei Zheng,…
Non-Adversarial Inverse Reinforcement Learning via Successor Feature Matching, by Arnav Kumar Jain, Harley Wiltzer, Jesse Farebrother,…
General framework for online-to-nonconvex conversion: Schedule-free SGD is also effective for nonconvex optimization, by Kwangjun Ahn,…
Scientific machine learning in ecological systems: A study on the predator-prey dynamics, by Ranabir Devgupta, Raj…
Neuromodulated Meta-Learning, by Jingyao Wang, Huijie Guo, Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Hui Xiong, Gang…
Meta-Learning Objectives for Preference Optimization, by Carlo Alfano, Silvia Sapora, Jakob Nicolaus Foerster, Patrick Rebeschini, Yee…
An Energy-Based Self-Adaptive Learning Rate for Stochastic Gradient Descent: Enhancing Unconstrained Optimization with VAV method, by…