Summary of COPR: Continual Human Preference Learning via Optimal Policy Regularization, by Han Zhang et al.
COPR: Continual Human Preference Learning via Optimal Policy Regularization by Han Zhang, Lin Gui, Yu Lei,…
Overcoming Saturation in Density Ratio Estimation by Iterated Regularization by Lukas Gruber, Markus Holzleitner, Johannes Lehner,…
Dealing with unbounded gradients in stochastic saddle-point optimization by Gergely Neu, Nneka Okolo. First submitted to arXiv…
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models by Chenyang Song, Xu Han,…
Causal hybrid modeling with double machine learning by Kai-Hendrik Cohrs, Gherardo Varando, Nuno Carvalhais, Markus Reichstein,…
Towards Robust Graph Incremental Learning on Evolving Graphs by Junwei Su, Difan Zou, Zijun Zhang, Chuan…
Improving Neural-based Classification with Logical Background Knowledge by Arthur Ledaguenel, Céline Hudelot, Mostepha Khouadjia. First submitted to…
A Bound on the Maximal Marginal Degrees of Freedom by Paul Dommel. First submitted to arXiv on:…
Structural Knowledge Informed Continual Multivariate Time Series Forecasting by Zijie Pan, Yushan Jiang, Dongjin Song, Sahil…
SDEs for Minimax Optimization by Enea Monzio Compagnoni, Antonio Orvieto, Hans Kersting, Frank Norbert Proske, Aurelien…