Summary of RLSF: Reinforcement Learning via Symbolic Feedback, by Piyush Jha et al.
RLSF: Reinforcement Learning via Symbolic Feedback by Piyush Jha, Prithwish Jana, Pranavkrishna Suresh, Arnav Arora, Vijay…
LoQT: Low-Rank Adapters for Quantized Pretraining by Sebastian Loeschcke, Mads Toftrup, Michael J. Kastoryano, Serge Belongie,…
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer by Zhihan Liu,…
Multi-Reference Preference Optimization for Large Language Models by Hung Le, Quan Tran, Dung Nguyen, Kien Do,…
KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge by Pengcheng Jiang, Lang Cao, Cao Xiao, Parminder Bhatia,…
MindStar: Enhancing Math Reasoning in Pre-trained LLMs at Inference Time by Jikun Kang, Xin Zhe Li,…
A Second-Order Perspective on Model Compositionality and Incremental Learning by Angelo Porrello, Lorenzo Bonicelli, Pietro Buzzega,…
A transfer learning framework for weak-to-strong generalization by Seamus Somerstep, Felipe Maia Polo, Moulinath Banerjee, Ya'acov…
Feature Protection For Out-of-distribution Generalization by Lu Tan, Huei Zhou, Yinxiang Huang, Zeming Zheng, Yujiu Yang…
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models by Xudong Lu, Aojun Zhou, Yuhui Xu, Renrui…