Summary of Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks, by Graziano A. Manduzio et al.
Improving Small-Scale Large Language Models Function Calling for Reasoning Tasks by Graziano A. Manduzio, Federico A.…
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up by Jiahao Yuan, Dehui…
Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference by William Thorne,…
The Accuracy Paradox in RLHF: When Better Reward Models Don’t Yield Better Language Models by Yanjun…
CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning by Huimu Yu, Xing Wu, Weidong…
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models by Angela…
The Phenomenology of Machine: A Comprehensive Analysis of the Sentience of the OpenAI-o1 Model Integrating…
Just Say What You Want: Only-prompting Self-rewarding Online Preference Optimization by Ruijie Xu, Zhihan Liu, Yongfei…
Post-hoc Reward Calibration: A Case Study on Length Bias by Zeyu Huang, Zihan Qiu, Zili Wang,…