Summary of LongReward: Improving Long-context Large Language Models with AI Feedback, by Jiajie Zhang et al.
LongReward: Improving Long-context Large Language Models with AI Feedback, by Jiajie Zhang, Zhongni Hou, Xin Lv,…
Capacity-Aware Planning and Scheduling in Budget-Constrained Monotonic MDPs: A Meta-RL Approach, by Manav Vora, Ilan Shomorony,…
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference, by Changwoo Lee, Soo Min…
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models, by Weijian Luo, Colin Zhang, Debing Zhang, Zhengyang Geng. First…
Constrained Optimal Fuel Consumption of HEV: Considering the Observational Perturbation, by Shuchang Yan, Haoran Sun. First submitted to…
FACTS: A Factored State-Space Framework For World Modelling, by Li Nanbo, Firas Laakom, Yucheng Xu, Wenyi…
Neural Hamilton: Can A.I. Understand Hamiltonian Mechanics?, by Tae-Geun Kim, Seong Chan Park. First submitted to arxiv…
Neuro-symbolic Learning Yielding Logical Constraints, by Zenan Li, Yunpeng Huang, Zhaoyu Li, Yuan Yao, Jingwei Xu,…
DeTeCtive: Detecting AI-generated Text via Multi-Level Contrastive Learning, by Xun Guo, Shan Zhang, Yongxin He, Ting…
BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks, by Yunhan Zhao, Xiang Zheng, Lin…