Summary of Robot Policy Learning with Temporal Optimal Transport Reward, by Yuwei Fu et al.
Robot Policy Learning with Temporal Optimal Transport Reward, by Yuwei Fu, Haichao Zhang, Di Wu, Wei…
Dimensionality-induced information loss of outliers in deep neural networks, by Kazuki Uematsu, Kosuke Haruki, Taiji Suzuki,…
Towards Multi-dimensional Explanation Alignment for Medical Classification, by Lijie Hu, Songning Lai, Wenshuo Chen, Hongru Xiao,…
L3Ms – Lagrange Large Language Models, by Guneet S. Dhillon, Xingjian Shi, Yee Whye Teh, Alex…
UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function, by Zhichao Wang,…
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models, by Weizhe Chen, Zhicheng Zhang, Guanlin…
Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models, by Weijian Luo, Colin Zhang, Debing Zhang, Zhengyang Geng. First…
Physics-informed Partitioned Coupled Neural Operator for Complex Networks, by Weidong Wu, Yong Zhang, Lili Hao, Yang…
Fidelity-Imposed Displacement Editing for the Learn2Reg 2024 SHG-BF Challenge, by Jiacheng Wang, Xiang Chen, Renjiu Hu,…
Shopping MMLU: A Massive Multi-Task Online Shopping Benchmark for Large Language Models, by Yilun Jin, Zheng…