Summary of SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe, by Yuxin Xiao et al.
SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe by Yuxin Xiao, Shujian Zhang, Wenxuan Zhou,…
Differential Transformer by Tianzhu Ye, Li Dong, Yuqing Xia, Yutao Sun, Yi Zhu, Gao Huang, Furu…
DEPT: Decoupled Embeddings for Pre-training Language Models by Alex Iacob, Lorenzo Sani, Meghdad Kurmanji, William F.…
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention by Lijie Yang, Zhihao Zhang,…
SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks by Fenia Christopoulou, Ronald Cardenas, Gerasimos…
Timer-XL: Long-Context Transformers for Unified Time Series Forecasting by Yong Liu, Guo Qin, Xiangdong Huang, Jianmin…
TableRAG: Million-Token Table Understanding with Language Models by Si-An Chen, Lesly Miculicich, Julian Martin Eisenschlos, Zifeng…
TLDR: Token-Level Detective Reward Model for Large Vision Language Models by Deqing Fu, Tong Xiao, Rui…
Hyperbolic Fine-tuning for Large Language Models by Menglin Yang, Aosong Feng, Bo Xiong, Jihong Liu, Irwin…
Detecting Machine-Generated Long-Form Content with Latent-Space Variables by Yufei Tian, Zeyu Pan, Nanyun Peng. First submitted to…