Summary of HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference, by Peng Tang et al.
HOBBIT: A Mixed Precision Expert Offloading System for Fast MoE Inference by Peng Tang, Jiacheng Liu,…
Online Relational Inference for Evolving Multi-agent Interacting Systems by Beomseok Kang, Priyabrata Saha, Sudarshan Sharma, Biswadeep…
Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning by Ziqing Fan, Shengchao Hu, Yuhang Zhou,…
Supervised Score-Based Modeling by Gradient Boosting by Changyuan Zhao, Hongyang Du, Guangyuan Liu, Dusit Niyato First submitted…
Hollowed Net for On-Device Personalization of Text-to-Image Diffusion Models by Wonguk Cho, Seokeon Choi, Debasmit Das,…
MESS+: Energy-Optimal Inferencing in Language Model Zoos with Service Level Guarantees by Ryan Zhang, Herbert Woisetschläger,…
A Theoretical Perspective for Speculative Decoding Algorithm by Ming Yin, Minshuo Chen, Kaixuan Huang, Mengdi Wang First…
Extralonger: Toward a Unified Perspective of Spatial-Temporal Factors for Extra-Long-Term Traffic Forecasting by Zhiwei Zhang, Shaojun…
Is Multiple Object Tracking a Matter of Specialization? by Gianluca Mancusi, Mattia Bernardi, Aniello Panariello, Angelo…
Dual Low-Rank Adaptation for Continual Learning with Pre-Trained Models by Huancheng Chen, Jingtao Li, Nidham Gazagnadou,…