Summary of Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models, by Yongxin Guo et al.
Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models, by Yongxin Guo, Zhenglin Cheng,…
Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models, by Yuchen Hu, Chen Chen, Chao-Han Huck…
Understanding the Training and Generalization of Pretrained Transformer for Sequential Decision Making, by Hanzhao Wang, Yu…
Text-to-Model: Text-Conditioned Neural Network Diffusion for Train-Once-for-All Personalization, by Zexi Li, Lingzhi Gao, Chao Wu
Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning, by Jiuqi Wang, Ethan Blaser, Hadi…
Scaling-laws for Large Time-series Models, by Thomas D. P. Edwards, James Alvey, Justin Alsing, Nam H.…
A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation, by Gwanghyun Kim, Alonso…
Advancing Graph Convolutional Networks via General Spectral Wavelets, by Nian Liu, Xiaoxin He, Thomas Laurent, Francesco…
Leveraging 2D Information for Long-term Time Series Forecasting with Vanilla Transformers, by Xin Cheng, Xiuying Chen,…
A Transformer variant for multi-step forecasting of water level and hydrometeorological sensitivity analysis based on…