Summary of GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory, by Haoze Wu et al.
GW-MoE: Resolving Uncertainty in MoE Router with Global Workspace Theory by Haoze Wu, Zihan Qiu, Zili…
WebCanvas: Benchmarking Web Agents in Online Environments by Yichen Pan, Dehan Kong, Sida Zhou, Cheng Cui,…
QOG: Question and Options Generation based on Language Model by Jincheng Zhou. First submitted to arXiv on: 18…
TroL: Traversal of Layers for Large Language and Vision Models by Byung-Kwan Lee, Sangyun Chung, Chae…
Not Eliminate but Aggregate: Post-Hoc Control over Mixture-of-Experts to Address Shortcut Shifts in Natural Language…
Multi-Dimensional Pruning: Joint Channel, Layer and Block Pruning with Latency Constraint by Xinglong Sun, Barath Lakshmanan,…
Adding Conditional Control to Diffusion Models with Reinforcement Learning by Yulai Zhao, Masatoshi Uehara, Gabriele Scalia,…
Soft Prompting for Unlearning in Large Language Models by Karuna Bhaila, Minh-Hao Van, Xintao Wu. First submitted…
Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning by Hui Liu, Wenya Wang, Hao…
A Semantic-Aware Layer-Freezing Approach to Computation-Efficient Fine-Tuning of Language Models by Jian Gu, Aldeida Aleti, Chunyang…