Summary of MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs, by Ziheng Jiang et al.
MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs by Ziheng Jiang, Haibin Lin,…
Language-Based User Profiles for Recommendation by Joyce Zhou, Yijia Dai, Thorsten Joachims. First submitted to arXiv on:…
Learning Cyclic Causal Models from Incomplete Data by Muralikrishnna G. Sethuraman, Faramarz Fekri. First submitted to arXiv…
Smooth and Sparse Latent Dynamics in Operator Learning with Jerk Regularization by Xiaoyu Xie, Saviz Mowlavi,…
Fair Resource Allocation in Multi-Task Learning by Hao Ban, Kaiyi Ji. First submitted to arXiv on: 23…
Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications by Zihan Zhou, Jonathan Booher, Khashayar…
Contact Complexity in Customer Service by Shu-Ting Pi, Michael Yang, Qun Liu. First submitted to arXiv on:…
Learning Semilinear Neural Operators : A Unified Recursive Framework For Prediction And Data Assimilation by Ashutosh…
Teacher-Student Learning on Complexity in Intelligent Routing by Shu-Ting Pi, Michael Yang, Yuying Zhu, Qun Liu. First…
Scalable Density-based Clustering with Random Projections by Haochuan Xu, Ninh Pham. First submitted to arXiv on: 24…