Summary of Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs, by Kazuki Fujii et al.
Balancing Speed and Stability: The Trade-offs of FP8 vs. BF16 Training in LLMs by Kazuki Fujii, Taishi…
OWLed: Outlier-Weighed Layerwise Pruning for Efficient Autonomous Driving Framework by Jiaxi Li, Lu Yin, Xilu Wang. First…
Synthesize, Partition, then Adapt: Eliciting Diverse Samples from Foundation Models by Yeming Wen, Swarat Chaudhuri. First submitted…
HourVideo: 1-Hour Video-Language Understanding by Keshigeyan Chandrasegaran, Agrim Gupta, Lea M. Hadzic, Taran Kota, Jimming He,…
Robust and Efficient Fine-tuning of LLMs with Bayesian Reparameterization of Low-Rank Adaptation by Ayan Sengupta, Vaibhav…
UniGuard: Towards Universal Safety Guardrails for Jailbreak Attacks on Multimodal Large Language Models by Sejoon Oh,…
Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior by Mingxuan Zhang, Yan…
LLaMo: Large Language Model-based Molecular Graph Assistant by Jinyoung Park, Minseong Bae, Dohwan Ko, Hyunwoo J.…
Beyond Autoregression: Fast LLMs via Self-Distillation Through Time by Justin Deschenaux, Caglar Gulcehre. First submitted to arXiv…
Improving Multimodal Large Language Models Using Continual Learning by Shikhar Srivastava, Md Yousuf Harun, Robik Shrestha,…