Summary of Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference, by Zongyue Qin et al.
Optimized Multi-Token Joint Decoding with Auxiliary Model for LLM Inference by Zongyue Qin, Ziniu Hu, Zifan…
Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients by Zhenyu Zhang, Ajay Jaiswal, Lu…
RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization by Xijie Huang, Zechun Liu, Shih-Yang Liu,…
ConvNLP: Image-based AI Text Detection by Suriya Prakash Jambunathan, Ashwath Shankarnarayan, Parijat Dube. First submitted to arXiv…
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models by Nandini Mundra, Aditya…
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation by Yi-Chen Li, Fuxiang Zhang, Wenjie…
Self-Evaluation as a Defense Against Adversarial Attacks on LLMs by Hannah Brown, Leon Lin, Kenji Kawaguchi,…
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs by John Dang, Arash Ahmadian,…
Enhancing Stability for Large Language Models Training in Constrained Bandwidth Networks by Yun Dai, Tejas Dharamsi,…
Badllama 3: removing safety finetuning from Llama 3 in minutes by Dmitrii Volkov. First submitted to arXiv…