Summary of VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models, by Yifei Liu et al.
VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models, by Yifei Liu, Jicheng Wen, Yang…