Summary of Optimization and Scalability of Collaborative Filtering Algorithms in Large Language Models, by Haowei Yang et al.
Optimization and Scalability of Collaborative Filtering Algorithms in Large Language Models, by Haowei Yang, Longfei Yun,…
Compression for Better: A General and Stable Lossless Compression Framework, by Boyang Zhang, Daning Cheng, Yunquan…
LLMCBench: Benchmarking Large Language Model Compression for Efficient Deployment, by Ge Yang, Changyi He, Jinyang Guo,…
EoRA: Training-free Compensation for Compressed LLM with Eigenspace Low-Rank Approximation, by Shih-Yang Liu, Maksim Khadkevich, Nai…
Fine-Tuning and Deploying Large Language Models Over Edges: Issues and Approaches, by Yanjie Dong, Haijun Zhang,…
Q-DiT: Accurate Post-Training Quantization for Diffusion Transformers, by Lei Chen, Yuan Meng, Chen Tang, Xinzhu Ma,…
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging, by Deyuan Liu, Zhanyue Qin,…
Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer, by Huihong Shi, Haikuo Shao, Wendong…
On Linearizing Structured Data in Encoder-Decoder Language Models: Insights from Text-to-SQL, by Yutong Shao, Ndapa Nakashole…
Streamlining Redundant Layers to Compress Large Language Models, by Xiaodong Chen, Yuxuan Hu, Jing Zhang, Yanling…