PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms
by Yilong Li, Jingyu Liu, Hao Zhang, M Badri Narayanan, Utkarsh Sharma, Shuai Zhang, Pan Hu, Yijing Zeng, Jayaram Raghuram, Suman Banerjee
First submitted to arXiv on: 5 Oct 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper presents a novel approach for evaluating large language models (LLMs) on mobile devices, addressing the challenge of balancing output quality, latency, and throughput within tight hardware constraints. The authors introduce a lightweight, automated benchmarking framework that lets users evaluate various LLMs under different quantization configurations across multiple mobile platforms (a minimal sketch of such a measurement loop follows this table). They provide a comprehensive benchmark of popular LLMs, focusing on resource efficiency (memory and power consumption) and on harmful output from compressed models. The results cover: differences in energy efficiency and throughput across mobile platforms; the impact of quantization on memory usage, GPU execution time, and power consumption; the accuracy and performance degradation of quantized models relative to their non-quantized counterparts; and the frequency of hallucinations and toxic content generated by compressed LLMs. |
Low | GrooveSquid.com (original content) | This paper is about making sure that big language models work well on phones. This matters because sometimes we don’t want to send our data to a remote server, for privacy or connectivity reasons. The authors created a special tool to test these models on different kinds of phones and see how they perform. They looked at how much energy each model uses, how fast it works, and whether it makes mistakes. They found that the models behave differently on different phones, and some are better than others. This research is useful for anyone who wants to run language models on their phone. |
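To make the kind of measurement described in the medium summary concrete, below is a minimal sketch of a throughput-and-memory measurement loop. This is not the authors' framework: `generate` is a hypothetical callable standing in for any local LLM runtime (for example, one serving a quantized GGUF model), and `psutil` is an assumed dependency used to read resident memory.

```python
import time
import psutil  # assumed dependency: pip install psutil

def benchmark_generation(generate, prompt, n_runs=5):
    """Time token throughput and sample resident memory around a model call.

    `generate` is a hypothetical callable: it takes a prompt string and
    returns (text, n_generated_tokens). Swap in your own runtime's API.
    """
    proc = psutil.Process()
    runs = []
    for _ in range(n_runs):
        rss_before = proc.memory_info().rss
        start = time.perf_counter()
        _text, n_tokens = generate(prompt)
        elapsed = time.perf_counter() - start
        rss_after = proc.memory_info().rss
        runs.append({
            "tokens_per_sec": n_tokens / elapsed,  # decode throughput
            "latency_sec": elapsed,                # end-to-end latency
            "rss_delta_mb": (rss_after - rss_before) / 2**20,  # memory growth
        })
    return runs
```

Power and energy measurement, which the paper also covers, is platform-specific (e.g., on-device power rails or vendor profiling tools) and is not shown in this sketch.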
Keywords
* Artificial intelligence
* Quantization