
Summary of PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms, by Yilong Li et al.


PalmBench: A Comprehensive Benchmark of Compressed Large Language Models on Mobile Platforms

by Yilong Li, Jingyu Liu, Hao Zhang, M Badri Narayanan, Utkarsh Sharma, Shuai Zhang, Pan Hu, Yijing Zeng, Jayaram Raghuram, Suman Banerjee

First submitted to arXiv on: 5 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, each written at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper presents a novel approach for evaluating large language models (LLMs) on mobile devices, addressing the challenge of balancing output quality, latency, and throughput within tight hardware constraints. The authors introduce a lightweight, automated benchmarking framework that lets users evaluate various LLMs under different quantization configurations across multiple mobile platforms. Using this framework, they benchmark popular LLMs with a focus on resource efficiency (memory and power consumption) and on harmful output from compressed models. The results cover differences in energy efficiency and throughput across mobile platforms; the impact of quantization on memory usage, GPU execution time, and power consumption; the accuracy and performance degradation of quantized models relative to their non-quantized counterparts; and the frequency of hallucinations and toxic content generated by compressed LLMs. (A minimal measurement sketch in this spirit follows the summaries below.)

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about making sure that big language models work well on phones. This matters because sometimes we don’t want to send our data to a remote server, whether for privacy or connectivity reasons. The authors built a tool to test these models on different kinds of phones and see how they perform. They measured how much energy each model uses, how fast it runs, and whether it makes mistakes. They found that the models behave differently on different phones, and some are better than others. This research is useful for anyone who wants to run language models on their phone.

Keywords

* Artificial intelligence
* Quantization