Summary of Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression, by Junyuan Hong et al.
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
by Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li
First submitted to arxiv on: 18 Mar 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same paper at a different level of difficulty: the medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary
---|---|---
High | Paper authors | Read the paper’s original abstract on its arXiv page.
Medium | GrooveSquid.com (original content) | The paper investigates how compressing Large Language Models (LLMs) affects their trustworthiness. While compression methods have improved performance on benign tasks, their risks to safety and trustworthiness have been largely overlooked. The study evaluates three leading LLMs using five state-of-the-art compression techniques across eight trustworthiness dimensions. It finds that quantization is currently more effective than pruning at achieving efficiency and trustworthiness simultaneously: 4-bit quantized models retain the trustworthiness of their full-precision originals, whereas pruning significantly degrades trustworthiness even at 50% sparsity (a minimal sketch contrasting the two approaches follows this table). The paper highlights the importance of comprehensive trustworthiness evaluation when pursuing high utility and efficiency in LLMs.
Low | GrooveSquid.com (original content) | Compressing big language models helps them run with less memory and energy. That is good for computers, but what does it mean for how trustworthy the models are? Researchers took three popular models, applied five ways of making them smaller, and tested them across eight areas that affect trustworthiness. They found that one approach, called quantization, works better than another, called pruning. For example, storing a model’s numbers with only 4 bits (quantization) didn’t hurt its trustworthiness, but deleting half of its weights (pruning) did. The study shows that we need to think about trustworthiness whenever we make these models more efficient.
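To make the quantization-versus-pruning contrast concrete, here is a minimal NumPy sketch of the two compression families the paper compares. This is an illustration under simplifying assumptions, not the authors’ pipeline: real LLM compression operates layer by layer on transformer weights with methods such as GPTQ or SparseGPT, whereas this sketch applies naive round-to-nearest 4-bit quantization and one-shot 50% magnitude pruning to a single random weight matrix.

```python
import numpy as np

def quantize_4bit(w: np.ndarray) -> np.ndarray:
    """Naive round-to-nearest 4-bit quantization (illustrative only).

    Maps weights onto 16 signed integer levels, then dequantizes back
    to float so the result can be compared with the original matrix.
    """
    scale = np.abs(w).max() / 7.0             # signed 4-bit range: -8..7
    codes = np.clip(np.round(w / scale), -8, 7)  # integer codes
    return codes * scale                       # dequantized weights

def prune_50pct(w: np.ndarray) -> np.ndarray:
    """One-shot magnitude pruning: zero out the 50% smallest weights."""
    threshold = np.median(np.abs(w))           # median splits weights 50/50
    return np.where(np.abs(w) >= threshold, w, 0.0)

rng = np.random.default_rng(0)
w = rng.normal(size=(256, 256)).astype(np.float32)  # stand-in weight matrix

for name, w_hat in [("4-bit quantized", quantize_4bit(w)),
                    ("50% pruned", prune_50pct(w))]:
    err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
    print(f"{name:>16}: relative reconstruction error = {err:.3f}")
```

Weight reconstruction error is only a rough proxy here; the paper’s point is precisely that such proxies do not predict trustworthiness, which is why its eight-dimension evaluation matters.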
Keywords
- Artificial intelligence
- Pruning
- Quantization