
Convex Distillation: Efficient Compression of Deep Networks via Convex Optimization

by Prateek Varshney, Mert Pilanci

First submitted to arXiv on: 9 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: None


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com; original content)
This paper addresses the challenge of deploying complex neural networks on resource-constrained edge devices. Traditional compression methods such as distillation and pruning typically retain the model's non-convexity, which makes real-time fine-tuning on such devices difficult. The authors introduce a novel distillation technique that compresses the model efficiently via convex optimization, eliminating intermediate non-convex activation functions. The approach operates in a label-free data setting and achieves performance comparable to the original model without any post-compression fine-tuning. The method is demonstrated on image classification models across multiple standard datasets and outperforms non-convex distillation approaches in data-limited regimes. This makes it a practical choice for deploying high-efficiency, low-footprint models on edge devices in real-world applications.
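To make the idea concrete, below is a minimal, hypothetical Python sketch of label-free distillation posed as a convex problem: a small linear student is fit to a frozen teacher's outputs on unlabeled data by ridge regression, which is convex and has a closed-form solution. This is only an illustration under those assumptions; the paper's actual convex reformulation of the compressed network is more involved, and names such as `distill_convex` and `teacher_fn` are invented for the example.

```python
# Hypothetical sketch: label-free distillation as a convex (ridge-regression) fit.
# Not the authors' exact formulation; it only illustrates fitting a compressed
# student to a frozen teacher's outputs on unlabeled data by convex optimization.
import numpy as np

def distill_convex(teacher_fn, X_unlabeled, reg=1e-3):
    """Solve min_W ||X W - teacher_fn(X)||^2 + reg * ||W||^2 in closed form."""
    Y = teacher_fn(X_unlabeled)                         # teacher outputs act as targets (no labels needed)
    d = X_unlabeled.shape[1]
    A = X_unlabeled.T @ X_unlabeled + reg * np.eye(d)   # regularized normal equations
    return np.linalg.solve(A, X_unlabeled.T @ Y)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in "teacher": a fixed random two-layer ReLU network.
    W1, W2 = rng.normal(size=(32, 64)), rng.normal(size=(64, 10))
    teacher = lambda X: np.maximum(X @ W1, 0.0) @ W2
    X = rng.normal(size=(500, 32))                      # unlabeled data
    W_student = distill_convex(teacher, X)
    err = np.linalg.norm(X @ W_student - teacher(X)) / np.linalg.norm(teacher(X))
    print(f"relative fit error of compressed student: {err:.3f}")
```

Because the fit is a convex problem, the compressed student can be re-solved quickly whenever new unlabeled data arrives, which is the kind of cheap, fine-tuning-free adaptation the paper targets for edge deployment.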
Low Difficulty Summary (written by GrooveSquid.com; original content)
This paper is about making computer vision algorithms work better on small devices like smartphones or smart cameras. These devices don’t have enough power to run complex computer programs, so we need new ways to make them work faster and more efficiently. The authors created a new way to shrink down big neural networks into smaller ones that still work well without needing lots of data to train them. They tested this method on several popular datasets and showed it works better than older methods in some cases. This could be very useful for using these algorithms in real-world applications like self-driving cars or surveillance cameras.

Keywords

» Artificial intelligence  » Distillation  » Fine tuning  » Image classification  » Optimization  » Pruning