Summary of conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks, by Tahseen Rabbani et al.
conv_einsum: A Framework for Representation and Fast Evaluation of Multilinear Operations in Convolutional Tensorial Neural Networks
by Tahseen Rabbani, Jiahao Su, Xiaoyu Liu, David Chan, Geoffrey Sangston, Furong Huang
First submitted to arXiv on: 7 Jan 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here. |
| Medium | GrooveSquid.com (original content) | This paper introduces a framework for compactifying convolutional neural networks (ConvNets) while preserving their expressive power. The authors reshape ConvNets into tensorial neural networks (TNNs): higher-order tensorizations of layers followed by factorization. This lets them represent passes through TNNs as sequences of multilinear operations (MLOs). To evaluate these MLOs efficiently, the paper develops a meta-algorithm, conv_einsum, that minimizes floating-point operations (FLOPs) and memory usage; a code sketch of the underlying idea appears below the table. The authors demonstrate the effectiveness of their approach through comprehensive experiments across models, tensor decompositions, and tasks. |
| Low | GrooveSquid.com (original content) | This paper makes ConvNets more efficient by turning them into special kinds of networks called TNNs. The authors then figure out a way to quickly calculate what happens when you put an input through these TNNs, which makes the calculations faster and uses less memory. They tested their new method on many different models, ways of breaking down tensors, and types of tasks. It worked really well! |
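
To make the MLO idea concrete, here is a minimal Python sketch of why evaluation order matters for a multilinear operation. It does not use the paper's conv_einsum package; the factor names A, B, U, V, the rank R, and all sizes are illustrative assumptions. The sketch uses NumPy's built-in einsum path search, which plays the same FLOP-minimizing role for plain tensor contractions that the summary attributes to conv_einsum for TNN layers.

```python
import numpy as np

# Illustrative sizes (assumptions, not from the paper): a rank-R CP
# factorization of a C_out x C_in x k x k convolution kernel.
R, C_out, C_in, k = 16, 128, 64, 3
rng = np.random.default_rng(0)
A = rng.standard_normal((C_out, R))  # output-channel factor
B = rng.standard_normal((C_in, R))   # input-channel factor
U = rng.standard_normal((k, R))      # kernel-height factor
V = rng.standard_normal((k, R))      # kernel-width factor

# Reconstructing K[o,i,h,w] = sum_r A[o,r] B[i,r] U[h,r] V[w,r] is one
# multilinear operation. Its cost depends on the pairwise contraction
# order; np.einsum_path searches for a FLOP-minimizing order.
path, info = np.einsum_path("or,ir,hr,wr->oihw", A, B, U, V,
                            optimize="optimal")
print(info)  # reports the chosen order and its estimated FLOP count

K = np.einsum("or,ir,hr,wr->oihw", A, B, U, V, optimize=path)
print(K.shape)  # (128, 64, 3, 3)
```

Standard einsum notation covers only index-wise products and sums, so it cannot express convolution itself as a contraction mode; per the summary above, that is the gap conv_einsum addresses by optimizing evaluation order for sequences of multilinear operations that include convolutions.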