Summary of Tiled Bit Networks: Sub-bit Neural Network Compression Through Reuse Of Learnable Binary Vectors, by Matt Gorbett et al.
Tiled Bit Networks: Sub-Bit Neural Network Compression Through Reuse of Learnable Binary Vectors
by Matt Gorbett, Hossein Shirazi, Indrakshi Ray
First submitted to arXiv on: 16 Jul 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | Binary Neural Networks (BNNs) enable efficient deep learning by reducing storage and computational costs. However, as neural networks grow in size, meeting their computational requirements remains a challenge. To address this, the authors propose a new form of quantization that tiles neural network layers with sequences of bits, achieving sub-bit compression of binary-weighted neural networks. The method learns binary vectors (tiles) that populate each layer via aggregation and reshaping operations; during inference, a single tile per layer is reused to represent the full tensor (see the sketch after this table). The approach is applied to both fully-connected and convolutional layers, which account for most of the parameters in common neural architectures. Empirically, it achieves near-full-precision performance across diverse architectures (CNNs, Transformers, MLPs) and tasks (classification, segmentation, and time series forecasting) with up to an 8x reduction in size compared to binary-weighted models. |
Low | GrooveSquid.com (original content) | This paper proposes a new way to make neural networks smaller and faster. Neural networks are important for AI, but they can be very big and take up lots of computer memory and processing power. The researchers shrink them by learning a small pattern of bits, called a “tile,” and reusing that one pattern over and over to fill in each layer of the network. This makes it possible to use much less memory and processing power while still getting almost the same results as before. The new method works for different types of neural networks (like models for image recognition or time series forecasting) and can even run on small computers like microcontrollers. |
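
To make the tiling idea in the medium summary more concrete, here is a minimal NumPy sketch (not the authors' implementation): it repeats a single binary tile and reshapes the result to stand in for a full weight tensor at inference time. The function name `expand_tile`, the tile length, and the layer shape are illustrative assumptions.

```python
# Minimal sketch of reusing one learned binary tile per layer.
# All names, tile lengths, and layer shapes are illustrative assumptions.
import numpy as np

def expand_tile(tile: np.ndarray, target_shape: tuple) -> np.ndarray:
    """Repeat a 1-D binary tile (values in {-1, +1}) until it covers
    the number of elements in `target_shape`, then reshape it."""
    n_elements = int(np.prod(target_shape))
    repeats = -(-n_elements // tile.size)       # ceiling division
    flat = np.tile(tile, repeats)[:n_elements]  # aggregate, then truncate
    return flat.reshape(target_shape)           # reshape into the layer tensor

# Example: a 64-bit tile standing in for a 256x128 fully-connected layer.
rng = np.random.default_rng(0)
tile = rng.choice([-1.0, 1.0], size=64)         # would be learned in training
weights = expand_tile(tile, (256, 128))

x = rng.standard_normal(128)
y = weights @ x                                 # forward pass with tiled weights
print(weights.shape, y.shape)                   # (256, 128) (256,)
```

In this sketch, only the 64-bit tile and each layer's shape would need to be stored rather than the full weight tensor, which is where the sub-bit-per-weight storage cost described in the summaries comes from.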
Keywords
» Artificial intelligence » Classification » Deep learning » Inference » Neural network » Precision » Quantization » Time series » Translation