A Structure-Aware Framework for Learning Device Placements on Computation Graphs
by Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Panagiotis Kyriakis, Nesreen K. Ahmed, Peiyu Zhang, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan
First submitted to arXiv on: 23 May 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Performance (cs.PF)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract. |
Medium | GrooveSquid.com (original content) | The paper proposes a novel framework for device placement, a crucial step in optimizing neural networks across distributed devices. The framework combines techniques from grouper-placer and encoder-placer architectures and leverages smaller computation graphs extracted from the OpenVINO toolkit. It consists of five steps: graph coarsening, node representation learning, policy optimization, end-to-end training, and consideration of the DAG structure (see the sketches after this table). A model variant inspired by graph parsing networks and complex network analysis enables personalized graph partitioning with an unspecified number of groups. The framework is trained with reinforcement learning, using execution time as the reward. Experiments on three benchmark models (Inception-V3, ResNet, BERT) demonstrate its effectiveness, improving inference speed by up to 60.24% over CPU execution and common baselines. |
Low | GrooveSquid.com (original content) | The paper is about finding the best way to place parts of a neural network on different devices so it runs faster. It combines two existing approaches into a new one that is better at handling complex networks. The framework has five steps: making the graph smaller, learning how to represent nodes, optimizing the placement, training the whole thing together, and accounting for the structure of the graph. A special version of the framework can divide the graph into a flexible number of groups, personalized to each network. It was trained with reinforcement learning, where it gets feedback based on how fast the network runs. The results show that the new approach can make a network run up to 60% faster than standard approaches. |
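To make the graph-coarsening step more concrete, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes the computation graph is a networkx DiGraph and uses a simple hypothetical fusion rule (merging linear chains of operators into one group node) to shrink the placement search space.

```python
# Minimal graph-coarsening sketch. The chain-fusion rule below is a
# hypothetical stand-in for the paper's actual coarsening scheme.
import networkx as nx

def coarsen_linear_chains(g: nx.DiGraph) -> nx.DiGraph:
    """Contract every edge u->v where u has exactly one successor and v
    has exactly one predecessor, i.e. the two ops always run back to
    back, so they can be placed on the same device as a single group."""
    g = g.copy()
    changed = True
    while changed:
        changed = False
        for u, v in list(g.edges()):
            if u != v and g.out_degree(u) == 1 and g.in_degree(v) == 1:
                g = nx.contracted_nodes(g, u, v, self_loops=False)
                changed = True
                break  # edge list changed; rescan from the start
    return g

# Toy DAG: a -> b -> c -> d with a skip edge a -> d.
toy = nx.DiGraph([("a", "b"), ("b", "c"), ("c", "d"), ("a", "d")])
print(sorted(coarsen_linear_chains(toy).nodes()))  # ['a', 'b', 'd']: c fused into b
```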
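Similarly, the policy-optimization and end-to-end training steps can be illustrated with a short REINFORCE loop in PyTorch, where the reward is the negative execution time of the sampled placement. Everything here is a hedged sketch rather than the authors' implementation: `PlacementPolicy`, the random node embeddings, and `measure_execution_time` (which fakes a latency instead of timing real hardware) are hypothetical stand-ins.

```python
# REINFORCE sketch for device placement. The policy, node features, and
# reward function are hypothetical stand-ins, not the paper's code.
import torch
import torch.nn as nn

NUM_DEVICES, FEAT_DIM, NUM_NODES = 2, 16, 10

class PlacementPolicy(nn.Module):
    """Map each node embedding to a categorical distribution over devices."""
    def __init__(self):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(FEAT_DIM, 32), nn.ReLU(), nn.Linear(32, NUM_DEVICES)
        )

    def forward(self, node_feats):  # (NUM_NODES, FEAT_DIM)
        return torch.distributions.Categorical(logits=self.head(node_feats))

def measure_execution_time(placement):
    # Hypothetical latency model; the paper measures real inference runs.
    return 1.0 + 0.1 * placement.float().var()

policy = PlacementPolicy()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
node_feats = torch.randn(NUM_NODES, FEAT_DIM)  # stand-in for learned embeddings
baseline = None

for step in range(200):
    dist = policy(node_feats)
    placement = dist.sample()                    # one device id per node
    reward = -measure_execution_time(placement)  # faster placements score higher
    advantage = reward if baseline is None else reward - baseline
    baseline = reward if baseline is None else 0.9 * baseline + 0.1 * reward
    # REINFORCE: raise the log-probability of above-baseline placements.
    loss = -advantage * dist.log_prob(placement).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In the full framework described by the summary, the node embeddings would come from the representation-learning stage over the coarsened graph, so the encoder and placer train end to end against the same execution-time reward.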
Keywords
» Artificial intelligence » Bert » Encoder » Inference » Neural network » Optimization » Parsing » Reinforcement learning » Representation learning » Resnet