Summary of Priority Sampling of Large Language Models for Compilers, by Dejan Grubisic et al.
Priority Sampling of Large Language Models for Compilers
by Dejan Grubisic, Chris Cummins, Volker Seeker, Hugh Leather
First submitted to arXiv on: 28 Feb 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Computation and Language (cs.CL); Performance (cs.PF)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Large language models show great potential for generating and optimizing code. However, widely used sampling methods such as Nucleus Sampling have limitations: at low temperatures they produce repeated samples, while at high temperatures they produce incoherent ones. Moreover, the temperature coefficient must be tuned for each task, which limits usability. To address these issues, the authors introduce Priority Sampling, a simple, deterministic technique that generates unique samples ordered by the model’s confidence. It also supports generation constrained by regular expressions, which makes the exploration controllable and structured. In the experiments, Priority Sampling outperforms Nucleus Sampling for any number of samples, raising the original model’s improvement over the compiler’s -Oz optimization level from 2.87% to 5%. It also surpasses the autotuner that generated the labels used to train the original model, using just 30 samples. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper is about how computers can help write and improve code. Existing methods for making a model’s suggestions more varied have drawbacks: they can produce the same result over and over, or produce results that make no sense. To fix this, the authors came up with a new method called Priority Sampling. It generates unique code suggestions, starting with the ones the model is most confident about. It also allows structured exploration using regular expressions. The results show that Priority Sampling works better than the existing Nucleus Sampling method and raises the model’s improvement over the compiler’s -Oz setting from 2.87% to 5%. |
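
To make the idea of "unique samples ordered by the model's confidence" more concrete, here is a minimal Python sketch. It is not the authors' implementation: `toy_model`, `priority_sample`, and `EOS` are hypothetical names, the toy probability table stands in for a real language model, and ranking branches by the joint probability of their prefix is an assumption made for illustration. Regular-expression constraints are left out.

```python
import heapq
from typing import Callable, Dict, List, Tuple

EOS = "<eos>"  # hypothetical end-of-sequence token
Prefix = Tuple[str, ...]


def toy_model(prefix: Prefix) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM: next-token probabilities for a prefix."""
    table = {
        (): {"a": 0.6, "b": 0.4},
        ("a",): {"x": 0.7, "y": 0.3},
        ("b",): {"x": 0.5, "y": 0.5},
    }
    # Any prefix not in the table simply ends the sequence.
    return table.get(prefix, {EOS: 1.0})


def priority_sample(
    model: Callable[[Prefix], Dict[str, float]],
    num_samples: int,
    max_len: int = 8,
) -> List[Tuple[Prefix, float]]:
    """Return up to num_samples unique sequences, most confident branch first."""
    # heapq is a min-heap, so probabilities are negated.
    # The counter breaks ties, which keeps the order fully deterministic.
    heap: List[Tuple[float, int, Prefix]] = [(-1.0, 0, ())]
    counter = 1
    samples: List[Tuple[Prefix, float]] = []
    while heap and len(samples) < num_samples:
        neg_prob, _, prefix = heapq.heappop(heap)
        prob = -neg_prob
        seq = list(prefix)
        # Decode greedily from this prefix and queue every branch not taken.
        while len(seq) < max_len and (not seq or seq[-1] != EOS):
            dist = model(tuple(seq))
            ranked = sorted(dist.items(), key=lambda kv: (-kv[1], kv[0]))
            best_tok, best_p = ranked[0]
            for tok, p in ranked[1:]:
                # Assumption: priority is the joint probability of the prefix.
                heapq.heappush(heap, (-(prob * p), counter, tuple(seq) + (tok,)))
                counter += 1
            seq.append(best_tok)
            prob *= best_p
        samples.append((tuple(seq), prob))
    return samples


if __name__ == "__main__":
    for sequence, probability in priority_sample(toy_model, 4):
        print(sequence, round(probability, 3))
```

In this sketch the first sample is the ordinary greedy decode, and each later sample starts from the most probable branch that has not been explored yet. No temperature is involved, so the same call always returns the same samples, and no sequence is produced twice.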
Keywords
* Artificial intelligence
* Temperature