Summary of Priority Sampling Of Large Language Models For Compilers, by Dejan Grubisic et al.


Priority Sampling of Large Language Models for Compilers

by Dejan Grubisic, Chris Cummins, Volker Seeker, Hugh Leather

First submitted to arxiv on: 28 Feb 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Performance (cs.PF)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
Large language models have shown great potential for generating and optimizing code. However, widely used sampling methods such as Nucleus Sampling have limitations: at low temperatures they tend to repeat the same samples, while at high temperatures they produce incoherent ones. The temperature coefficient also has to be tuned for each task, which limits usability. To address these issues, the authors introduce Priority Sampling, a simple, deterministic technique that produces unique samples ordered by the model’s confidence. It also supports generation constrained by regular expressions, which makes exploration controllable and structured. In the experiments, Priority Sampling outperforms Nucleus Sampling for any number of samples, raising the original model’s improvement over -Oz from 2.87% to 5%. It also surpasses the autotuner used to generate training labels for the original model within just 30 samples.
Low GrooveSquid.com (original content) Low Difficulty Summary
This paper is about getting computers to write better code. Today’s common ways of drawing several answers from a language model have drawbacks: depending on the settings, they either repeat the same answer or produce answers that make no sense. To fix this, the authors propose a new method called Priority Sampling. It produces a different answer every time, starting with the ones the model is most confident in. It can also restrict answers to a pattern given as a regular expression, so the search stays structured. The results show that Priority Sampling works better than Nucleus Sampling and lifts the model’s improvement over the compiler’s -Oz setting from 2.87% to 5%.
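
To make the idea of "unique samples ordered by the model’s confidence" more concrete, here is a minimal Python sketch. It is not the authors’ implementation. It assumes one plausible reading of the summaries above: the first sample is plain greedy decoding, and each later sample branches off at the most probable token that has not been explored yet, then continues greedily. The toy model `toy_next_token_probs`, the function `priority_sample`, and the choice of priority (cumulative log-probability at the branch point) are all hypothetical and exist only for illustration.

```python
import heapq
import math

EOS = "<eos>"

def toy_next_token_probs(prefix):
    """Hypothetical stand-in for a language model: prefix -> {token: prob}."""
    if len(prefix) >= 2:
        return {EOS: 1.0}
    if len(prefix) == 0:
        return {"x": 0.6, "y": 0.4}
    return {"x": 0.7, "y": 0.3}

def priority_sample(next_probs, num_samples, max_len=8):
    """Deterministically return up to num_samples unique sequences.

    Assumption of this sketch: an unexplored branch is ranked by the
    cumulative log-probability of its prefix plus its first token.
    """
    samples = []
    counter = 0  # tie-breaker so the heap never compares prefixes
    # Heap entries: (negative log-prob, tie-breaker, prefix to continue from)
    heap = [(0.0, counter, ())]
    while heap and len(samples) < num_samples:
        neg_lp, _, prefix = heapq.heappop(heap)
        seq, lp = list(prefix), -neg_lp
        # Continue greedily, remembering every alternative we skip.
        while len(seq) < max_len and (not seq or seq[-1] != EOS):
            probs = next_probs(tuple(seq))
            ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
            best_tok, best_p = ranked[0]
            for tok, p in ranked[1:]:
                if p > 0:
                    counter += 1
                    branch = tuple(seq) + (tok,)
                    heapq.heappush(heap, (-(lp + math.log(p)), counter, branch))
            seq.append(best_tok)
            lp += math.log(best_p)
        samples.append((tuple(seq), lp))
    return samples

if __name__ == "__main__":
    for seq, lp in priority_sample(toy_next_token_probs, 10):
        print(" ".join(seq), round(math.exp(lp), 2))
```

With this toy model only four distinct sequences exist, so asking for ten samples returns four, starting with the greedy one (`x x <eos>`) and followed by `y x <eos>`, `x y <eos>`, and `y y <eos>`. No temperature is involved and no sequence is ever repeated, which is the contrast with Nucleus Sampling that the summaries describe. A regular-expression constraint could be approximated by filtering the tokens in `ranked` before branching, but that is left out here.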

Keywords

  • Artificial intelligence
  • Temperature