


Learning Code Preference via Synthetic Evolution

by Jiawei Liu, Thanh Nguyen, Mingyue Shang, Hantian Ding, Xiaopeng Li, Yu Yu, Varun Kumar, Zijian Wang

First submitted to arXiv on: 4 Oct 2024

Categories

  • Main: Machine Learning (cs.LG)
  • Secondary: Computation and Language (cs.CL); Software Engineering (cs.SE)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same paper at different levels of difficulty: the medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)

The high difficulty version is the paper’s original abstract, available on arXiv.

Medium Difficulty Summary (GrooveSquid.com, original content)

This paper tackles the challenge of assessing generated code beyond well-formed properties and aligning that assessment with developer preferences. The researchers propose CodeFavor, a framework for training pairwise code preference models from synthetic evolution data, including code commits and code critiques. They also introduce CodePrefBench, a benchmark of 1,364 rigorously curated code preference tasks covering three verifiable properties — correctness, efficiency, and security — along with human preference. Evaluation shows that CodeFavor improves the accuracy of model-based code preferences by up to 28.8%. Meanwhile, CodeFavor models can match the performance of models with 6-9x more parameters while being 34x more cost-effective. The paper also highlights the prohibitive cost and limitations of human-based code preference: despite annotators spending 23.4 person-minutes on each task, 15.1-40.3% of tasks remain unsolved.

Low Difficulty Summary (GrooveSquid.com, original content)

This research is about teaching computers to understand what makes code good. The authors want to know how humans decide whether code is good or bad, and whether computers can learn to do the same. They created a way for computers to learn from data about how code changes over time, then tested it against real human preferences. The results show that the computer method works better than expected, even with less training data. This matters because computers could help developers write better code faster. Humans, however, have their own limits when judging code: they can spend a lot of time on each task and still not get it right.

Keywords

* Artificial intelligence