Summary of Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation, by Nachiket Kotalwar et al.

Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation

by Nachiket Kotalwar, Alkis Gotovos, Adish Singla

First submitted to arXiv on: 7 Jun 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	Read the original abstract here
Medium	GrooveSquid.com (original content)	Generative AI and large language models have the potential to revolutionize programming education by providing individualized feedback and hints for learners. Recent work has focused on improving the quality of generated feedback; this paper instead benchmarks language models for programming feedback generation across several performance criteria: quality, cost, time, and data privacy. To that end, the authors use in-browser inference engines, which run models directly in the browser, reducing costs and enhancing data privacy. They develop a fine-tuning pipeline that uses GPT-4-generated synthetic data to improve the feedback quality of small models compatible with these engines, and they show the effectiveness of fine-tuned, 4-bit quantized Llama3-8B and Phi3-3.8B models on three Python programming datasets using WebLLM’s in-browser inference engine. The authors plan to release a web app and datasets alongside the paper to facilitate further research.
Low	GrooveSquid.com (original content)	This paper is about using artificial intelligence (AI) to help people learn how to program computers. AI can already generate feedback and hints for learners, and most earlier papers have focused on making that feedback better. This paper looks at more than quality: it also considers cost, time, and privacy. To do this, the authors run the AI models directly in the web browser, which saves money and keeps personal data private. They fine-tuned small AI models on synthetic (artificially generated) data produced by GPT-4, then tested them on three Python programming datasets. The results show that the fine-tuned models are effective at generating feedback. The authors will share a web app and datasets so others can build on the work.
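
To make the role of 4-bit quantization concrete, here is a rough back-of-the-envelope sketch of how much memory the weights of the two models named above would need. It is an illustration, not a figure from the paper: the parameter counts are the nominal ones implied by the model names (8B and 3.8B), the helper `weight_memory_gb` is a hypothetical name, and the estimate ignores quantization metadata, activations, and the KV cache, so real in-browser memory use will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (10**9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts read off the model names (an assumption, not exact).
models = {"Llama3-8B": 8e9, "Phi3-3.8B": 3.8e9}

for name, num_params in models.items():
    fp16 = weight_memory_gb(num_params, 16)  # common unquantized precision
    q4 = weight_memory_gb(num_params, 4)     # 4-bit quantized weights
    print(f"{name}: ~{fp16:.1f} GB at 16-bit, ~{q4:.1f} GB at 4-bit")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts the weight footprint by a factor of four, which helps explain why small quantized models are the ones paired with in-browser inference engines.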

Keywords

» Artificial intelligence » Fine-tuning » GPT » Inference » Synthetic data
