Summary of Leveraging Large Language Models in Code Question Answering: Baselines and Issues, by Georgy Andryushchenko et al.


Leveraging Large Language Models in Code Question Answering: Baselines and Issues

by Georgy Andryushchenko, Vladimir Ivanov, Vladimir Makharev, Elizaveta Tukhtina, Aidar Valeev

First submitted to arXiv on: 5 Nov 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract on arXiv.

Medium Difficulty Summary (original content by GrooveSquid.com)
This paper proposes a method for using large language models to answer questions about Python source code. The approach fine-tunes a model on a unified dataset of question-answer pairs prepared with three levels of preprocessing: no grammar correction, grammar correction, and summary generation. The authors evaluate the model with BLEU-4, BERTScore F1, BLEURT, and Exact Match, and supplement these scores with conclusions from a manual error analysis. The study highlights current challenges in the field, notably the poor quality of public datasets, and reports that grammar correction of the training data improves results. The findings can inform other researchers working to improve source code question-answering solutions.
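
To make the evaluation setup concrete, here is a minimal sketch of how answer quality might be scored with two of the metrics named above, Exact Match and BLEU-4. It assumes plain-string predictions and references and uses NLTK's sentence-level BLEU with smoothing; it is an illustration of the metrics, not the paper's actual evaluation code.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

def exact_match(prediction: str, reference: str) -> float:
    # Exact Match: 1.0 only when the stripped strings are identical.
    return float(prediction.strip() == reference.strip())

def bleu4(prediction: str, reference: str) -> float:
    # BLEU-4: uniform weights over 1- to 4-grams; smoothing avoids
    # zero scores on short answers that share no 4-grams.
    smooth = SmoothingFunction().method1
    return sentence_bleu(
        [reference.split()],   # list of tokenized reference answers
        prediction.split(),    # tokenized model answer
        weights=(0.25, 0.25, 0.25, 0.25),
        smoothing_function=smooth,
    )

prediction = "It returns the sum of a and b."
reference = "The function returns the sum of a and b."
print(f"EM={exact_match(prediction, reference):.2f}",
      f"BLEU-4={bleu4(prediction, reference):.4f}")
```

Note how Exact Match rewards only verbatim agreement, while BLEU-4 gives partial credit for n-gram overlap; this is why the paper pairs surface metrics with learned ones like BERTScore and BLEURT.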

Low Difficulty Summary (original content by GrooveSquid.com)
This paper is about using special computer models to help people understand software code better. These models are trained on a big dataset of questions and answers about Python code and then tested to see how well they work. The authors tried different ways of preparing the training data (no grammar correction, grammar correction, and adding summaries) and measured performance with special metrics like BLEU-4 and Exact Match. They found that grammar correction improved the model's performance, but there are still problems with the quality of public datasets in this area. This research can help others working on similar projects build better models.
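
For illustration, the sketch below shows how the three data-preparation variants mentioned in the summaries (raw questions, grammar-corrected questions, and questions augmented with code summaries) might be produced for one training example. The helpers correct_grammar and summarize_code are hypothetical placeholders, not the paper's actual preprocessing tools.

```python
def correct_grammar(question: str) -> str:
    # Hypothetical placeholder: in practice this could call a
    # grammar-correction model; here it returns the text unchanged.
    return question

def summarize_code(code: str) -> str:
    # Hypothetical placeholder: in practice this could call a
    # code-summarization model; here it returns the first line.
    return code.splitlines()[0] if code else ""

def build_variants(example: dict) -> dict:
    # Produce the three preprocessing variants for one QA example.
    raw = example["question"]
    corrected = correct_grammar(raw)
    augmented = f"{corrected}\nCode summary: {summarize_code(example['code'])}"
    return {
        "no_correction": raw,
        "grammar_corrected": corrected,
        "summary_augmented": augmented,
    }

example = {
    "question": "what this function do?",
    "code": "def add(a, b):\n    return a + b",
    "answer": "It returns the sum of a and b.",
}
print(build_variants(example))
```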

Keywords

  » Artificial intelligence  » BLEU  » Fine-tuning  » Question answering