Summary of Aggregated Knowledge Model: Enhancing Domain-Specific QA with Fine-Tuned and Retrieval-Augmented Generation Models, by Fengchen Liu et al.
Aggregated Knowledge Model: Enhancing Domain-Specific QA with Fine-Tuned and Retrieval-Augmented Generation Models
by Fengchen Liu, Jordan Jung, Wei Feinstein, Jeff DAmbrogia, Gary Jung
First submitted to arXiv on: 24 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary The paper's original abstract, available on the paper's arXiv page |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This paper aims to improve closed-domain Question Answering (QA) by comparing two fine-tuned large language models and five retrieval-augmented generation (RAG) models on the Lawrence Berkeley National Laboratory (LBL) Science Information Technology (ScienceIT) domain. The dataset is derived from the ScienceIT documentation and transformed into structured context-question-answer triples using large language models (AWS Bedrock, GCP PaLM2, Meta LLaMA2, OpenAI GPT-4, Google Gemini-Pro). The authors introduce the Aggregated Knowledge Model (AKM), which synthesizes the responses of these seven models using K-means clustering. Evaluation across several metrics shows how effective and suitable each model is for the LBL ScienceIT environment. The results indicate that integrating fine-tuning and retrieval-augmented strategies through the AKM yields significant performance improvements. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper makes a special kind of computer program better at answering questions about science topics. The researchers compared different ways to improve this program’s understanding of science information. They used a large dataset from a science library and tested several approaches, including using very smart computer models. One new idea is to combine the answers from these different models to get the best answer. This study shows that combining these methods can make the program much better at answering questions about specific science topics. |
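
To make the aggregation idea in the medium summary more concrete, here is a minimal sketch of choosing one answer from several model outputs with K-means. It is an illustration under stated assumptions, not the authors' implementation: the bag-of-words embedding, the choice of k=2, the rule "take the largest cluster and return the answer closest to its centroid", and all function names are hypothetical stand-ins, since the summary only says that the AKM synthesizes responses using K-means clustering.

```python
import math
import random
from collections import Counter


def bag_of_words(texts):
    """Toy embedding (an assumption): L2-normalized word-count vectors."""
    vocab = sorted({w for t in texts for w in t.lower().split()})
    vectors = []
    for t in texts:
        counts = Counter(t.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def select_representative(answers, k=2):
    """Hypothetical selection rule: the answer nearest the centroid of the
    largest cluster is treated as the most representative one."""
    vectors = bag_of_words(answers)
    labels, centroids = kmeans(vectors, k)
    biggest = Counter(labels).most_common(1)[0][0]
    candidates = [i for i, lab in enumerate(labels) if lab == biggest]
    best = min(candidates, key=lambda i: sq_dist(vectors[i], centroids[biggest]))
    return answers[best]


if __name__ == "__main__":
    # Made-up candidate answers standing in for outputs of different models.
    candidates = [
        "submit the job with sbatch on the cluster",
        "use sbatch to submit the job on the cluster",
        "submit your job on the cluster with sbatch",
        "contact the help desk by email",
    ]
    print(select_representative(candidates))
```

In practice one would likely swap the word-count vectors for sentence embeddings from a language model; the clustering and selection steps would stay the same in shape.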
Keywords
» Artificial intelligence » Clustering » Fine-tuning » Gemini » GPT » K-means » Question answering » RAG » Retrieval-augmented generation