Summary of VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models, by Ming Cheng et al.
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models
by Ming Cheng, Jiaying Gong, Chenhan Yuan, William A. Ingram, Edward Fox, Hoda Eldardiry
First submitted to arXiv on: 7 Nov 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Digital Libraries (cs.DL); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | This paper introduces a novel dataset, VTechAGP, consisting of document-level academic-to-general-audience abstract pairs from 8 colleges, collected over 25 years. The authors also propose a dynamic soft prompt generative language model (DSPT5) for the text paraphrase task. DSPT5 is trained with a contrastive-generative loss function and uses a crowd-sampling decoding strategy at inference time. Evaluated against state-of-the-art large language models, the lightweight DSPT5 achieves competitive results. The paper thus provides both a benchmark dataset and baseline solutions for academic-to-general-audience text paraphrasing. |
| Low | GrooveSquid.com (original content) | This paper makes it possible to turn complicated university texts into simpler language that anyone can understand. To do this, the authors created a special dataset of documents written by experts at 8 colleges over 25 years. They also developed a new way of prompting a language model, where the prompts themselves are learned and improve during training. The authors tested their approach against other popular language models. Surprisingly, their lightweight model did just as well as the more complex ones! This could help make academic texts more accessible to everyone. |
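The "dynamic soft prompt" idea mentioned above can be illustrated with a minimal sketch: instead of hand-written text prompts, a small set of trainable embedding vectors is prepended to the embedded input sequence and optimized during training while the rest of the model can stay frozen. The names, sizes, and NumPy setting below are illustrative assumptions, not the paper's actual DSPT5 implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, d_model = 100, 16
n_prompt_tokens = 5  # number of learnable soft-prompt vectors (hypothetical)

# Frozen token embeddings of the base language model (stand-in values).
token_embedding = rng.normal(size=(vocab_size, d_model))
# Trainable soft-prompt parameters, updated by gradient descent in practice.
soft_prompt = rng.normal(size=(n_prompt_tokens, d_model))

def embed_with_prompt(token_ids):
    """Prepend the soft prompt to the embedded input sequence."""
    x = token_embedding[token_ids]                       # (seq_len, d_model)
    return np.concatenate([soft_prompt, x], axis=0)      # (n_prompt + seq_len, d_model)

seq = embed_with_prompt(np.array([3, 7, 42]))
print(seq.shape)  # (8, 16)
```

During training, only `soft_prompt` (and, in a "dynamic" variant, a small network producing it) needs gradient updates, which is what keeps such models lightweight compared to fully fine-tuned large language models.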
Keywords
» Artificial intelligence » Inference » Language model » Loss function » Prompt