Summary of PerSEval: Assessing Personalization in Text Summarizers, by Sourish Dasgupta et al.
PerSEval: Assessing Personalization in Text Summarizers
by Sourish Dasgupta, Ankush Chander, Parth Borad, Isha Motiyani, Tanmoy Chakraborty
First submitted to arXiv on: 29 Jun 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | The paper challenges existing evaluation practice for personalized text summarizers, arguing that accuracy measures such as BLEU, ROUGE, and METEOR are inadequate for evaluating personalization. The recently proposed EGISES metric instead measures a model's degree of responsiveness, but the authors demonstrate, both theoretically and empirically, that responsiveness captures only part of what makes a summary personalized. To address this, they introduce PerSEval, a novel metric that measures the degree of personalization more comprehensively. Benchmarking ten state-of-the-art summarization models on the PENS dataset, they show that PerSEval has high rank-stability and is reliable in terms of human-judgment correlation. |
Low | GrooveSquid.com (original content) | This paper is about how we evaluate AI models that summarize text for different readers. Right now, we mostly judge these models by accuracy, but the authors argue this isn't enough, because accuracy doesn't capture how well a model tailors a summary to a particular person. An earlier metric called EGISES measures how responsive a model is to different readers, but the authors show that responsiveness alone is too simple a picture of personalization. They therefore propose PerSEval, a new way to measure how well these models give each person the summary they actually want. |
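To see why accuracy metrics like ROUGE can miss personalization, consider a minimal sketch (not code from the paper; the example sentences and the simplified unigram ROUGE-1 F1 below are illustrative assumptions): two candidate summaries that emphasize different aspects of an event receive comparable ROUGE scores against the same reference, even though different readers would clearly prefer one over the other.

```python
# Illustrative sketch only: a simplified ROUGE-1 F1 (unigram overlap),
# not the paper's evaluation code or the official rouge-score package.
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical reference and two candidates stressing different facets.
reference = "the team won the championship after a dramatic final"
cand_result = "the team won the championship"    # result-focused reader
cand_drama = "a dramatic final the team played"  # drama-focused reader

print(f"result-focused: {rouge1_f1(cand_result, reference):.3f}")
print(f"drama-focused:  {rouge1_f1(cand_drama, reference):.3f}")
```

Both candidates land in the same ROUGE band, yet the metric is blind to which reader each one serves; this is the gap that responsiveness-style metrics such as EGISES, and ultimately PerSEval, are designed to address.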
Keywords
» Artificial intelligence » Bleu » Rouge » Summarization