
Summary of "Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance", by Rachith Aiyappa et al.


Benchmarking zero-shot stance detection with FlanT5-XXL: Insights from training data, prompting, and decoding strategies into its near-SoTA performance

by Rachith Aiyappa, Shruthi Senthilmani, Jisun An, Haewoon Kwak, Yong-Yeol Ahn

First submitted to arXiv on: 1 Mar 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
This version is the paper's original abstract.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper investigates the performance of large language model (LLM)-based zero-shot stance detection on tweets using FlanT5-XXL. The authors study how different prompts, decoding strategies, and potential biases affect the model's performance on three datasets: SemEval 2016 Task 6A, SemEval 2016 Task 6B, and P-Stance. The results show that the zero-shot approach can match or outperform state-of-the-art benchmarks, including fine-tuned models. The authors also provide insights into the model's sensitivity to instructions, decoding strategies, the perplexity of prompts, and negations and oppositions present in prompts.
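To make the setup concrete, here is a minimal sketch of how zero-shot stance detection with an instruction-tuned model such as FlanT5-XXL can be run through the Hugging Face transformers library. The prompt wording, the favor/against/none label set, and greedy decoding used here are illustrative assumptions for this sketch; the paper's actual prompts and decoding strategies may differ and are exactly what it compares.

```python
# Minimal sketch of zero-shot stance detection with an instruction-tuned
# encoder-decoder model. The prompt template, label set, and greedy decoding
# are illustrative assumptions, not the paper's exact configuration.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_NAME = "google/flan-t5-xxl"  # a smaller checkpoint (e.g. flan-t5-base) is easier to test locally

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def detect_stance(tweet: str, target: str) -> str:
    # Zero-shot: the task is described entirely in the prompt; no fine-tuning.
    prompt = (
        f"Tweet: {tweet}\n"
        f"Question: What is the stance of the tweet toward {target}? "
        "Answer with favor, against, or none.\n"
        "Answer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    # Greedy decoding; the paper also examines alternative decoding strategies.
    output_ids = model.generate(**inputs, max_new_tokens=5)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()

print(detect_stance("Climate change is the biggest threat we face.", "climate action"))
```

Because the model's answer depends on how the instruction is phrased and how the output is decoded, small changes to this template or to the generation settings can shift the predicted label, which is the kind of sensitivity the paper quantifies.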
Low Difficulty Summary (written by GrooveSquid.com, original content)
The paper looks at how well a special kind of AI model can figure out what people think about something by reading tweets, without being taught beforehand. The authors used a really powerful language model called FlanT5-XXL to see if it could do this job as well as or better than other models that were trained specifically for this task. The results showed that the zero-shot approach, where the AI isn't trained on any task-specific data, can be just as good or even better than the trained models. This paper helps us understand how these AI models work and what makes them so good at understanding people's opinions.

Keywords

* Artificial intelligence
* Language model
* Large language model
* Perplexity
* Zero shot