
Summary of Large Language Models For Document-level Event-argument Data Augmentation For Challenging Role Types, by Joseph Gatto et al.


Large Language Models for Document-Level Event-Argument Data Augmentation for Challenging Role Types

by Joseph Gatto, Parker Seegmiller, Omar Sharif, Sarah M. Preum

First submitted to arXiv on: 5 Mar 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract; read it in the paper's arXiv listing.
Medium Difficulty Summary (written by GrooveSquid.com, original content)
Event Argument Extraction (EAE) is an information extraction problem that remains especially limited in few-shot cross-domain (FSCD) settings. Current approaches to FSCD modeling rely on data augmentation, but existing augmentation methods are poorly suited to real-world EAE contexts, which require modeling long documents (10+ sentences) and roles with zero or only a few training examples. To address this challenge, the authors propose two novel LLM-powered data augmentation frameworks that synthesize extractive document-level EAE samples using zero in-domain training data. Their best-performing methods achieve a 16-point increase in F1 score on the extraction of zero-shot role types.
Low Difficulty Summary (written by GrooveSquid.com, original content)
Imagine trying to work out who and what was involved in the events described in long documents, like news articles or books, without many examples to learn from. This task is called Event Argument Extraction (EAE), and it is very hard to apply to new situations. To solve this problem, machines need better ways to get synthetic training data that helps them learn EAE. The authors developed two new methods that use large language models to generate synthetic documents for training. Their best methods raise the F1 score, a common measure of extraction quality, by 16 points on role types for which the model had no training examples.
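The "16-point increase in F1 score" mentioned in the summaries is a difference between two F1 values on a 0 to 100 scale, not a relative 16% improvement. The sketch below illustrates how such a score could be computed for extracted event arguments. It assumes exact matching of (role, text span) pairs, which is a common convention but not necessarily the scoring scheme used in the paper; the function name `f1_score` and all example arguments are hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple

# An extracted argument is modeled as a (role, text span) pair.
# This representation is an assumption made for illustration.
Argument = Tuple[str, str]


def f1_score(predicted: Set[Argument], gold: Set[Argument]) -> float:
    """Exact-match F1 over (role, span) pairs, on a 0-100 scale."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # Covers empty predictions, empty gold, and no overlap.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical gold annotations for one document.
gold = {
    ("attacker", "the rebels"),
    ("target", "a convoy"),
    ("place", "the northern border"),
    ("instrument", "mortars"),
}

# Hypothetical outputs of a model trained without and with augmented data.
baseline_pred = {("attacker", "the rebels"), ("target", "the army")}
augmented_pred = {
    ("attacker", "the rebels"),
    ("target", "a convoy"),
    ("place", "the northern border"),
}

baseline_f1 = f1_score(baseline_pred, gold)
augmented_f1 = f1_score(augmented_pred, gold)

# A "point" gain is the plain difference between the two scores.
point_gain = augmented_f1 - baseline_f1
print(f"baseline F1: {baseline_f1:.2f}")
print(f"augmented F1: {augmented_f1:.2f}")
print(f"gain in F1 points: {point_gain:.2f}")
```

The numbers produced here come from the made-up example data and say nothing about the paper's results; the point is only that a gain reported in F1 points is an absolute difference between two scores.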

Keywords

  • Artificial intelligence
  • Data augmentation
  • F1 score
  • Few shot
  • Zero shot