Summary of RKadiyala at SemEval-2024 Task 8: Black-Box Word-Level Text Boundary Detection in Partially Machine Generated Texts, by Ram Mohan Rao Kadiyala


RKadiyala at SemEval-2024 Task 8: Black-Box Word-Level Text Boundary Detection in Partially Machine Generated Texts

by Ram Mohan Rao Kadiyala

First submitted to arXiv on: 22 Oct 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by: Paper authors)
Read the original abstract here

Medium Difficulty Summary (written by: GrooveSquid.com, original content)
The paper addresses the challenge of distinguishing human-written from machine-generated text, particularly at a finer granularity than the whole document, such as the sentence or paragraph level. Existing approaches often perform binary classification (entirely human versus entirely machine) and work well only for specific domains and generators. The authors propose methods for detecting, at the word level, which parts of a text are machine generated, and compare results across several approaches. They evaluate their model on texts from unseen domains and unseen generators and report significant improvements in detection accuracy. The findings also have implications for detecting outputs from Instruct variants of large language models (LLMs). Finally, the authors discuss potential avenues for improvement, emphasizing the importance of reliably attributing generated text.

Low Difficulty Summary (written by: GrooveSquid.com, original content)
The paper is about identifying whether a text was written by a person or a computer. Today, most systems can only tell whether a whole text is human-written or machine-generated. But what if you want to know which part of a text was written by a computer? The authors developed ways to do this at the word level and compared them with other methods. They tested their model on texts from different domains and generators and found that it performed much better than existing systems. This research has important implications for detecting text generated by large language models (LLMs), which is becoming more common.
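
To make the word-level framing in the summaries above more concrete, here is a minimal Python sketch of how per-word labels could be turned into a human/machine boundary. It is an illustration only, not the paper's method: the label convention (0 for human-written, 1 for machine-generated), the assumption that a text starts human-written and switches to machine-generated at most once, and the function names `find_boundary` and `split_at_boundary` are all hypothetical.

```python
from typing import List, Tuple


def find_boundary(labels: List[int]) -> int:
    """Return the index of the first word labeled machine-generated (1).

    Assumed convention (not from the paper): 0 = human-written,
    1 = machine-generated. If no word is labeled 1, the boundary is
    len(labels), meaning the whole text is treated as human-written.
    """
    for index, label in enumerate(labels):
        if label == 1:
            return index
    return len(labels)


def split_at_boundary(text: str, labels: List[int]) -> Tuple[str, str]:
    """Split whitespace-separated text into (human part, machine part)."""
    words = text.split()
    if len(words) != len(labels):
        raise ValueError("expected exactly one label per word")
    boundary = find_boundary(labels)
    return " ".join(words[:boundary]), " ".join(words[boundary:])


# Hypothetical per-word labels, e.g. produced by some word-level classifier.
human_part, machine_part = split_at_boundary(
    "the cat sat down", [0, 0, 1, 1]
)
```

In this sketch the boundary is simply the first word flagged as machine-generated. A real system would need to handle noisy predictions, for example isolated flagged words inside a human-written span.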

Keywords

  • Artificial intelligence
  • Classification
  • Text generation