
Summary of Which LLMs Are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection, by Shantanu Thorat and Tianbao Yang


Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection

by Shantanu Thorat, Tianbao Yang

First submitted to arXiv on: 18 Oct 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The paper’s original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates how well LLM-generated text can be detected across writing domains. The authors train AI-generated text classifiers using LibAUC, a deep learning library for imbalanced datasets. Results show that detection performance varies across writing domains, with scientific writing being particularly challenging. On the Rewritten Ivy Panda (RIP) dataset, which focuses on student essays, texts generated by OpenAI’s LLMs proved difficult to distinguish from human-written texts. The authors explore possible factors contributing to these difficulties.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This study looks at how well computers can tell whether a text was written by a machine or a person. Researchers trained special computer programs to identify texts created by large language models. They found that it’s harder to tell the difference between machine-generated and human-written texts in certain types of writing, like science papers. In student essays, texts generated by OpenAI’s language models were very hard for computers to distinguish from real human writing. The researchers tried to figure out why this might be.

Keywords

  • Artificial intelligence
  • Deep learning
  • Large language model