Summary of Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection, by Shantanu Thorat and Tianbao Yang
Which LLMs are Difficult to Detect? A Detailed Analysis of Potential Factors Contributing to Difficulties in LLM Text Detection
by Shantanu Thorat, Tianbao Yang
First submitted to arXiv on: 18 Oct 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper at a different level of difficulty. The medium-difficulty and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper’s original abstract (see the arXiv listing) |
Medium | GrooveSquid.com (original content) | This paper investigates how well Large Language Model (LLM)-generated texts can be detected across various domains. The authors train AI-generated text classifiers using LibAUC, a deep learning library for imbalanced datasets. Detection performance varies across writing domains, with scientific writing being particularly challenging. On the Rewritten Ivy Panda (RIP) dataset, which focuses on student essays, texts generated by OpenAI’s LLMs proved difficult to distinguish from human-written texts. The authors explore possible factors contributing to these difficulties. |
Low | GrooveSquid.com (original content) | This study looks at how well computers can tell whether a text was written by a machine or a person. Researchers trained special computer programs to identify texts created by large language models. They found that it’s harder to tell machine-generated and human-written texts apart in certain types of writing, like science papers. In student essays, texts generated by OpenAI’s language models were very hard for computers to distinguish from real human writing. The researchers tried to figure out why this might be. |
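
The medium summary notes that the classifiers were trained with LibAUC, a library aimed at imbalanced datasets. As a rough illustration of why a ranking metric such as AUROC is often preferred over plain accuracy when one class is rare, here is a minimal pure-Python sketch. It is not the authors’ code and does not use LibAUC; the function names and the toy labels and scores are hypothetical, chosen only for illustration.

```python
def auroc(labels, scores):
    """AUROC as the chance that a random positive outscores a random negative.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form,
    fine for small illustrative inputs.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of correct predictions when scores >= threshold are flagged."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Hypothetical toy data: 1 = LLM-generated, 0 = human-written (imbalanced 2:8).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
scores = [0.45, 0.30, 0.40, 0.20, 0.10, 0.10, 0.05, 0.05, 0.02, 0.01]

toy_accuracy = accuracy(labels, scores)  # no score reaches 0.5, so nothing is flagged
toy_auroc = auroc(labels, scores)        # measures ranking quality, threshold-free

print(f"accuracy={toy_accuracy:.4f}  auroc={toy_auroc:.4f}")
```

In this toy case the detector flags nothing at the default threshold, yet accuracy still looks respectable because human-written texts dominate. AUROC instead reports how well the scores rank LLM-generated texts above human-written ones, independent of any threshold.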
Keywords
- Artificial intelligence
- Deep learning
- Large language model