
Summary of Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection, by Jiaqi Chen et al.


Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection

by Jiaqi Chen, Xiaoye Zhu, Tianyang Liu, Ying Chen, Xinhui Chen, Yiwen Yuan, Chak Tou Leong, Zuchao Li, Tang Long, Lei Zhang, Chenyu Yan, Guanghao Mei, Jie Zhang, Lefei Zhang

First submitted to arXiv on: 11 Dec 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty | Written by | Summary
High | Paper authors | High Difficulty Summary
Read the original abstract here
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary
The proposed “Imitate Before Detect” (ImBD) approach detects machine-revised text by first imitating the machine-style token distribution and then comparing the distribution of the text under test against it. To do this, the method uses style preference optimization (SPO) to align a scoring LLM with machine-style text, then uses the aligned model to calculate the style-conditional probability curvature (Style-CPC), which quantifies the log probability difference between the original text and conditionally sampled texts. Experimental results show that ImBD outperforms existing state-of-the-art methods, achieving a 13% increase in AUC for detecting text revised by open-source LLMs.
Low | GrooveSquid.com (original content) | Low Difficulty Summary
A team of researchers has developed a new method to detect when a machine has edited or revised text written by humans. This is important because AI-revised text can be very convincing and difficult to distinguish from purely human-written text. The new method, called “Imitate Before Detect”, works by first building a model of machine-style writing and then comparing the text being tested against it. This helps identify subtle differences that may indicate the text has been revised by a machine. In tests, this approach outperformed existing methods at detecting text that had been edited or rewritten by AI.
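
To make the "log probability difference between original and conditionally sampled texts" in the medium summary more concrete, here is a minimal Python sketch of a curvature-style score. It is an illustration under stated assumptions, not the authors' implementation: the function name `curvature_score`, the toy per-position distributions, and the normalization by the sample standard deviation are all hypothetical choices, and in practice the distributions would come from a scoring LLM (aligned with SPO in the paper's setting) rather than being typed in by hand.

```python
import math
import random


def curvature_score(token_ids, probs, n_samples=200, seed=0):
    """Normalized gap between the log-probability of the observed tokens and
    that of tokens resampled from the same per-position distributions.

    token_ids: observed token index at each position.
    probs: probs[t][v] is the (assumed given) scoring-model probability of
           token v at position t, conditioned on the observed prefix.
    """
    rng = random.Random(seed)
    vocab = list(range(len(probs[0])))

    # Log-probability of the text as written.
    observed = sum(math.log(probs[t][tok]) for t, tok in enumerate(token_ids))

    # Log-probabilities of alternatives sampled position by position.
    sampled = []
    for _ in range(n_samples):
        total = 0.0
        for dist in probs:
            alt = rng.choices(vocab, weights=dist, k=1)[0]
            total += math.log(dist[alt])
        sampled.append(total)

    mean = sum(sampled) / n_samples
    var = sum((s - mean) ** 2 for s in sampled) / (n_samples - 1)
    std = math.sqrt(var) or 1.0  # avoid dividing by zero for flat distributions
    return (observed - mean) / std


# Toy distributions: 4 positions, vocabulary of 3 tokens (made-up numbers).
toy_probs = [[0.7, 0.2, 0.1]] * 4
preferred = curvature_score([0, 0, 0, 0], toy_probs)    # model's top choices
unpreferred = curvature_score([2, 2, 1, 2], toy_probs)  # unlikely choices
print(preferred, unpreferred)
```

In this toy setup, text made of the scoring model's preferred tokens scores above zero and text made of unlikely tokens scores below zero. A detector built along these lines would threshold the score, with the threshold chosen on held-out data.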

Keywords

» Artificial intelligence  » AUC  » Optimization  » Probability  » Token