Summary of KInIT at SemEval-2024 Task 8: Fine-tuned LLMs for Multilingual Machine-Generated Text Detection, by Michal Spiegel and Dominik Macko
KInIT at SemEval-2024 Task 8: Fine-tuned LLMs for Multilingual Machine-Generated Text Detection
by Michal Spiegel, Dominik Macko
First submitted to arXiv on: 21 Feb 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | The paper's original abstract, available on arXiv. |
| Medium | GrooveSquid.com (original content) | In this paper, researchers tackle SemEval-2024 Task 8, which targets multigenerator, multidomain, and multilingual black-box machine-generated text detection. Detection of this kind matters for preventing misuse of large language models (LLMs) that can generate human-like text in multiple languages. The authors use language identification and parameter-efficient fine-tuning of smaller LLMs for text classification, then combine the fine-tuned models' predictions with statistical detection metrics through per-language classification-threshold calibration to improve generalization. Their submitted method achieves competitive results, ranking fourth, a narrow margin behind the winner. |
| Low | GrooveSquid.com (original content) | This paper is about detecting text written by large language models, which can write in many languages. The authors want to help stop these models from being misused. They fine-tune smaller language models and combine them with statistical measures to decide whether a text was written by a person or a machine. The method worked well, coming in fourth place in the competition. |
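
To make the idea of per-language classification-threshold calibration more concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the weighted-mean combination of a classifier probability with a statistical score, the accuracy criterion, and the default threshold of 0.5 are all assumptions chosen for illustration. The sketch picks, for each language, the decision threshold that works best on labeled validation data, then uses that threshold at prediction time.

```python
from collections import defaultdict


def combine_scores(classifier_prob, statistical_score, weight=0.5):
    """Hypothetical combination: weighted mean of two scores in [0, 1]."""
    return weight * classifier_prob + (1.0 - weight) * statistical_score


def best_threshold(scores, labels):
    """Return the threshold that maximizes accuracy on the given data.

    A text is predicted machine-generated (label 1) when score >= threshold.
    Ties are resolved in favor of the lowest candidate threshold.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_t, best_acc = 0.5, -1.0
    for t in candidates:
        correct = sum((s >= t) == bool(y) for s, y in zip(scores, labels))
        acc = correct / len(scores)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t


def calibrate_per_language(examples):
    """Compute one threshold per language.

    examples: iterable of (language, score, label) validation triples.
    """
    by_lang = defaultdict(lambda: ([], []))
    for lang, score, label in examples:
        by_lang[lang][0].append(score)
        by_lang[lang][1].append(label)
    return {lang: best_threshold(s, y) for lang, (s, y) in by_lang.items()}


def predict(language, score, thresholds, default_threshold=0.5):
    """Classify a score using the language's threshold, or a default
    threshold for languages not seen during calibration."""
    return int(score >= thresholds.get(language, default_threshold))


# Toy validation data (made up for illustration only).
validation = [
    ("en", 0.9, 1), ("en", 0.7, 1), ("en", 0.6, 0), ("en", 0.2, 0),
    ("de", 0.4, 1), ("de", 0.35, 1), ("de", 0.1, 0), ("de", 0.2, 0),
]
thresholds = calibrate_per_language(validation)
```

With this toy data, the same score of 0.38 would be labeled machine-generated for German but human-written for English, which is the kind of per-language adjustment that calibration is meant to capture.
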
Keywords
» Artificial intelligence » Classification » Fine tuning » Generalization » Parameter efficient » Text classification