


Enhancing LLM-based Hatred and Toxicity Detection with Meta-Toxic Knowledge Graph

by Yibo Zhao, Jiapeng Zhu, Can Xu, Xiang Li

First submitted to arXiv on: 17 Dec 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract, available on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The proposed MetaTox method addresses two key challenges that Large Language Models (LLMs) face in toxicity detection: false negatives caused by a lack of domain-specific toxic knowledge, and false positives caused by excessive sensitivity, where benign speech is flagged as toxic and freedom of speech is unnecessarily restricted. MetaTox tackles both by performing graph search over a meta-toxic knowledge graph to enhance hatred and toxicity detection. The graph is built by extracting toxic information from LLMs through a three-step pipeline, using toxic benchmark datasets as corpora. At detection time, the graph is queried via retrieval and ranking processes to supply accurate, relevant toxic knowledge. Experimental results demonstrate that MetaTox significantly decreases false positives while boosting overall toxicity detection performance. An illustrative code sketch of this retrieve-and-rank step follows these summaries.

Low Difficulty Summary (written by GrooveSquid.com, original content)
MetaTox helps detect toxic online content more accurately. Right now, Large Language Models are used for this task, but they have two big problems. They often miss toxic content because they don't know enough about specific kinds of hate speech, and they can also be overly sensitive, flagging things that aren't actually harmful. MetaTox tries to fix these issues by building a special knowledge graph that teaches LLMs more about different types of hate speech. The graph is built from existing toxicity benchmark datasets. Then, when an LLM needs to decide whether content is toxic, it can ask the graph for help and get more accurate results.

Keywords

» Artificial intelligence  » Boosting  » Knowledge graph