Summary of ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos, by Krishanu Maity et al.
ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos
by Krishanu Maity, A.S. Poornash, Sriparna Saha, Pushpak Bhattacharyya
First submitted to arXiv on: 31 May 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper introduces a machine learning approach to detecting toxic content in YouTube videos, particularly in low-resource code-mixed languages such as Hindi-English. The researchers develop a Multimodal Multitask framework called ToxVidLM that leverages Language Models (LMs) and comprises three key modules: an Encoder module, a Cross-Modal Synchronization module, and a Multitask module. Beyond detecting toxic content, the framework also analyzes the sentiment and severity of video utterances. Experiments show that incorporating multiple modalities from the videos improves toxicity detection, achieving an Accuracy of 94.29% and a Weighted F1 score of 94.35% (see the architecture sketch below the table). This research contributes to more effective methods for detecting toxic content across diverse online platforms. |
Low | GrooveSquid.com (original content) | This paper helps us understand how computers can detect mean or hurtful speech in YouTube videos. The researchers created a machine learning model called ToxVidLM that looks at multiple parts of a video, such as what people say and how they say it, to predict whether the content is toxic. They tested their model on 931 videos and found that it works very well, correctly identifying almost all of the toxic content. This study can help make online spaces safer for everyone. |
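For readers who want a concrete picture of the three-module design mentioned in the medium summary, here is a minimal PyTorch sketch of how an encoder output, a cross-modal synchronization step, and multitask heads could fit together. All class names, dimensions, the cross-attention alignment, and the concatenation-based fusion are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the three-module design described above.
# Names, dimensions, and fusion details are assumptions for illustration only.
import torch
import torch.nn as nn


class CrossModalSync(nn.Module):
    """Aligns one modality's features to the text (LM) embedding space
    via a linear projection followed by cross-attention over text features."""

    def __init__(self, src_dim: int, lm_dim: int, num_heads: int = 8):
        super().__init__()
        self.proj = nn.Linear(src_dim, lm_dim)
        self.attn = nn.MultiheadAttention(lm_dim, num_heads, batch_first=True)

    def forward(self, modality_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        q = self.proj(modality_feats)                 # (B, T_m, lm_dim)
        synced, _ = self.attn(q, text_feats, text_feats)
        return synced                                 # (B, T_m, lm_dim)


class ToxVidLMSketch(nn.Module):
    """Encoder outputs -> Cross-Modal Synchronization -> Multitask heads
    for toxicity, sentiment, and severity classification."""

    def __init__(self, lm_dim: int = 768, audio_dim: int = 512, video_dim: int = 1024,
                 n_tox: int = 2, n_sent: int = 3, n_sev: int = 3):
        super().__init__()
        self.audio_sync = CrossModalSync(audio_dim, lm_dim)
        self.video_sync = CrossModalSync(video_dim, lm_dim)
        fused_dim = lm_dim * 3                        # text + audio + video
        self.toxicity_head = nn.Linear(fused_dim, n_tox)
        self.sentiment_head = nn.Linear(fused_dim, n_sent)
        self.severity_head = nn.Linear(fused_dim, n_sev)

    def forward(self, text_feats, audio_feats, video_feats):
        # text_feats: hidden states from a pretrained LM, (B, T_t, lm_dim);
        # audio/video feats come from their respective encoders.
        a = self.audio_sync(audio_feats, text_feats).mean(dim=1)
        v = self.video_sync(video_feats, text_feats).mean(dim=1)
        t = text_feats.mean(dim=1)
        fused = torch.cat([t, a, v], dim=-1)          # simple concatenation fusion
        return (self.toxicity_head(fused),
                self.sentiment_head(fused),
                self.severity_head(fused))


# Example with dummy features for a batch of 2 utterances:
tox, sent, sev = ToxVidLMSketch()(torch.randn(2, 20, 768),
                                  torch.randn(2, 50, 512),
                                  torch.randn(2, 16, 1024))
```

The multitask heads share one fused representation, which is one common way a single framework can jointly predict toxicity, sentiment, and severity; the paper's actual fusion and training objectives may differ.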
Keywords
» Artificial intelligence » Encoder » F1 score » Machine learning