Summary of ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos, by Krishanu Maity et al.
ToxVidLM: A Multimodal Framework for Toxicity Detection in Code-Mixed Videos
by Krishanu Maity, A.S. Poornash, Sriparna Saha, Pushpak Bhattacharyya
First submitted to arXiv on: 31 May 2024
Categories
- Main: Artificial Intelligence (cs.AI)
- Secondary: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper introduces a machine learning approach to detecting toxic content in YouTube videos, particularly in low-resource code-mixed languages such as Hindi-English. The researchers develop a Multimodal Multitask framework called ToxVidLM that leverages Language Models (LMs) and comprises three key modules: an Encoder module, a Cross-Modal Synchronization module, and a Multitask module. Beyond detecting toxic content, the framework also analyzes the sentiment and severity of video utterances. Experiments show that incorporating multiple modalities from the videos improves toxicity detection, achieving an Accuracy of 94.29% and a Weighted F1 score of 94.35% (see the architecture sketch below the table). This research contributes to more effective methods for detecting toxic content across diverse online platforms. |
Low | GrooveSquid.com (original content) | This paper helps us understand how computers can detect mean or hurtful speech in YouTube videos. The researchers created a machine learning model called ToxVidLM that looks at multiple parts of a video, such as what people say and how they say it, to predict whether the content is toxic. They tested their model on 931 videos and found that it works very well, correctly identifying almost all of the toxic content. This study can help make online spaces safer for everyone. |
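For readers who want a concrete picture of the three-module design mentioned in the medium summary, here is a minimal PyTorch sketch of how an encoder output, a cross-modal synchronization step, and multitask heads could fit together. All class names, dimensions, the cross-attention alignment, and the concatenation-based fusion are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of the three-module design described above.
# Names, dimensions, and fusion details are assumptions for illustration only.
import torch
import torch.nn as nn


class CrossModalSync(nn.Module):
    """Aligns one modality's features to the text (LM) embedding space
    via a linear projection followed by cross-attention over text features."""

    def __init__(self, src_dim: int, lm_dim: int, num_heads: int = 8):
        super().__init__()
        self.proj = nn.Linear(src_dim, lm_dim)
        self.attn = nn.MultiheadAttention(lm_dim, num_heads, batch_first=True)

    def forward(self, modality_feats: torch.Tensor, text_feats: torch.Tensor) -> torch.Tensor:
        q = self.proj(modality_feats)                 # (B, T_m, lm_dim)
        synced, _ = self.attn(q, text_feats, text_feats)
        return synced                                 # (B, T_m, lm_dim)


class ToxVidLMSketch(nn.Module):
    """Encoder outputs -> Cross-Modal Synchronization -> Multitask heads
    for toxicity, sentiment, and severity classification."""

    def __init__(self, lm_dim: int = 768, audio_dim: int = 512, video_dim: int = 1024,
                 n_tox: int = 2, n_sent: int = 3, n_sev: int = 3):
        super().__init__()
        self.audio_sync = CrossModalSync(audio_dim, lm_dim)
        self.video_sync = CrossModalSync(video_dim, lm_dim)
        fused_dim = lm_dim * 3                        # text + audio + video
        self.toxicity_head = nn.Linear(fused_dim, n_tox)
        self.sentiment_head = nn.Linear(fused_dim, n_sent)
        self.severity_head = nn.Linear(fused_dim, n_sev)

    def forward(self, text_feats, audio_feats, video_feats):
        # text_feats: hidden states from a pretrained LM, (B, T_t, lm_dim);
        # audio/video feats come from their respective encoders.
        a = self.audio_sync(audio_feats, text_feats).mean(dim=1)
        v = self.video_sync(video_feats, text_feats).mean(dim=1)
        t = text_feats.mean(dim=1)
        fused = torch.cat([t, a, v], dim=-1)          # simple concatenation fusion
        return (self.toxicity_head(fused),
                self.sentiment_head(fused),
                self.severity_head(fused))


# Example with dummy features for a batch of 2 utterances:
tox, sent, sev = ToxVidLMSketch()(torch.randn(2, 20, 768),
                                  torch.randn(2, 50, 512),
                                  torch.randn(2, 16, 1024))
```

The multitask heads share one fused representation, which is one common way a single framework can jointly predict toxicity, sentiment, and severity; the paper's actual fusion and training objectives may differ.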
Keywords
» Artificial intelligence » Encoder » F1 score » Machine learning