Summary of CRMSP: A Semi-supervised Approach for Key Information Extraction with Class-Rebalancing and Merged Semantic Pseudo-Labeling, by Qi Zhang et al.

CRMSP: A Semi-supervised Approach for Key Information Extraction with Class-Rebalancing and Merged Semantic Pseudo-Labeling

by Qi Zhang, Yonghong Song, Pengcheng Guo, Yangyang Hui

First submitted to arXiv on: 19 Jul 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary The proposed semi-supervised approach, Class-Rebalancing and Merged Semantic Pseudo-Labeling (CRMSP), addresses two challenges in key information extraction: confidence that is underestimated for tail classes under long-tailed distributions, and the difficulty of achieving intra-class compactness and inter-class separability. CRMSP consists of two modules: Class-Rebalancing Pseudo-Labeling (CRP) and Merged Semantic Pseudo-Labeling (MSP). CRP introduces a reweighting factor to rebalance pseudo-labels, increasing attention to tail classes. MSP clusters unlabeled data by assigning samples to Merged Prototypes (MP), using a new contrastive loss designed specifically for this module. Experimental results on three benchmarks demonstrate state-of-the-art performance, including a 3.24% F1-score improvement over the previous state of the art on the CORD dataset.
Low	GrooveSquid.com (original content)	Low Difficulty Summary CRMSP is a new approach to key information extraction that uses semi-supervised learning to save time and money. This method helps by giving more attention to rare classes and making it easier to tell different classes apart. It works by using two special modules: one that adjusts how important each class is, and another that groups similar data points together. The results show that CRMSP does better than other methods on three well-known datasets.
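
The two ideas in the medium summary above, reweighting pseudo-labels toward rare classes and assigning samples to prototypes, can be illustrated with a small sketch. This is not the paper's formulation: the inverse-frequency weighting, the `power` parameter, the cosine-similarity assignment, and both function names are assumptions chosen only to make the general idea concrete.

```python
from collections import Counter
import math


def class_rebalance_weights(pseudo_labels, num_classes, power=1.0):
    """Hypothetical reweighting factor (an assumption, not the paper's formula).

    Each class c gets a weight proportional to (1 / count_c) ** power, so
    classes that appear rarely among the pseudo-labels receive larger weights.
    Weights are rescaled so that the classes actually observed average to 1.
    """
    counts = Counter(pseudo_labels)
    raw = [
        (1.0 / counts[c]) ** power if counts[c] > 0 else 0.0
        for c in range(num_classes)
    ]
    seen = [w for w in raw if w > 0]
    scale = len(seen) / sum(seen) if seen else 0.0
    return [w * scale for w in raw]


def assign_to_prototype(feature, prototypes):
    """Return the index of the prototype most similar to `feature`.

    Cosine similarity is used here as an illustrative choice of metric.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(y * y for y in b))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    sims = [cosine(feature, p) for p in prototypes]
    return max(range(len(prototypes)), key=lambda i: sims[i])


# Toy example: class 0 is the head class, class 2 is the tail class.
labels = [0, 0, 0, 0, 1, 1, 2]
weights = class_rebalance_weights(labels, num_classes=3)

# Toy example: two prototypes along the coordinate axes.
nearest = assign_to_prototype([1.0, 0.1], [[1.0, 0.0], [0.0, 1.0]])
```

In this toy run the tail class receives the largest weight and the head class the smallest, which mirrors the summary's description of increasing attention to tail classes. How CRMSP actually computes its reweighting factor, builds Merged Prototypes, and defines its contrastive loss is specified in the paper itself.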

Keywords

» Artificial intelligence » Attention » Contrastive loss » F1 score » Semi supervised
