Summary of Noise-Aware Training of Layout-Aware Language Models, by Ritesh Sarkhel et al.
Noise-Aware Training of Layout-Aware Language Models
by Ritesh Sarkhel, Xiaoqi Ren, Lauro Beltrao Costa, Guolong Su, Vincent Perot, Yanan Xie, Emmanouil Koukoumidis, Arnab Nandi
First submitted to arXiv on: 30 Mar 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, written at different levels of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
| Summary difficulty | Written by | Summary |
|---|---|---|
| High | Paper authors | Read the original abstract here |
| Medium | GrooveSquid.com (original content) | The proposed Noise-Aware Training (NAT) method addresses the challenge of training custom extractors for thousands of different document types in a scalable way. Existing approaches require expensive human-labeled instances, and acquiring them can surpass the maximum training time allocated for the extractor. NAT instead trains an extractor on weakly labeled documents, estimating the confidence of each training sample and using it as an uncertainty measure during training. This method outperforms transfer-learning baselines by up to 6% in macro-F1 score and reduces the human effort required to reach comparable performance by up to 73%. |
| Low | GrooveSquid.com (original content) | A new way to teach computers to understand documents is proposed. Instead of using lots of expensive labeled examples, this method uses weakly labeled ones to train a computer program to extract important information. This makes training much faster and cheaper. The program can then quickly learn from many different types of documents. |
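The summary does not spell out NAT's exact training objective, but the core idea it describes (using per-sample confidence as an uncertainty measure when training on weakly labeled data) can be sketched as a confidence-weighted loss. All names, the batch format, and the weighting scheme below are illustrative assumptions, not details from the paper:

```python
import math

def cross_entropy(probs, label):
    """Negative log-likelihood of the (weak) label under the model's predicted probabilities."""
    return -math.log(probs[label])

def confidence_weighted_loss(batch):
    """Average loss over a batch, scaling each weakly labeled sample by an
    estimated confidence in its label (1.0 = trust fully, near 0.0 = mostly ignore).
    """
    total = sum(conf * cross_entropy(probs, label)
                for probs, label, conf in batch)
    return total / len(batch)

# Hypothetical batch: (predicted class probabilities, weak label, estimated confidence)
batch = [
    ([0.7, 0.2, 0.1], 0, 0.9),   # high-confidence weak label contributes fully
    ([0.1, 0.6, 0.3], 1, 0.4),   # noisier weak label is down-weighted
]
loss = confidence_weighted_loss(batch)
```

The intuition is that noisy labels still carry signal, so rather than discarding them, the gradient contribution of each sample is scaled by how much the label is trusted.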
Keywords
* Artificial intelligence * F1 score * Transfer learning