Summary of Lightweight Spatial Modeling For Combinatorial Information Extraction From Documents, by Yanfei Dong et al.


Lightweight Spatial Modeling for Combinatorial Information Extraction From Documents

by Yanfei Dong, Lambert Deng, Jiazheng Zhang, Xiaodong Yu, Ting Lin, Francesco Gelli, Soujanya Poria, Wee Sun Lee

First submitted to arXiv on: 8 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
This paper proposes KNN-former, an approach to document entity classification that leverages the spatial structure of documents. The model builds a K-nearest-neighbor (KNN) graph over the document's entities and uses it in the attention mechanism, so that each entity attends only to the local neighborhood the graph defines. The model also uses combinatorial matching to address the one-to-one mapping property found in many documents, where one field has only one corresponding entity. In addition, KNN-former is highly parameter-efficient compared to existing approaches. Experimental results across various datasets show that the method outperforms baselines for most entity types. The paper also releases a new ID document dataset covering diverse templates and languages, as well as enhanced annotations for an existing dataset.
Low GrooveSquid.com (original content) Low Difficulty Summary
This research paper presents a new way to classify the individual pieces of information (called entities) inside a document, such as the fields on an ID card. It’s called KNN-former, and it helps computers understand complex documents with many different parts by having each part pay attention only to the parts nearest to it on the page. The method is designed to work better when there are one-to-one relationships, meaning each field in the document matches exactly one part. The researchers tested their approach on several datasets and found that it worked well compared to other methods. They also released a new dataset of ID documents and added more information to an existing dataset to help future research in this area.
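
To make the idea of KNN-limited attention more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names (`knn_mask`, `masked_attention_weights`), the use of box-center coordinates, the Euclidean distance, and the choice to let every entity attend to itself are all assumptions made for illustration. The paper's actual model may combine the KNN graph with attention in a different way.

```python
import math


def knn_mask(centers, k):
    """Build a boolean mask where mask[i][j] is True if entity j is
    entity i itself or one of its k nearest neighbors (hypothetical helper).

    centers: list of (x, y) positions, assumed to be entity box centers.
    """
    n = len(centers)
    mask = [[False] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(centers):
        # Sort the other entities by distance; ties are broken by index.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (math.hypot(centers[j][0] - xi, centers[j][1] - yi), j),
        )
        mask[i][i] = True  # assumption: an entity always attends to itself
        for j in others[:k]:
            mask[i][j] = True
    return mask


def masked_attention_weights(scores, mask):
    """Row-wise softmax over raw attention scores, where entities outside
    the KNN neighborhood receive exactly zero weight."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        peak = max(allowed)  # subtract the max for numerical stability
        exps = [
            math.exp(s - peak) if m else 0.0
            for s, m in zip(row_scores, row_mask)
        ]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Five made-up entities: three in one corner of the page, two far away.
centers = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
mask = knn_mask(centers, k=1)

# Equal raw scores everywhere, so only the mask shapes the result.
scores = [[0.0] * len(centers) for _ in centers]
weights = masked_attention_weights(scores, mask)
```

With `k=1`, each entity's attention is split between itself and its single nearest neighbor, and entities on the far side of the page get no weight at all. A larger `k` widens the local neighborhood each entity can see.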

Keywords

» Artificial intelligence  » Attention  » Classification  » Nearest neighbor  » Parameter efficient