DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
by Jordy Van Landeghem, Subhajit Maity, Ayan Banerjee, Matthew Blaschko, Marie-Francine Moens, Josep Lladós, Sanket Biswas
First submitted to arXiv on: 12 Jun 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper explores knowledge distillation (KD) for visually-rich document (VRD) tasks such as document layout analysis (DLA) and document image classification (DIC). The authors design a KD methodology to compress models so they run efficiently on document understanding (DU) tasks within larger pipelines. They compare KD strategies (response-based and feature-based) for distilling knowledge to and from backbones with different architectures (ResNet, ViT, DiT) and capacities (base, small, tiny). The results show that some methods consistently outperform supervised student training, while also highlighting the need to explore efficient model compression for DLA tasks. A minimal illustration of response-based distillation follows this table. |
Low | GrooveSquid.com (original content) | This paper looks at how to make machine learning models better at understanding documents. It's like teaching a student what a teacher knows, but in a way that uses fewer resources. The researchers tried different methods to do this and found that some work really well. They also tested these methods on specific tasks, like recognizing the layout of a document or classifying what type of document an image shows. The results show that smaller, less resource-intensive models can still get good results, which matters for tasks that need to be done quickly and efficiently. |
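To make the response-based strategy mentioned above concrete, here is a minimal PyTorch sketch of the classic temperature-scaled distillation loss: the student is trained to match the teacher's softened output distribution alongside the usual hard-label objective. The helper name `response_kd_loss`, the temperature, and the `alpha` weighting are illustrative assumptions, not the paper's actual configuration; the paper also studies feature-based distillation, which this sketch does not cover.

```python
# Minimal sketch of response-based knowledge distillation (assumed setup,
# not the paper's exact loss or hyperparameters).
import torch
import torch.nn.functional as F

def response_kd_loss(student_logits, teacher_logits, labels,
                     temperature=2.0, alpha=0.5):
    """Blend a soft teacher-matching term with hard-label cross-entropy."""
    # Soft targets: KL divergence between temperature-softened student and
    # teacher distributions. The T^2 factor keeps gradient magnitudes
    # comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard supervised cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits standing in for teacher/student backbones
# (e.g., a DiT-base teacher and a ViT-tiny student classifying 16 document types).
teacher_logits = torch.randn(8, 16)
student_logits = torch.randn(8, 16, requires_grad=True)
labels = torch.randint(0, 16, (8,))
loss = response_kd_loss(student_logits, teacher_logits, labels)
loss.backward()
```

Feature-based variants instead match intermediate representations (e.g., with an MSE term between teacher and student feature maps), which requires access to backbone internals rather than just the output logits.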
Keywords
» Artificial intelligence » Image classification » Knowledge distillation » Machine learning » Model compression » ResNet » Supervised » ViT