DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
by Jordy Van Landeghem, Subhajit Maity, Ayan Banerjee, Matthew Blaschko, Marie-Francine Moens, Josep Lladós, Sanket Biswas
First submitted to arXiv on: 12 Jun 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | Read the original abstract here |
Medium | GrooveSquid.com (original content) | This paper explores knowledge distillation (KD) for visually-rich document (VRD) tasks such as document layout analysis (DLA) and document image classification (DIC). The authors design a KD methodology to compress models so they run efficiently on document understanding (DU) tasks within larger pipelines. They compare KD strategies (response-based and feature-based) for distilling knowledge to and from backbones with different architectures (ResNet, ViT, DiT) and capacities (base, small, tiny). The results show that some methods consistently outperform supervised student training, while also highlighting the need to explore efficient model compression for DLA tasks. A minimal illustration of response-based distillation follows this table. |
Low | GrooveSquid.com (original content) | This paper looks at how to make machine learning models better at understanding documents. It's like teaching a student what a teacher knows, but in a way that uses fewer resources. The researchers tried different methods to do this and found that some work really well. They also tested these methods on specific tasks, like recognizing the layout of a document or classifying what type of document an image shows. The results show that smaller, less resource-intensive models can still get good results, which matters for tasks that need to be done quickly and efficiently. |
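To make the response-based strategy mentioned above concrete, here is a minimal PyTorch sketch of the classic temperature-scaled distillation loss: the student is trained to match the teacher's softened output distribution alongside the usual hard-label objective. The helper name `response_kd_loss`, the temperature, and the `alpha` weighting are illustrative assumptions, not the paper's actual configuration; the paper also studies feature-based distillation, which this sketch does not cover.

```python
# Minimal sketch of response-based knowledge distillation (assumed setup,
# not the paper's exact loss or hyperparameters).
import torch
import torch.nn.functional as F

def response_kd_loss(student_logits, teacher_logits, labels,
                     temperature=2.0, alpha=0.5):
    """Blend a soft teacher-matching term with hard-label cross-entropy."""
    # Soft targets: KL divergence between temperature-softened student and
    # teacher distributions. The T^2 factor keeps gradient magnitudes
    # comparable across temperatures.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard targets: standard supervised cross-entropy on ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Toy usage with random logits standing in for teacher/student backbones
# (e.g., a DiT-base teacher and a ViT-tiny student classifying 16 document types).
teacher_logits = torch.randn(8, 16)
student_logits = torch.randn(8, 16, requires_grad=True)
labels = torch.randint(0, 16, (8,))
loss = response_kd_loss(student_logits, teacher_logits, labels)
loss.backward()
```

Feature-based variants instead match intermediate representations (e.g., with an MSE term between teacher and student feature maps), which requires access to backbone internals rather than just the output logits.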
Keywords
» Artificial intelligence » Image classification » Knowledge distillation » Machine learning » Model compression » ResNet » Supervised » ViT