Summary of TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance, by Haorui Wang (1) et al.
TPD: Enhancing Student Language Model Reasoning via Principle Discovery and Guidance, by Haorui Wang, Rongzhi Zhang,…
DualTeacher: Bridging Coexistence of Unlabelled Classes for Semi-supervised Incremental Object Detection, by Ziqi Yuan, Liyuan Wang,…
On Explaining Knowledge Distillation: Measuring and Visualising the Knowledge Transfer Process, by Gereziher Adhane, Mohammad Mahdi…
A Theoretical Analysis of Soft-Label vs Hard-Label Training in Neural Networks, by Saptarshi Mandal, Xiaojun Lin,…
Reverse Thinking Makes LLMs Stronger Reasoners, by Justin Chih-Yao Chen, Zifeng Wang, Hamid Palangi, Rujun Han,…
Pre-Training Graph Contrastive Masked Autoencoders are Strong Distillers for EEG, by Xinxu Wei, Kanhao Zhao, Yong…
Adaptive Group Robust Ensemble Knowledge Distillation, by Patrik Kenfack, Ulrich Aïvodji, Samira Ebrahimi Kahou. First submitted to…
Quantifying Knowledge Distillation Using Partial Information Decomposition, by Pasan Dissanayake, Faisal Hamman, Barproda Halder, Ilia Sucholutsky,…
Multi-Level Feature Distillation of Joint Teachers Trained on Distinct Image Datasets, by Adrian Iordache, Bogdan Alexe,…
Knowledge Distillation Using Frontier Open-source LLMs: Generalizability and the Role of Synthetic Data, by Anup Shirgaonkar,…