Summary of Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models, by Taiqiang Wu et al.
Rethinking Kullback-Leibler Divergence in Knowledge Distillation for Large Language Models, by Taiqiang Wu, Chaofan Tao, Jiahao…