Summary of Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation, by Heegon Jin et al.
Align-to-Distill: Trainable Attention Alignment for Knowledge Distillation in Neural Machine Translation, by Heegon Jin, Seonil Son,…