Summary of Learning Large Softmax Mixtures with Warm Start EM, by Xin Bing, Florentina Bunea, Jonathan Niles-Weed, and Marten Wegkamp
Learning large softmax mixtures with warm start EM by Xin Bing, Florentina Bunea, Jonathan Niles-Weed, Marten…
LoCa: Logit Calibration for Knowledge Distillation by Runming Yang, Taiqiang Wu, Yujiu Yang. First submitted to arXiv…
Logit Scaling for Out-of-Distribution Detection by Andrija Djurisic, Rosanne Liu, Mladen Nikolic. First submitted to arXiv on:…
Implicit Geometry of Next-token Prediction: From Language Sparsity Patterns to Model Representations by Yize Zhao, Tina…