Summary of Improving Pretraining Data Using Perplexity Correlations, by Tristan Thrush et al.
Improving Pretraining Data Using Perplexity Correlations, by Tristan Thrush, Christopher Potts, and Tatsunori Hashimoto. First submitted to arXiv…