Summary of Adaptive Inference-Time Compute: LLMs Can Predict If They Can Do Better, Even Mid-Generation, by Rohin Manvi et al.
Adaptive Inference-Time Compute: LLMs Can Predict if They Can Do Better, Even Mid-Generation by Rohin Manvi,…