Summary of Efficient World Models with Context-Aware Tokenization, by Vincent Micheli et al.
Efficient World Models with Context-Aware Tokenization, by Vincent Micheli, Eloi Alonso, François Fleuret. First submitted to arXiv…