Summary of MrT5: Dynamic Token Merging for Efficient Byte-level Language Models, by Julie Kallini et al.
MrT5: Dynamic Token Merging for Efficient Byte-level Language Models, by Julie Kallini, Shikhar Murty, Christopher D.…
Looking Beyond The Top-1: Transformers Determine Top Tokens In Order, by Daria Lioubashevski, Tomer Schlank, Gabriel…
Dynamic Vocabulary Pruning in Early-Exit LLMs, by Jort Vincenti, Karim Abdel Sadek, Joan Velja, Matteo Nulli,…
Probabilistic Language-Image Pre-Training, by Sanghyuk Chun, Wonjae Kim, Song Park, Sangdoo Yun. First submitted to arXiv on:…
Multi-Draft Speculative Sampling: Canonical Architectures and Theoretical Limits, by Ashish Khisti, M. Reza Ebrahimi, Hassan Dbouk, Arash…
AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower…
Future Token Prediction – Causal Language Modelling with Per-Token Semantic State Vector for Multi-Token Prediction, by…
Faster Language Models with Better Multi-Token Prediction Using Tensor Decomposition, by Artem Basharin, Andrei Chertkov, Ivan…
AMUSD: Asynchronous Multi-Device Speculative Decoding for LLM Acceleration, by Bradley McDanel. First submitted to arXiv on: 22…
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling, by Jiahao Qiu, Yifu Lu, Yifan…