Summary of Coupling Speech Encoders with Downstream Text Models, by Ciprian Chelba and Johan Schalkwyk
Coupling Speech Encoders with Downstream Text Models, by Ciprian Chelba, Johan Schalkwyk. First submitted to arxiv on:…
Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?, by Jonathan Hayase, Alisa…
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads, by Hanlin Tang, Yang Lin, Jing Lin, Qingsen…
Long Input Sequence Network for Long Time Series Forecasting, by Chao Ma, Yikai Hou, Xiang Li,…
Fundamental Limits of Prompt Compression: A Rate-Distortion Framework for Black-Box Language Models, by Alliot Nagle, Adway…
When Can Transformers Count to n?, by Gilad Yehudai, Haim Kaplan, Asma Ghandeharioun, Mor Geva, Amir…
Efficient Visual Transformer by Learnable Token Merging, by Yancheng Wang, Yingzhen Yang. First submitted to arxiv on:…
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference, by Qichen Fu, Minsik Cho, Thomas…
Identifying the Source of Generation for Large Language Models, by Bumjin Park, Jaesik Choi. First submitted to…
Patch-Level Training for Large Language Models, by Chenze Shao, Fandong Meng, Jie Zhou. First submitted to arxiv…