Summary of ThinK: Thinner Key Cache by Query-Driven Pruning, by Yuhui Xu et al.
ThinK: Thinner Key Cache by Query-Driven Pruning by Yuhui Xu, Zhanming Jie, Hanze Dong, Lei Wang,…
Evaluating Large Language Models for automatic analysis of teacher simulations by David de-Fitero-Dominguez, Mariano Albaladejo-González, Antonio…
Meta-Rewarding Language Models: Self-Improving Alignment with LLM-as-a-Meta-Judge by Tianhao Wu, Weizhe Yuan, Olga Golovneva, Jing Xu,…
Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic by Fakhraddin Alwajih, Gagan Bhatia, Muhammad Abdul-Mageed. First…
FLRT: Fluent Student-Teacher Redteaming by T. Ben Thompson, Michael Sklar. First submitted to arxiv on: 24 Jul…
Odyssey: Empowering Minecraft Agents with Open-World Skills by Shunyu Liu, Yaoru Li, Kongcheng Zhang, Zhenyu Cui,…
Step-by-Step Reasoning to Solve Grid Puzzles: Where do LLMs Falter? by Nemika Tyagi, Mihir Parmar, Mohith…
GoldFinch: High Performance RWKV/Transformer Hybrid with Linear Pre-Fill and Extreme KV-Cache Compression by Daniel Goldstein, Fares…
The Two Sides of the Coin: Hallucination Generation and Detection with LLMs as Evaluators for…
MUSCLE: A Model Update Strategy for Compatible LLM Evolution by Jessica Echterhoff, Fartash Faghri, Raviteja Vemulapalli,…