Summary of HashAttention: Semantic Sparsity for Faster Inference, by Aditya Desai et al.
HashAttention: Semantic Sparsity for Faster Inference, by Aditya Desai, Shuo Yang, Alejandro Cuadron, Ana Klimovic, Matei…
FaultExplainer: Leveraging Large Language Models for Interpretable Fault Detection and Diagnosis, by Abdullah Khan, Rahul Nahar,…
Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN, by Pengxiang Li, Lu…
Lightweight Safety Classification Using Pruned Language Models, by Mason Sawtell, Tula Masterman, Sandi Besen, Jim Brown. First…
Towards LLM-based optimization compilers. Can LLMs learn how to apply a single peephole optimization? Reasoning…
The Open Source Advantage in Large Language Models (LLMs), by Jiya Manchanda, Laura Boettcher, Matheus Westphalen,…
Frontier AI systems have surpassed the self-replicating red line, by Xudong Pan, Jiarun Dai, Yihe Fan,…
No More Adam: Learning Rate Scaling at Initialization is All You Need, by Minghao Xu, Lichuan…
SciFaultyQA: Benchmarking LLMs on Faulty Science Question Detection with a GAN-Inspired Approach to Synthetic Dataset…
SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models, by Jiale Cheng, Xiao…