Summary of LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models, by David Hoffmann et al.
LLM-Rank: A Graph Theoretical Approach to Pruning Large Language Models
by David Hoffmann, Kailash Budhathoki, Matthaeus…