Summary of Cliqueformer: Model-based Optimization with Structured Transformers, by Jakub Grudzien Kuba et al.
Cliqueformer: Model-Based Optimization with Structured Transformers, by Jakub Grudzien Kuba, Pieter Abbeel, Sergey Levine. First submitted to…
An Evolved Universal Transformer Memory, by Edoardo Cetin, Qi Sun, Tianyu Zhao, Yujin Tang. First submitted to…
Estimating the Probabilities of Rare Outputs in Language Models, by Gabriel Wu, Jacob Hilton. First submitted to…
Hypothesis Testing the Circuit Hypothesis in LLMs, by Claudia Shi, Nicolas Beltran-Velez, Achille Nazaret, Carolina Zheng,…
AERO: Softmax-Only LLMs for Efficient Private Inference, by Nandan Kumar Jha, Brandon Reagen. First submitted to arXiv…
Context-Scaling versus Task-Scaling in In-Context Learning, by Amirhesam Abedsoltan, Adityanarayanan Radhakrishnan, Jingfeng Wu, Mikhail Belkin. First submitted…
RecurFormer: Not All Transformer Heads Need Self-Attention, by Ruiqing Yan, Linghan Zheng, Xingbo Du, Han Zou,…
Tracking Universal Features Through Fine-Tuning and Model Merging, by Niels Horn, Desmond Elliott. First submitted to arXiv…
ExoTST: Exogenous-Aware Temporal Sequence Transformer for Time Series Prediction, by Kshitij Tayal, Arvind Renganathan, Xiaowei Jia,…
Enhancing LLM Agents for Code Generation with Possibility and Pass-rate Prioritized Experience Replay, by Yuyang Chen,…