Summary of Transformer Normalisation Layers and the Independence of Semantic Subspaces, by Stephen Menary et al.
Transformer Normalisation Layers and the Independence of Semantic Subspaces, by Stephen Menary, Samuel Kaski, Andre Freitas. First…
CaLMQA: Exploring culturally specific long-form question answering across 23 languages, by Shane Arora, Marzena Karpinska, Hung-Ting…
Understanding and Mitigating Tokenization Bias in Language Models, by Buu Phan, Marton Havasi, Matthew Muckley, Karen…
From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models, by Sean Welleck, Amanda Bertsch, Matthew…
Token-based Decision Criteria Are Suboptimal in In-context Learning, by Hakaze Cho, Yoshihiro Sakai, Mariko Kato, Kenshiro…
Confidence Regulation Neurons in Language Models, by Alessandro Stolfo, Ben Wu, Wes Gurnee, Yonatan Belinkov, Xingyi…
ReCaLL: Membership Inference via Relative Conditional Log-Likelihoods, by Roy Xie, Junlin Wang, Ruomin Huang, Minxing Zhang,…
Semantic Entropy Probes: Robust and Cheap Hallucination Detection in LLMs, by Jannik Kossen, Jiatong Han, Muhammed…
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention, by Qianchao Zhu,…
Multi-View Empowered Structural Graph Wordification for Language Models, by Zipeng Liu, Likang Wu, Ming He, Zhong…