Summary of ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys, by Yue Niu et al.
ATP: Enabling Fast LLM Serving via Attention on Top Principal Keys by Yue Niu, Saurav Prakash,…
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling by Mahdi Karami, Ali Ghodsi. First submitted to arxiv…
Massive Activations in Large Language Models by Mingjie Sun, Xinlei Chen, J. Zico Kolter, Zhuang Liu. First…
Towards Explainability and Fairness in Swiss Judgement Prediction: Benchmarking on a Multilingual Dataset by Santosh T.Y.S.S,…
Deep Learning Approaches for Improving Question Answering Systems in Hepatocellular Carcinoma Research by Shuning Huo, Yafei…
Evaluating the Performance of ChatGPT for Spam Email Detection by Shijing Si, Yuwei Wu, Le Tang,…
Dual Encoder: Exploiting the Potential of Syntactic and Semantic for Aspect Sentiment Triplet Extraction by Xiaowei…
Improving Language Understanding from Screenshots by Tianyu Gao, Zirui Wang, Adithya Bhaskar, Danqi Chen. First submitted to…
Uncovering Latent Human Wellbeing in Language Model Embeddings by Pedro Freire, ChengCheng Tan, Adam Gleave, Dan…
A Curious Case of Searching for the Correlation between Training Data and Adversarial Robustness of…