Summary of Confidence Regulation Neurons in Language Models, by Alessandro Stolfo et al.
Confidence Regulation Neurons in Language Models by Alessandro Stolfo, Ben Wu, Wes Gurnee, Yonatan Belinkov, Xingyi…
PostMark: A Robust Blackbox Watermark for Large Language Models by Yapei Chang, Kalpesh Krishna, Amir Houmansadr,…
Knowledge Distillation in Federated Learning: a Survey on Long Lasting Challenges and New Solutions by Laiqiao…
DALD: Improving Logits-based Detector without Logits from Black-box LLMs by Cong Zeng, Shengkun Tang, Xianjun Yang,…
Spread Preference Annotation: Direct Preference Judgment for Efficient LLM Alignment by Dongyoung Kim, Kimin Lee, Jinwoo…
CondTSF: One-line Plugin of Dataset Condensation for Time Series Forecasting by Jianrong Ding, Zhanyu Liu, Guanjie…
Confidence-Based Task Prediction in Continual Disease Classification Using Probability Distribution by Tanvi Verma, Lukas Schwemer, Mingrui…
MANO: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts by Renchunzi Xie, Ambroise Odonnat,…
Automatically Identifying Local and Global Circuits with Linear Computation Graphs by Xuyang Ge, Fukang Zhu, Wentao…
Exploring Dark Knowledge under Various Teacher Capacities and Addressing Capacity Mismatch by Xin-Chun Li, Wen-Shu Fan,…