Summary of Exploring and Steering the Moral Compass of Large Language Models, by Alejandro Tlaie
Exploring and steering the moral compass of Large Language Models, by Alejandro Tlaie. First submitted to arxiv…
GECKO: Generative Language Model for English, Code and Korean, by Sungwoo Oh, Donggyu Kim. First submitted to…
Benchmarking the Performance of Pre-trained LLMs across Urdu NLP Tasks, by Munief Hassan Tahir, Sana Shams,…
Evaluating Large Language Models with Human Feedback: Establishing a Swedish Benchmark, by Birger Moell. First submitted to…
Your Large Language Models Are Leaving Fingerprints, by Hope McGovern, Rickard Stureborg, Yoshi Suhara, Dimitris Alikaniotis. First…
Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation, by Cheng Niu, Xingguang Wang, Xuxin Cheng,…
An Assessment of Model-On-Model Deception, by Julius Heitkoetter, Michael Gerovitch, Laker Newhouse. First submitted to arxiv on:…
Tiny Refinements Elicit Resilience: Toward Efficient Prefix-Model Against LLM Red-Teaming, by Jiaxu Liu, Xiangyu Yin, Sihao…
IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining, by Dawei Feng, Yihai Zhang, Zhixuan Xu. First…
PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition, by Ziyang Zhang, Qizhen Zhang, Jakob…