Summary of Stealthy Jailbreak Attacks on Large Language Models Via Benign Data Mirroring, by Honglin Mu et al.
Stealthy Jailbreak Attacks on Large Language Models via Benign Data Mirroring by Honglin Mu, Han He,…
Belief in the Machine: Investigating Epistemological Blind Spots of Language Models by Mirac Suzgun, Tayfun Gur,…
Gender Bias in LLM-generated Interview Responses by Haein Kong, Yongsu Ahn, Sangyub Lee, Yunho Maeng. First submitted…
Think Carefully and Check Again! Meta-Generation Unlocking LLMs for Low-Resource Cross-Lingual Summarization by Zhecheng Li, Yiwei…
LocateBench: Evaluating the Locating Ability of Vision Language Models by Ting-Rui Chiang, Joshua Robinson, Xinyan Velocity…
Integrating Large Language Models with Internet of Things Applications by Mingyu Zong, Arvin Hekmati, Michael Guastalla,…
LOGO – Long cOntext aliGnment via efficient preference Optimization by Zecheng Tang, Zechen Sun, Juntao Li,…
Little Giants: Synthesizing High-Quality Embedding Data at Scale by Haonan Chen, Liang Wang, Nan Yang, Yutao…
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch by Yuyang Ding, Xinyu Shi,…
CLR-Bench: Evaluating Large Language Models in College-level Reasoning by Junnan Dong, Zijin Hong, Yuanchen Bei, Feiran…