Summary of GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models, by Haibo Jin et al.
GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models, by Haibo…