Summary of Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning, by Jifan Zhang et al.
Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning by Jifan Zhang, Lalit…
AI Sandbagging: Language Models can Strategically Underperform on Evaluations by Teun van der Weij, Felix Hofstätter,…
Decision-Making Behavior Evaluation Framework for LLMs under Uncertain Context by Jingru Jia, Zehua Yuan, Junhao Pan,…
Alice in Wonderland: Simple Tasks Showing Complete Reasoning Breakdown in State-Of-the-Art Large Language Models by Marianna…
MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series by Ge Zhang, Scott Qu, Jiaheng…
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response by Tianrong Zhang,…
Topic Classification of Case Law Using a Large Language Model and a New Taxonomy for…
A Multi-Faceted Evaluation Framework for Assessing Synthetic Data Generated by Large Language Models by Yefeng Yuan,…
OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data by Chandeepa Dissanayake, Lahiru…
Assessing Economic Viability: A Comparative Analysis of Total Cost of Ownership for Domain-Adapted Large Language…