Summary of Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models, by Carson Denison et al.
Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models by Carson Denison, Monte MacDiarmid, Fazl Barez,…
OSPC: Detecting Harmful Memes with Large Language Model as a Catalyst by Jingtao Cao, Zheng Zhang,…
An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants by G P Shrivatsa Bhargav,…
Towards Reliable Detection of LLM-Generated Texts: A Comprehensive Evaluation Framework with CUDRT by Zhen Tao, Yanfang…
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL by Zijin Hong, Zheng Yuan, Qinggang Zhang, Hao…
Reversing the Forget-Retain Objectives: An Efficient LLM Unlearning Framework from Logit Difference by Jiabao Ji, Yujian…
Let’s Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation by Se Jin Park, Chae Won…
Multimodal Table Understanding by Mingyu Zheng, Xinwei Feng, Qingyi Si, Qiaoqiao She, Zheng Lin, Wenbin Jiang,…
MobileAgentBench: An Efficient and User-Friendly Benchmark for Mobile LLM Agents by Luyuan Wang, Yongyu Deng, Yiwei…
Beyond Bare Queries: Open-Vocabulary Object Grounding with 3D Scene Graph by Sergey Linok, Tatiana Zemskova, Svetlana…