Summary of CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data, by Zhao Cheng et al.


CI-Bench: Benchmarking Contextual Integrity of AI Assistants on Synthetic Data

by Zhao Cheng, Diane Wan, Matthew Abueg, Sahra Ghalebikesabi, Ren Yi, Eugene Bagdasarian, Borja Balle, Stefan Mellem, Shawn O’Banion

First submitted to arXiv on: 20 Sep 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: None

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
The high difficulty version is the paper's original abstract; read it on the paper's arXiv page.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
The paper introduces CI-Bench, a comprehensive benchmark for evaluating the ability of AI assistants to protect personal information during model inference. The authors leverage the Contextual Integrity framework to assess information flow across roles, information types, and transmission principles. A novel, scalable data pipeline is presented that generates natural communications, including dialogues and emails, which are used to create 44,000 test samples across eight domains. Additionally, the paper formulates and evaluates a naive AI assistant to demonstrate the need for further study and careful training for personal assistant tasks.
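To make the benchmark's structure more concrete, the sketch below shows what a single contextual-integrity test sample and a naive assistant policy could look like in Python. This is a minimal illustration under assumed names: the fields (domain, sender, recipient, information_type, transmission_principle) mirror the Contextual Integrity parameters described above, but the actual CI-Bench data format, labels, and assistant implementation are not specified in this summary.

# Hypothetical sketch of a contextual-integrity test case in the spirit of CI-Bench.
# Field names and the naive rule below are illustrative assumptions, not the paper's schema.
from dataclasses import dataclass

@dataclass
class CITestSample:
    domain: str                  # e.g. "healthcare", one of several benchmark domains
    sender: str                  # role sending the information
    recipient: str               # role receiving the information
    information_type: str        # category of personal data in the message
    transmission_principle: str  # norm governing the flow, e.g. "with consent"
    context: str                 # synthetic dialogue or email carrying the data
    flow_is_appropriate: bool    # gold label: does the flow respect contextual norms?

def naive_assistant_decision(sample: CITestSample) -> bool:
    """A deliberately naive policy: share unless the data type looks obviously sensitive."""
    sensitive = {"medical_record", "ssn", "financial_account"}
    return sample.information_type not in sensitive

def evaluate(samples: list[CITestSample]) -> float:
    """Fraction of samples where the assistant's decision matches the gold label."""
    if not samples:
        return 0.0
    correct = sum(naive_assistant_decision(s) == s.flow_is_appropriate for s in samples)
    return correct / len(samples)

# Example usage with a single synthetic sample.
sample = CITestSample(
    domain="healthcare",
    sender="patient",
    recipient="insurance_agent",
    information_type="medical_record",
    transmission_principle="without consent",
    context="Email thread in which the assistant is asked to forward test results.",
    flow_is_appropriate=False,
)
print(evaluate([sample]))  # 1.0 here: the naive rule refuses, matching the gold label

A naive rule like this ignores the transmission principle entirely, which is exactly the kind of shortcoming a contextual-integrity benchmark is designed to expose.
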
Low Difficulty Summary (written by GrooveSquid.com, original content)
AI assistants have the potential to perform diverse tasks on behalf of users, but they may also share personal data, raising significant privacy challenges. To address this issue, the researchers introduce CI-Bench, a new benchmark that helps evaluate AI assistants' ability to protect user information. The authors create a large dataset of natural communications and test how well an AI assistant keeps user data private.

Keywords

  • Artificial intelligence
  • Inference