Summary of Defining and Evaluating Physical Safety For Large Language Models, by Yung-chen Tang et al.

Defining and Evaluating Physical Safety for Large Language Models

by Yung-Chen Tang, Pin-Yu Chen, Tsung-Yi Ho

First submitted to arxiv on: 4 Nov 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary Our study examines the physical safety risks of Large Language Models (LLMs) controlling robotic systems like drones. We develop a comprehensive benchmark to evaluate LLM physical safety, categorizing threats into human-targeted, object-targeted, infrastructure attacks, and regulatory violations. Our analysis reveals an undesirable trade-off between utility and safety among mainstream LLMs, with code-generation excelling models often performing poorly in crucial safety aspects. Advanced prompt engineering techniques like In-Context Learning and Chain-of-Thought can improve safety but still struggle to identify unintentional attacks. Larger models demonstrate better safety capabilities, particularly in refusing dangerous commands. Our findings and benchmark facilitate the design and evaluation of physical safety for LLMs.
Low	GrooveSquid.com (original content)	Low Difficulty Summary This study looks at how Large Language Models (LLMs) might be used to control robots like drones. Researchers created a test to see if these models are safe or not, by looking at different types of threats like harming people or breaking things. They found that some LLMs are good at doing certain tasks but bad at keeping people and things safe. The researchers also tried new ways to make the models safer, but they still had problems. They hope their findings will help create safer AI.

Keywords

» Artificial intelligence » Prompt

Defining and Evaluating Physical Safety for Large Language Models

by Yung-Chen Tang, Pin-Yu Chen, Tsung-Yi Ho

Categories

GrooveSquid.com Paper Summaries

Keywords

Summary of Combining Induction and Transduction For Abstract Reasoning, by Wen-ding Li et al.

Summary of Madod: Generalizing Ood Detection to Unseen Domains Via G-invariance Meta-learning, by Haoliang Wang et al.

Related Posts