Summary of Defining and Evaluating Physical Safety For Large Language Models, by Yung-chen Tang et al.
Defining and Evaluating Physical Safety for Large Language Models
by Yung-Chen Tang, Pin-Yu Chen, Tsung-Yi Ho
First submitted to arxiv on: 4 Nov 2024
Categories
- Main: Machine Learning (cs.LG)
- Secondary: Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary Our study examines the physical safety risks of Large Language Models (LLMs) controlling robotic systems like drones. We develop a comprehensive benchmark to evaluate LLM physical safety, categorizing threats into human-targeted, object-targeted, infrastructure attacks, and regulatory violations. Our analysis reveals an undesirable trade-off between utility and safety among mainstream LLMs, with code-generation excelling models often performing poorly in crucial safety aspects. Advanced prompt engineering techniques like In-Context Learning and Chain-of-Thought can improve safety but still struggle to identify unintentional attacks. Larger models demonstrate better safety capabilities, particularly in refusing dangerous commands. Our findings and benchmark facilitate the design and evaluation of physical safety for LLMs. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This study looks at how Large Language Models (LLMs) might be used to control robots like drones. Researchers created a test to see if these models are safe or not, by looking at different types of threats like harming people or breaking things. They found that some LLMs are good at doing certain tasks but bad at keeping people and things safe. The researchers also tried new ways to make the models safer, but they still had problems. They hope their findings will help create safer AI. |
Keywords
» Artificial intelligence » Prompt