Summary of Measuring Social Norms Of Large Language Models, by Ye Yuan et al.

by Ye Yuan, Kexin Tang, Jianhao Shen, Ming Zhang, Chenguang Wang

First submitted to arXiv on: 3 Apr 2024

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty	Written by	Summary
High	Paper authors	High Difficulty Summary Read the original abstract here
Medium	GrooveSquid.com (original content)	Medium Difficulty Summary The paper introduces a dataset that tests whether large language models understand social norms; its questions cannot be answered without a fundamental grasp of the norms involved. The dataset features 402 skills and 12,383 questions covering a wide range of social norms, and it is designed in accordance with the K-12 curriculum so that model performance can be compared directly with human performance, particularly that of elementary students. Large language models such as GPT3.5-Turbo and LLaMA2-Chat perform significantly better on this benchmark than prior approaches, with results only slightly below those achieved by humans. To further improve these models’ understanding of social norms, the authors propose a multi-agent framework, which brings performance on par with humans.
Low	GrooveSquid.com (original content)	Low Difficulty Summary Large language models are being used in many real-world applications, and it’s important for them to understand social norms. A new dataset challenges these models to show they can do this by asking questions about different social norms like opinions, arguments, culture, and laws. The dataset is designed so that it’s easy to compare the performance of large language models with that of humans, specifically elementary school students. Surprisingly, some recent large language models are able to answer these questions almost as well as humans do! To make these models even better, a new way of using multiple agents is proposed.

Keywords

* Artificial intelligence
