Measuring Moral Inconsistencies in Large Language Models

by Vamshi Krishna Bonagiri, Sreeram Vennam, Manas Gaur, Ponnurangam Kumaraguru

First submitted to arXiv on: 26 Jan 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Machine Learning (cs.LG)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
Recent advancements in Large Language Models (LLMs) have showcased impressive capabilities in conversational systems, but these models are highly inconsistent in their generations, raising concerns about their reliability. This inconsistency has typically been measured with task-specific accuracy, an approach that is unsuitable for moral scenarios such as the trolley problem, which has no single “correct” answer. To address this issue, we propose Semantic Graph Entropy (SGE), a novel information-theoretic measure that quantifies an LLM’s consistency in moral scenarios. We leverage “Rules of Thumb” (RoTs) to explain models’ decision-making strategies and to enhance our metric. Compared to existing metrics, SGE correlates better with human judgments across five LLMs.
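
The page does not reproduce the paper’s exact SGE formula, but the core intuition behind an entropy-based consistency score can be sketched: ask the model several paraphrases of the same moral question, group semantically equivalent answers, and compute the entropy of the resulting answer distribution (low entropy means the model keeps giving essentially the same answer). The Python sketch below illustrates only that intuition; the greedy clustering, the 0.8 similarity threshold, and the toy bag-of-words embedding are illustrative assumptions, not the authors’ SGE method.

import math
from collections import Counter
from typing import Callable, List, Sequence

def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def semantic_entropy(
    responses: List[str],
    embed: Callable[[str], Sequence[float]],
    threshold: float = 0.8,   # illustrative cutoff, not from the paper
) -> float:
    """Greedy-cluster responses by embedding similarity, then return the
    normalized Shannon entropy of the cluster sizes: 0.0 means every answer
    is semantically the same, 1.0 means every answer is different.
    This is a sketch of an entropy-over-semantic-groups idea, not SGE itself."""
    vectors = [embed(r) for r in responses]
    clusters: List[List[int]] = []        # indices of responses per cluster
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    counts = [len(c) for c in clusters]
    total = sum(counts)
    probs = [c / total for c in counts]
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    max_entropy = math.log2(total) if total > 1 else 1.0
    return entropy / max_entropy

# Toy embedding: bag-of-words counts over a tiny fixed vocabulary,
# standing in for a real sentence-embedding model.
VOCAB = ["pull", "lever", "do", "nothing", "save", "five", "one"]
def toy_embed(text: str) -> List[float]:
    words = Counter(text.lower().split())
    return [float(words[w]) for w in VOCAB]

if __name__ == "__main__":
    answers = [
        "Pull the lever to save five",
        "pull the lever save five",
        "Do nothing",
    ]
    print(semantic_entropy(answers, toy_embed))  # lower = more consistent

Running the demo groups the first two answers together and prints a normalized entropy of about 0.58; a model that answered every paraphrase identically would score 0.0.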

Low Difficulty Summary (written by GrooveSquid.com, original content)
Large Language Models are very good at talking like humans. But did you know that even the best ones don’t always give the same answer if you ask the same question in a slightly different way? This is called inconsistency, and it is a problem because it makes the models harder to trust. To study this, researchers created a new way to measure how consistent these models are, called Semantic Graph Entropy. It also helps us understand why the models make certain decisions. The good news is that this new way of measuring consistency matches human judgments better than earlier methods.

Keywords

* Artificial intelligence