
Summary of Elements of World Knowledge (EWOK): A Cognition-inspired Framework for Evaluating Basic World Knowledge in Language Models, by Anna A. Ivanova et al.


Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models

by Anna A. Ivanova, Aalok Sathe, Benjamin Lipkin, Unnathi Kumar, Setayesh Radkani, Thomas H. Clark, Carina Kauf, Jennifer Hu, R.T. Pramod, Gabriel Grand, Vivian Paulun, Maria Ryskina, Ekin Akyürek, Ethan Wilcox, Nafisa Rashid, Leshem Choshen, Roger Levy, Evelina Fedorenko, Joshua Tenenbaum, Jacob Andreas

First submitted to arXiv on 15 May 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
A new framework called Elements of World Knowledge (EWOK) evaluates the world-modeling abilities of language models. It tests a model’s ability to use knowledge of a concept to match a target text with a plausible or an implausible context. EWOK targets specific concepts from multiple knowledge domains, including social interactions and spatial relations, using minimal pairs with flexible fillers for objects, agents, and locations. The authors also introduce EWOK-CORE-1.0, a dataset of 4,374 items covering 11 world knowledge domains. They evaluate 20 large language models (LLMs) across various evaluation paradigms, alongside a human norming study comprising 12,480 measurements. The results show that even large LLMs perform worse than humans in certain domains, highlighting the need for targeted research on their world-modeling capabilities.

Low Difficulty Summary (written by GrooveSquid.com, original content)
World modeling is crucial for general-purpose AI agents, but testing such capabilities is hard because the building blocks of world models are unclear. A new framework called Elements of World Knowledge (EWOK) evaluates language models by asking them to match target texts with plausible or implausible contexts built around specific concepts from various knowledge domains. The authors created a dataset of 4,374 items covering 11 domains and tested 20 large language models. Even large models struggle in certain areas, making targeted research necessary.
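The minimal-pair paradigm described in the summaries above can be sketched in a few lines of Python. Everything below is illustrative: the example item, the toy scores, and the function names are assumptions rather than the paper's actual data or code; a real evaluation would score each (context, target) pair with a language model's log-probability.

```python
# Sketch of a minimal-pair evaluation: a model "passes" an item if it
# assigns a higher score to the target text under the plausible context
# than under the implausible one. The scorer here is a hand-made stand-in;
# a real run would plug in log P(target | context) from a language model.

def prefers_plausible(target, plausible_ctx, implausible_ctx, score):
    """True if the scorer ranks the target higher under the plausible context."""
    return score(plausible_ctx, target) > score(implausible_ctx, target)

# Hypothetical spatial-relations item (illustrative, not from the dataset).
target = "She lifted the cup off the table."
plausible = "The cup is on the table."
implausible = "The cup is inside the table."

# Toy scores standing in for LM log-probabilities (higher = more likely).
toy_scores = {
    (plausible, target): -12.3,
    (implausible, target): -15.8,
}
score = lambda ctx, tgt: toy_scores[(ctx, tgt)]

preferred = prefers_plausible(target, plausible, implausible, score)  # → True

# Dataset-level accuracy is the fraction of items where the model
# prefers the plausible context.
items = [(target, plausible, implausible)]
accuracy = sum(prefers_plausible(t, p, i, score) for t, p, i in items) / len(items)
```

Because the paradigm only compares two scores per item, any model that exposes sequence log-probabilities can be evaluated this way without task-specific training.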

Keywords

» Artificial intelligence