Do LLMs Dream of Ontologies?

by Marco Bombieri, Paolo Fiorini, Simone Paolo Ponzetto, Marco Rospocher

First submitted to arXiv on: 26 Jan 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each of the summaries below covers the same AI paper at a different level of difficulty. The medium- and low-difficulty versions are original summaries written by GrooveSquid.com, while the high-difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract on arXiv.

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper investigates how large language models (LLMs) retain structured knowledge from publicly available ontologies. Specifically, it explores whether general-purpose pre-trained LLMs like Pythia-12B, Gemini-1.5-Flash, GPT-3.5, and GPT-4 can accurately reproduce concept identifier-label associations from resources such as the Gene Ontology, Uberon, Wikidata, and ICD-10. The study finds that only a small fraction of ontological concepts is accurately memorized, with GPT-4 performing the best. To understand why some concepts are more easily memorized than others, the authors analyze the relationship between concept popularity on the Web and memorization accuracy. They find a strong correlation between online frequency and ID retrieval likelihood, suggesting that LLMs primarily acquire this knowledge through indirect textual exposure rather than direct ontology access. The paper also introduces new metrics for assessing the invariance and robustness of model responses.
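
As an illustration of the probing setup described above, the following sketch (not the paper’s code) queries a model for the Gene Ontology identifier of a concept label and scores strict exact-match memorization accuracy. The `query_llm` callable, the prompt wording, and the handful of gold label–ID pairs are all assumptions made for the example.

```python
# Minimal sketch of the label-to-ID probing task (not the paper's code).
# `query_llm` stands in for whichever chat/completion API is probed
# (e.g. Pythia-12B, Gemini-1.5-Flash, GPT-3.5, GPT-4); the prompt template
# is an assumption, and the gold pairs are a few real Gene Ontology entries.
from typing import Callable

# (concept label, concept identifier) pairs taken from the Gene Ontology
GOLD_PAIRS = [
    ("mitochondrion", "GO:0005739"),
    ("DNA repair", "GO:0006281"),
    ("apoptotic process", "GO:0006915"),
]

def build_prompt(label: str) -> str:
    # One possible zero-shot wording; the paper's exact template may differ.
    return (f"What is the Gene Ontology identifier for the concept "
            f"'{label}'? Answer with the identifier only.")

def memorization_accuracy(query_llm: Callable[[str], str]) -> float:
    """Fraction of concepts whose identifier the model reproduces exactly."""
    hits = 0
    for label, gold_id in GOLD_PAIRS:
        answer = query_llm(build_prompt(label)).strip()
        hits += int(answer == gold_id)  # strict exact-match scoring
    return hits / len(GOLD_PAIRS)
```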

Low Difficulty Summary (written by GrooveSquid.com, original content)
This research looks at how well large language models remember information from special dictionaries called ontologies. Ontologies help us understand the meaning of words and concepts. The scientists tested four different language models to see if they could correctly link labels to concept IDs (like looking up a number in a phonebook). They found that only some parts of these dictionaries are remembered well, and one model did better than the others. To figure out why some things are easier to remember than others, the researchers looked at how often concepts appear on the internet. They found that if something is mentioned often online, it’s more likely to be correctly linked in the language model’s memory. This means that language models mostly learn from reading text, not directly from the special dictionaries.
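
To make the popularity analysis concrete, here is a small sketch (again, not the paper’s code) that computes a Spearman rank correlation between placeholder web-frequency estimates and a binary flag for whether each concept’s ID was retrieved correctly. The numbers are invented for illustration; the paper’s actual popularity estimation and statistics may differ.

```python
# Minimal sketch of the popularity-vs-memorization analysis (not the paper's code).
# The web-frequency counts are made-up placeholders; the paper estimates concept
# popularity from how often concepts occur on the Web.
from scipy.stats import spearmanr

web_frequency = [125_000, 3_400, 87_000, 410]  # placeholder popularity estimates
retrieved_ok = [1, 0, 1, 0]                    # 1 = concept ID reproduced correctly

rho, p_value = spearmanr(web_frequency, retrieved_ok)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```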

Keywords

  • Artificial intelligence
  • Gemini
  • GPT
  • Language model
  • Likelihood