Summary of Evaluating the Accuracy of Chatbots in Financial Literature, by Orhan Erdem et al.


Evaluating the Accuracy of Chatbots in Financial Literature

by Orhan Erdem, Kristi Hassett, Feyzullah Egriboyun

First submitted to arXiv on: 11 Nov 2024

Categories

  • Main: Artificial Intelligence (cs.AI)
  • Secondary: Econometrics (econ.EM)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty Written by Summary
High Paper authors High Difficulty Summary
Read the original abstract here
Medium GrooveSquid.com (original content) Medium Difficulty Summary
This research evaluates the reliability of three chatbots, ChatGPT (4o and o1-preview versions) and Gemini Advanced, in providing references on financial literature. The study employs novel methodologies to assess how hallucination rates vary with topic recency. The results show that ChatGPT-4o had a hallucination rate of 20.0%, while o1-preview had 21.3%. In contrast, Gemini Advanced exhibited a much higher hallucination rate of 76.7%. The findings highlight the importance of verifying chatbot-provided references, especially in rapidly evolving fields.
Low GrooveSquid.com (original content) Low Difficulty Summary
This study looks at how well three chatbots (two versions of ChatGPT, plus Gemini Advanced) do when providing references on financial topics. Researchers used special methods to see if these chatbots make more mistakes when they talk about newer topics. They found that one version of ChatGPT got 20% of its references wrong, another version got 21%, and Gemini got a much higher 77%! The study shows how important it is to double-check what these chatbots say, especially when talking about new and changing information.
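
For readers curious how a percentage like the ones above can be computed, here is a minimal sketch. It assumes a hallucination rate is simply the share of chatbot-provided references that could not be verified; the paper's exact procedure may differ. The function name and the list of check results are hypothetical and are not data from the study.

```python
def hallucination_rate(verified_flags):
    """Return the percentage of references that could not be verified.

    verified_flags: list of booleans, True if the reference was found to exist.
    """
    if not verified_flags:
        raise ValueError("need at least one reference")
    # Count the references that failed verification.
    fabricated = sum(1 for ok in verified_flags if not ok)
    return 100.0 * fabricated / len(verified_flags)


# Hypothetical check results for ten references (not taken from the paper).
checks = [True, True, False, True, True, True, True, False, True, True]
print(f"{hallucination_rate(checks):.1f}%")  # prints "20.0%"
```

In this made-up example, two of the ten references fail verification, which gives a rate of 20.0%.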

Keywords

» Artificial intelligence  » Gemini  » Hallucination