Summary of Seventeenth-century Spanish American Notary Records For Fine-tuning Spanish Large Language Models, by Shraboni Sarker et al.


Seventeenth-Century Spanish American Notary Records for Fine-Tuning Spanish Large Language Models

by Shraboni Sarker, Ahmad Tamim Hamad, Hulayyil Alshammari, Viviana Grieco, Praveen Rao

First submitted to arXiv on: 9 Jun 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI)


GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com, original content)
This paper presents a resource for fine-tuning Spanish large language models (LLMs) on tasks such as classification, masked language modeling, and clustering. The authors assembled a collection of handwritten seventeenth-century notary records from the National Archives of Argentina, comprising original images, transcribed text, and metadata. In the authors' empirical evaluation, Spanish LLMs fine-tuned on this collection outperform pre-trained Spanish models and ChatGPT-3.5/ChatGPT-4o on classification and masked language modeling.

Low Difficulty Summary (written by GrooveSquid.com, original content)
This paper is about a new resource for teaching computers to understand old Spanish documents from the 17th century. The authors took handwritten records from an archive in Argentina and used them to make language models better at tasks like classification and guessing missing words. This could help people who study history or analyze old texts.
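
For readers curious what "guessing missing words" (masked language modeling) looks like as a training task, here is a minimal sketch in plain Python. It is not the authors' pipeline: the function name `mask_tokens`, the `[MASK]` placeholder, the 15% masking rate, and the whitespace tokenization are all assumptions chosen for illustration. The idea is only that some words are hidden and the model is asked to recover them.

```python
import random

# Placeholder string standing in for a hidden word (an assumption; real
# tokenizers define their own mask token).
MASK_TOKEN = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and remember what was hidden.

    Returns a pair (masked, labels):
      - masked: a copy of `tokens` with some entries replaced by MASK_TOKEN
      - labels: the original token at each masked position, None elsewhere
    A model trained on this task would try to predict the non-None labels.
    """
    rng = random.Random(seed)  # local generator so results are repeatable
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK_TOKEN)
            labels.append(tok)
        else:
            masked.append(tok)
            labels.append(None)
    return masked, labels
```

Calling `mask_tokens("ante mi el escribano".split(), mask_prob=0.5)` on an invented phrase (not taken from the collection) returns the phrase with some words replaced by `[MASK]`, plus the list of hidden words the model would have to guess.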

Keywords

» Artificial intelligence  » Classification  » Clustering  » Fine-tuning