Summary of Instruct-Tuning Pretrained Causal Language Models for Ancient Greek Papyrology and Epigraphy, by Eric Cullhed
First submitted to arXiv on: 20 Sep 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper author | The paper’s original abstract, available from its arXiv listing. |
Medium | GrooveSquid.com (original content) | This paper presents an experiment in fine-tuning a pretrained causal language model (Meta’s Llama 3.1 8B Instruct) to restore missing or illegible characters in ancient Greek inscriptions and documentary papyri. The author uses a straightforward instruction-based approach and reports strong results in character error rate (CER), top-1 accuracy, and top-20 accuracy for sequences of up to 10 characters. The model is also fine-tuned for geographic attribution and chronological dating, with notable accuracy on both tasks. Compared with state-of-the-art models such as Ithaca, the instruction-tuned models excel in text restoration and offer a practical advantage: they ignore spaces during reconstruction, which matches the scriptio continua (writing without word breaks) of ancient textual artifacts. |
Low | GrooveSquid.com (original content) | This paper shows how artificial intelligence can help fill gaps in old Greek texts. The author fine-tuned an existing AI language model to suggest missing or unreadable characters in ancient documents. The results are good: the suggested text is often close to correct, especially for short sequences of characters. The author also used this AI to estimate where the texts came from and when they were written. It is not perfect, but the results suggest that this approach could be useful for restoring ancient texts. |
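
The three restoration metrics named in the medium summary (CER, top-1 accuracy, and top-20 accuracy) can be illustrated with a minimal Python sketch. This is not the paper's evaluation code: the function names (`cer`, `top_k_accuracy`, and the helpers) are hypothetical, the sample strings are made up, and the choice to strip spaces before comparing is an assumption that mirrors the space-ignoring behaviour described above.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def strip_spaces(text: str) -> str:
    """Remove all whitespace, as in scriptio continua (assumed normalisation)."""
    return "".join(text.split())


def cer(prediction: str, reference: str, ignore_spaces: bool = True) -> float:
    """Character error rate: edit distance divided by reference length."""
    if ignore_spaces:
        prediction, reference = strip_spaces(prediction), strip_spaces(reference)
    if not reference:
        raise ValueError("reference must contain at least one character")
    return edit_distance(prediction, reference) / len(reference)


def top_k_accuracy(ranked: Sequence[List[str]], references: Sequence[str], k: int) -> float:
    """Share of gaps whose true text appears among the first k ranked candidates."""
    if len(ranked) != len(references) or not references:
        raise ValueError("need one non-empty candidate list per reference")
    hits = sum(
        strip_spaces(ref) in [strip_spaces(c) for c in cands[:k]]
        for cands, ref in zip(ranked, references)
    )
    return hits / len(references)


# Hypothetical example: two gaps, each with a ranked list of model suggestions.
ranked_suggestions = [["ΘΕΟΙΣ", "ΘΕΟΥΣ"], ["ΔΗΜΟΥ", "ΔΗΜΟΣ"]]
true_restorations = ["ΘΕΟΙΣ", "ΔΗΜΟΣ"]
top1 = top_k_accuracy(ranked_suggestions, true_restorations, k=1)
top2 = top_k_accuracy(ranked_suggestions, true_restorations, k=2)
```

In this toy data the first gap is solved by the top suggestion and the second only by the runner-up, so accuracy rises as `k` grows. That is the reason a top-20 figure is reported alongside top-1: a ranked shortlist can still help an editor when the first guess is wrong.
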
Keywords
» Artificial intelligence » Causal language model » CER » Fine-tuning » Llama