Exploring Fine-tuned Generative Models for Keyphrase Selection: A Case Study for Russian

by Anna Glazkova, Dmitry Morozov

First submitted to arXiv on: 16 Sep 2024

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Machine Learning (cs.LG)

GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. The summaries below all cover the same AI paper, written at different levels of difficulty. The medium and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

High Difficulty Summary (written by the paper authors)
Read the original abstract here

Medium Difficulty Summary (written by GrooveSquid.com; original content)
This paper explores the application of fine-tuned generative transformer-based models to keyphrase selection in Russian scientific texts. The authors experiment with four models (ruT5, ruGPT, mT5, and mBART), evaluating their performance on Russian scientific abstracts from four domains: mathematics & computer science, history, medicine, and linguistics. The results show that generative models, mBART in particular, yield significant in-domain gains, outperforming keyphrase extraction baselines for the Russian language. Cross-domain performance is lower, but it still demonstrates potential for further exploration and refinement. (A minimal code sketch of this kind of fine-tuning setup follows the summaries below.)

Low Difficulty Summary (written by GrooveSquid.com; original content)
This paper helps computers better understand Russian scientific texts. The scientists use special models, called generative transformer-based models, to find important phrases in these texts. They tested four different models on four types of text: math and computer science, history, medicine, and linguistics. The best model, mBART, worked really well at finding keyphrases in the same type of text it was trained on. While it didn’t work as well on different types of text, it still showed promise for future research.
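
To make the approach more concrete, here is a minimal sketch of how seq2seq fine-tuning for keyphrase generation could look in Python with the HuggingFace transformers library. The checkpoint name (facebook/mbart-large-50), the "; " keyphrase separator, the sample texts, and all hyperparameters are illustrative assumptions, not the paper's actual configuration.

    # Hedged sketch: fine-tune an mBART checkpoint to generate keyphrases
    # for a Russian abstract. Checkpoint, separator, and hyperparameters
    # are assumptions for illustration, not the paper's setup.
    import torch
    from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

    MODEL_NAME = "facebook/mbart-large-50"  # assumed public checkpoint
    tokenizer = MBart50TokenizerFast.from_pretrained(
        MODEL_NAME, src_lang="ru_RU", tgt_lang="ru_RU"
    )
    model = MBartForConditionalGeneration.from_pretrained(MODEL_NAME)

    # One training pair: abstract text in, "; "-joined keyphrases out.
    abstract = "В статье исследуются генеративные модели для выделения ключевых слов."
    keyphrases = "ключевые слова; генеративные модели; научные тексты"

    inputs = tokenizer(abstract, truncation=True, max_length=512, return_tensors="pt")
    labels = tokenizer(
        text_target=keyphrases, truncation=True, max_length=64, return_tensors="pt"
    ).input_ids

    # A single optimization step; real fine-tuning loops over a labeled corpus.
    optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
    model.train()
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()

    # Inference: beam-search a keyphrase string for an abstract.
    model.eval()
    with torch.no_grad():
        generated = model.generate(
            **inputs,
            max_length=64,
            num_beams=4,
            forced_bos_token_id=tokenizer.lang_code_to_id["ru_RU"],
        )
    print(tokenizer.decode(generated[0], skip_special_tokens=True))

Framing keyphrase selection as text generation lets a model produce phrases that do not appear verbatim in the abstract, which is one reason generative approaches can outperform purely extractive baselines.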

Keywords

  • Artificial intelligence
  • Transformer