
Summary of Are We Describing the Same Sound? An Analysis Of Word Embedding Spaces Of Expressive Piano Performance, by Silvan David Peter et al.


Are we describing the same sound? An analysis of word embedding spaces of expressive piano performance

by Silvan David Peter, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Gerhard Widmer

First submitted to arXiv on: 31 Dec 2023

Categories

  • Main: Computation and Language (cs.CL)
  • Secondary: Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)



GrooveSquid.com Paper Summaries

GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!

Summary difficulty | Written by | Summary
High | Paper authors | High Difficulty Summary: Read the original abstract here
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: This paper investigates how well semantic embeddings capture fine-grained nuances in a specific domain: expressive piano performance. The study uses a music research dataset of free-text performance characterizations, together with a follow-up study in which the annotations were sorted into clusters, to derive a ground truth for the domain-specific semantic similarity structure. Five embedding models are evaluated by comparing their similarity structures against this ground truth. The paper also examines the effects of contextualizing prompts, hubness reduction, cross-modal similarity, and k-means clustering on model performance. The results show that more general models perform better than domain-adapted ones, and that the best model configurations reach human-level agreement.
Low | GrooveSquid.com (original content) | Low Difficulty Summary: This study looks at how well computer algorithms can tell when two written descriptions of piano performances mean similar things. It uses a collection of free-text descriptions of piano performances, which were sorted into groups in a follow-up study, to create a standard for what makes two descriptions “similar.” The researchers tested five different models that represent text as numbers (called semantic embeddings) and compared them to this standard. They also looked at how adding context to the input, reducing a bias known as hubness, measuring similarity across modalities, and grouping similar descriptions with k-means clustering help or hurt the models’ performance. The results show that general models work better than ones adapted to the domain, and that the best setups agree with the standard about as well as humans do.
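
To make the idea of testing embeddings against a cluster-based ground truth more concrete, here is a minimal sketch in Python. It is not the authors' evaluation code, and the summaries above do not specify which agreement measure the paper uses. The sketch assumes a simple stand-in: the Pearson correlation between pairwise cosine similarity and a 0/1 indicator of whether two descriptions fall in the same human-sorted cluster. The function names (`cosine`, `pearson`, `agreement`) and the toy vectors are hypothetical and exist only for illustration.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def agreement(embeddings, cluster_ids):
    """Correlate embedding similarity with same-cluster membership.

    Assumption for this sketch: two descriptions count as "similar" in the
    ground truth exactly when annotators sorted them into the same cluster.
    """
    sims, same_cluster = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        sims.append(cosine(embeddings[i], embeddings[j]))
        same_cluster.append(1.0 if cluster_ids[i] == cluster_ids[j] else 0.0)
    return pearson(sims, same_cluster)


# Toy data (made up): four 2-d "embeddings" and their cluster labels.
toy_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
toy_clusters = [0, 0, 1, 1]

score = agreement(toy_embeddings, toy_clusters)
mismatched_score = agreement(toy_embeddings, [0, 1, 0, 1])
print(f"agreement with matching clusters:   {score:.3f}")
print(f"agreement with mismatched clusters: {mismatched_score:.3f}")
```

A score near 1 means that descriptions grouped together by people also sit close together in the embedding space, while a score near or below 0 means the embedding geometry does not reflect the human grouping. Real embedding models would supply the vectors in place of the toy lists.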

Keywords

» Artificial intelligence  » Clustering  » Embedding  » K means