Summary of Are We Describing the Same Sound? An Analysis Of Word Embedding Spaces Of Expressive Piano Performance, by Silvan David Peter et al.
Are we describing the same sound? An analysis of word embedding spaces of expressive piano performance
by Silvan David Peter, Shreyan Chowdhury, Carlos Eduardo Cancino-Chacón, Gerhard Widmer
First submitted to arXiv on: 31 Dec 2023
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary This research paper investigates whether semantic embeddings capture fine-grained nuances in a specific domain: descriptions of expressive piano performance. The study combines a music research dataset of free-text performance characterizations with a follow-up study in which the annotations were sorted into clusters, and uses the result as a ground truth for the domain-specific semantic similarity structure. Five embedding models are tested, and the similarity structure of each is compared against this ground truth. The paper also assesses how contextualizing prompts, hubness reduction, cross-modal similarity, and k-means clustering affect the results. More general models perform better than domain-adapted ones, and the best model configurations reach human-level agreement. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This study looks at how well computer models can tell when two descriptions of a piano performance mean similar things. It starts from a research dataset of free-text descriptions of piano performances. In a follow-up study, these descriptions were sorted into groups, which gives a standard for what makes two descriptions “similar.” The researchers tested five different ways that computers represent words as numbers (called semantic embeddings) and compared them to this standard. They also looked at whether adding more context to the text, reducing a bias called hubness, measuring similarity across modalities, and automatically grouping similar descriptions (k-means clustering) help or hurt the models’ performance. The results show that general-purpose models work better than models adapted to the domain, and that the best setups agree with the standard about as well as people do. |
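The evaluation idea described in the summaries above (embed the descriptions, cluster them, and check the result against human-sorted groups) can be sketched in a few lines. This is only an illustrative toy, not the paper's pipeline: the six terms, their 2-D vectors, and the ground-truth labels are invented for the example, and the Rand index is an assumed stand-in for whatever agreement measure the authors actually use.

```python
import math
from itertools import combinations

# Hypothetical toy "embeddings" for six performance descriptors.
# Real embedding models produce vectors with hundreds of dimensions.
EMBEDDINGS = {
    "gentle": [0.9, 0.1],
    "soft": [0.8, 0.2],
    "tender": [0.85, 0.15],
    "stormy": [0.1, 0.9],
    "aggressive": [0.2, 0.8],
    "forceful": [0.15, 0.95],
}
# Hypothetical human-sorted clusters (the "ground truth"), in the same order.
TRUTH = [0, 0, 0, 1, 1, 1]


def normalize(v):
    """Scale a vector to unit length so Euclidean distance tracks cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans(points, k, iters=10):
    """Plain k-means with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from all centroids chosen so far.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def rand_index(a, b):
    """Fraction of item pairs that both labelings treat the same way."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


points = [normalize(v) for v in EMBEDDINGS.values()]
predicted = kmeans(points, k=2)
score = rand_index(TRUTH, predicted)
print(f"Rand index against the human clusters: {score:.2f}")
```

A score near 1 means the embedding space groups the descriptions much as the human sorters did; comparing that score across several embedding models is the kind of test the summaries describe.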
Keywords
» Artificial intelligence » Clustering » Embedding » K-means