Summary of Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information, by Chihiro Taguchi et al.
Killkan: The Automatic Speech Recognition Dataset for Kichwa with Morphosyntactic Information
by Chihiro Taguchi, Jefferson Saransig, Dayana Velásquez, David Chiang
First submitted to arXiv on: 23 Apr 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | The paper's original abstract (see the arXiv listing). |
Medium | GrooveSquid.com (original content) | The paper presents Killkan, a new dataset for automatic speech recognition (ASR) in Kichwa, an indigenous language of Ecuador. The dataset contains approximately 4 hours of audio with transcriptions, Spanish translations, and morphosyntactic annotation in Universal Dependencies format. The paper also analyzes Kichwa's agglutinative morphology and its code-switching with Spanish. Experiments show that, despite its small size, the dataset enables development of a reliable ASR system for Kichwa. The authors state that the dataset, ASR model, and code will be made publicly available. |
Low | GrooveSquid.com (original content) | Killkan is a speech recognition dataset for the Kichwa language. Kichwa is an endangered language spoken in Ecuador. There was no speech recognition data for Kichwa before Killkan. The dataset helps computers understand spoken Kichwa. It has about 4 hours of audio with matching written text. It also marks how the words in each sentence are built and how they relate to each other. Researchers used this data to build a computer program that can recognize Kichwa speech. |
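
The Universal Dependencies annotation mentioned in the medium summary is commonly distributed as CoNLL-U text: one token per line, with ten tab-separated columns. The sketch below shows roughly how such a file could be read. It assumes Killkan follows the standard CoNLL-U layout, which this summary does not confirm. The sample sentence is a toy English example written for illustration and is not taken from the dataset, and the names `Token`, `parse_feats`, and `parse_conllu` are hypothetical helpers.

```python
from dataclasses import dataclass

# Toy CoNLL-U sentence in English, written for illustration only.
# It is NOT taken from the Killkan dataset.
SAMPLE = """\
# text = Dogs bark.
1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tbark\tbark\tVERB\t_\tMood=Ind|Tense=Pres\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


@dataclass
class Token:
    idx: int      # position of the token in the sentence (ID column)
    form: str     # surface form as written
    lemma: str    # dictionary form
    upos: str     # universal part-of-speech tag
    feats: dict   # morphological features, e.g. {"Number": "Plur"}
    head: int     # ID of the syntactic head (0 marks the root)
    deprel: str   # dependency relation to the head


def parse_feats(field):
    """Turn 'A=x|B=y' into {'A': 'x', 'B': 'y'}; '_' means no features."""
    if field == "_":
        return {}
    return dict(pair.split("=", 1) for pair in field.split("|"))


def parse_conllu(text):
    """Parse one sentence, skipping comments, multiword ranges, and empty nodes.

    Assumes every kept token has a numeric HEAD column.
    """
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skips IDs such as '1-2' (ranges) and '1.1' (empty nodes)
        tokens.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3],
                  parse_feats(cols[5]), int(cols[6]), cols[7])
        )
    return tokens


tokens = parse_conllu(SAMPLE)
for tok in tokens:
    print(tok.idx, tok.form, tok.upos, tok.feats, "head:", tok.head, tok.deprel)
```

For an agglutinative language like Kichwa, the features column would be expected to carry much of the information that suffixes encode, which is why the sketch parses it into a dictionary instead of keeping it as a raw string.
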
Keywords
- Artificial intelligence
- Translation