Summary of A Transformer-Based Multi-Stream Approach for Isolated Iranian Sign Language Recognition, by Ali Ghadami et al.
A Transformer-Based Multi-Stream Approach for Isolated Iranian Sign Language Recognition
by Ali Ghadami, Alireza Taheri, Ali Meghdari
First submitted to arXiv on: 27 Jun 2024
Categories
- Main: Computation and Language (cs.CL)
- Secondary: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary The paper presents a deep learning-based recognition system for Iranian Sign Language (ISL) words, aimed at bridging the communication gap between the deaf and hard of hearing community and the broader public. The proposed network combines early-fusion and late-fusion transformer encoder-based networks and is optimized using a genetic algorithm. The model is trained on a dataset of 101 ISL words frequently used in academic environments. Its input features are extracted from sign videos and consist of hand and lip key points together with distance and angle features. The paper also employs multi-task learning, using word embedding vectors to make training smoother. The resulting model achieves 90.2% accuracy on test data and is built into sign language training software that gives users real-time feedback. The study also investigates the effectiveness and efficiency of this type of sign language learning software and the impact of its feedback. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary This paper is about creating a system that can recognize Iranian Sign Language (ISL) words. The goal is to help people who use ISL as their main way of communicating to connect better with others. The researchers used special computer tools called transformers to develop this recognition system. They trained the system on 101 ISL words commonly used in universities, and it was able to recognize these words with high accuracy (90.2%). This technology can be used to create sign language learning software that provides feedback in real time, which is important for effective learning. |
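
The medium summary mentions key points plus distance and angle features extracted from sign videos. As a rough illustration only, the sketch below computes a distance and an angle from two 2D points per frame and appends them to the raw coordinates. It is not the authors' pipeline: the choice of one reference point per hand, the 2D coordinates, the function names `hand_distance_and_angle` and `frame_features`, and the sample values are all assumptions made for this example.

```python
import math


def hand_distance_and_angle(left, right):
    """Return (distance, angle in radians) between two 2D points.

    `left` and `right` are (x, y) tuples. Using a single reference point
    per hand (for example the wrist) is an assumption of this sketch.
    """
    dx = right[0] - left[0]
    dy = right[1] - left[1]
    distance = math.hypot(dx, dy)
    # atan2 gives the signed angle of the left-to-right vector relative
    # to the x-axis, in the range [-pi, pi].
    angle = math.atan2(dy, dx)
    return distance, angle


def frame_features(frames):
    """Build one flat feature vector per frame.

    Each vector holds the four raw coordinates followed by the
    distance and the angle between the two points.
    """
    features = []
    for left, right in frames:
        distance, angle = hand_distance_and_angle(left, right)
        features.append([left[0], left[1], right[0], right[1], distance, angle])
    return features


# Hypothetical two-frame clip: ((left_x, left_y), (right_x, right_y)) per frame.
clip = [((0.0, 0.0), (3.0, 4.0)), ((0.2, 0.5), (0.2, 0.9))]
for row in frame_features(clip):
    print(row)
```

In a setup like the one the summary describes, per-frame vectors of this kind would form the input sequence to the transformer encoders. How the paper actually normalizes and arranges its features is not stated in this summary.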
Keywords
» Artificial intelligence » Deep learning » Embedding » Encoder » Multi task » Transformer