Summary of Mero Nagarikta: Advanced Nepali Citizenship Data Extractor with Deep Learning-Powered Text Detection and OCR, by Sisir Dhakal et al.
Mero Nagarikta: Advanced Nepali Citizenship Data Extractor with Deep Learning-Powered Text Detection and OCR
by Sisir Dhakal, Sujan Sigdel, Sandesh Prasad Paudel, Sharad Kumar Ranabhat, Nabin Lamichhane
First submitted to arXiv on: 8 Oct 2024
Categories
- Main: Computer Vision and Pattern Recognition (cs.CV)
- Secondary: Artificial Intelligence (cs.AI)
GrooveSquid.com Paper Summaries
GrooveSquid.com’s goal is to make artificial intelligence research accessible by summarizing AI papers in simpler terms. Each summary below covers the same AI paper, written at different levels of difficulty. The medium difficulty and low difficulty versions are original summaries written by GrooveSquid.com, while the high difficulty version is the paper’s original abstract. Feel free to learn from the version that suits you best!
Summary difficulty | Written by | Summary |
---|---|---|
High | Paper authors | High Difficulty Summary: Read the original abstract here |
Medium | GrooveSquid.com (original content) | Medium Difficulty Summary: The paper proposes a robust system that converts Nepali citizenship cards into a structured digital format using YOLOv8 for text detection and an optimized PyTesseract for OCR. The system, implemented in a mobile application, extracts key textual information from both sides of the card, including names, citizenship numbers, and dates of birth. The YOLOv8 model achieved high text-detection accuracy (99.1% on the front and 96.1% on the back). The optimized PyTesseract outperformed standard OCR in both flexibility and accuracy, extracting text from images with clean or noisy backgrounds and varying contrast. Preprocessing steps such as grayscale conversion, noise removal, and edge detection further improved OCR accuracy, even for low-quality photos. This work contributes to multilingual OCR and document analysis research, and it emphasizes the effectiveness of combining object detection with OCR models fine-tuned for practical applications. |
Low | GrooveSquid.com (original content) | Low Difficulty Summary: This paper is about a system that can automatically read information from Nepali citizenship cards. The system uses computer vision techniques to find the text on the cards and then reads it accurately. This is important because Nepali citizenship cards have unique characteristics that make them hard to read with general-purpose text recognition software. The system is designed for use in a mobile app, making it easy to access the information anywhere. The paper shows that the system works well even when the card images are low-quality or noisy. |
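
The medium summary names three preprocessing steps (grayscale conversion, noise removal, and edge detection) without saying how they are implemented. The sketch below is one possible reading, not the authors' code: the choice of a 3x3 median filter for noise removal and a Sobel operator for edge detection is an assumption, and all function names are hypothetical. It uses plain Python lists so that it runs without any dependencies; a real pipeline would more likely rely on an image-processing library.

```python
from statistics import median


def to_grayscale(rgb):
    """Convert rows of (R, G, B) tuples to 0-255 luma values (BT.601 weights)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in rgb]


def median_denoise(gray):
    """3x3 median filter (assumed noise-removal step).

    Border pixels use only the neighbours that exist inside the image.
    """
    h, w = len(gray), len(gray[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [gray[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(int(median(window)))
        out.append(row)
    return out


def sobel_edges(gray):
    """Sobel gradient magnitude (|gx| + |gy|), clipped to 255.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out


def preprocess(rgb):
    """Run the three steps in order; return (denoised grayscale, edge map)."""
    clean = median_denoise(to_grayscale(rgb))
    return clean, sobel_edges(clean)


# Toy 5x6 "card": dark left half, bright right half, one bright noise pixel.
BLACK, WHITE = (0, 0, 0), (255, 255, 255)
image = [[BLACK] * 3 + [WHITE] * 3 for _ in range(5)]
image[2][1] = WHITE  # salt noise inside the dark region

clean, edges = preprocess(image)
print(clean[2])  # the noise pixel has been smoothed away
print(edges[2])  # strong response only along the dark/bright boundary
```

In this toy example the median filter removes the isolated bright pixel before edge detection runs, so the edge map responds only to the real boundary between the two halves. In a pipeline like the one the summary describes, the cleaned image would then be cropped to each detected text region and passed to the OCR step.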
Keywords
» Artificial intelligence » Object detection