Please use this identifier to cite or link to this item:
http://cmuir.cmu.ac.th/jspui/handle/6653943832/80098
Title: | Motion recognition for Chinese isolated word sign language based on deep learning method |
Other Titles: | การรู้จำการเคลื่อนไหวสำหรับภาษามือแบบคำเอกเทศภาษาจีนบนพื้นฐานของวิธีการเรียนรู้เชิงลึก |
Authors: | Huang, Jiayu |
Authors: | Varin Chouvatut; Huang, Jiayu |
Issue Date: | 4-Sep-2024 |
Publisher: | Chiang Mai : Graduate School, Chiang Mai University |
Abstract: | As society continues to develop, the number of deaf and hard-of-hearing individuals has been increasing. Sign language is a primary mode of communication for these individuals and plays a vital role in their daily interactions, which makes sign language recognition increasingly important. Artificial intelligence and deep learning have introduced new opportunities and challenges in this field. Building on existing research, this thesis summarizes and analyzes commonly used recognition algorithms and neural network models, focusing on isolated words, and identifies the issues and challenges associated with sign language video recognition. Because sign language videos involve vast amounts of data, processing them is resource-intensive. To address this, the thesis proposes a fusion of Residual Networks (ResNet) and Long Short-Term Memory (LSTM) networks designed with practical considerations in mind. The approach covers detailed preprocessing of sign language videos, model pre-training, feature extraction, and classification, and it demonstrates strong recognition accuracy on the Chinese Sign Language (CSL) and Argentine Sign Language (LSA64) datasets. To exploit the spatiotemporal characteristics of video data, the thesis further proposes a fusion of R(2+1)D and LSTM networks, discusses the advantages and disadvantages of R(2+1)D networks, and details the feature extraction process for sign language videos, in which LSTM networks play a key role in extracting long-sequence features. Experiments on the CSL and LSA64 datasets show recognition accuracies of up to 96.21% and 99.69%, respectively. |
URI: | http://cmuir.cmu.ac.th/jspui/handle/6653943832/80098 |
Appears in Collections: | SCIENCE: Theses |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
jiayu-huang 630531023.pdf | | 2.32 MB | Adobe PDF | View/Open Request a copy |
Items in CMUIR are protected by copyright, with all rights reserved, unless otherwise indicated.