Please use this identifier to cite or link to this item:
http://cmuir.cmu.ac.th/jspui/handle/6653943832/75881
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Papangkorn Inkeaw | en_US |
dc.contributor.author | Piyachat Udomwong | en_US |
dc.contributor.author | Jeerayut Chaijaruwanich | en_US |
dc.date.accessioned | 2022-10-16T07:03:24Z | - |
dc.date.available | 2022-10-16T07:03:24Z | - |
dc.date.issued | 2021-05-23 | en_US |
dc.identifier.issn | 0950-7051 | en_US |
dc.identifier.other | 2-s2.0-85102878142 | en_US |
dc.identifier.other | 10.1016/j.knosys.2021.106953 | en_US |
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85102878142&origin=inward | en_US |
dc.identifier.uri | http://cmuir.cmu.ac.th/jspui/handle/6653943832/75881 | - |
dc.description.abstract | A large number of labeled samples is required as training data to build an efficient recognition mechanism for an optical character recognition system. Although character samples can be easily collected from available manuscripts, they often lack class labels, especially for ancient and local alphabets. Creating a training dataset requires a great number of characters to be manually annotated by experts, which is a costly and time-consuming process. To considerably reduce the human effort required to construct training datasets, a novel semi-automatic labeling method is proposed in this work under the assumption that there are no initial labeled samples. The proposed method performs an iterative procedure on a nearest neighbor graph that views samples in multiple feature spaces. In each iteration, an expert is first called upon to label a relevant unlabeled sample that is automatically selected from the highest-density area of unlabeled samples. Then, the manually annotated label is propagated to the neighboring samples under safe conditions based on sample density and the multiple views. The procedure is repeated until all unlabeled samples are labeled. The labeling procedure of the proposed method is evaluated on the MNIST, Devanagari, Thai, and Lanna Dhamma datasets. The results show that the proposed method outperforms state-of-the-art labeling methods, achieving the highest labeling accuracy. In addition, it can handle outlier samples and deal with alphabets that include visually similar characters. Moreover, the recognition performance of a classifier trained on the semi-automatically generated training dataset is comparable to that of a classifier trained on the actual ground truth. | en_US |
dc.subject | Business, Management and Accounting | en_US |
dc.subject | Computer Science | en_US |
dc.subject | Decision Sciences | en_US |
dc.title | Density based semi-automatic labeling on multi-feature representations for ground truth generation: Application to handwritten character recognition | en_US |
dc.type | Journal | en_US |
article.title.sourcetitle | Knowledge-Based Systems | en_US |
article.volume | 220 | en_US |
article.stream.affiliations | Chiang Mai University | en_US |
Appears in Collections: | CMUL: Journal Articles |