Training set size reduction in large dataset problems

Varin Chouvatut; Wattana Jindaluang; Ekkarat Boonchieng

Please use this identifier to cite or link to this item: http://cmuir.cmu.ac.th/jspui/handle/6653943832/55533

Title:	Training set size reduction in large dataset problems
Authors:	Varin Chouvatut; Wattana Jindaluang; Ekkarat Boonchieng
Keywords:	Computer Science;Decision Sciences
Issue Date:	8-Feb-2016
Abstract:	© 2015 IEEE. Classifiers are used in many fields of application. However, a problem that has become common recently is applying a classifier to large datasets. Reducing the size of the training set therefore becomes necessary, especially to shorten the classifier's processing time. To address this problem, this paper proposes a new method for reducing the size of the training set in a large dataset. The proposed method improves on a well-known graph-based algorithm named Optimum-Path Forest (OPF). The principal idea for reducing the training set's size is to use the Segmented Least Square Algorithm (SLSA) to estimate the tree's shape. In the experiments, the proposed method reduced the size of the training set by about 7 to 21 percent compared with the original OPF algorithm, while classification accuracy decreased only slightly, by about 0.2 to 0.5 percent. For some datasets, the method achieved the same accuracy as the original OPF algorithm.
URI:	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84964320834&origin=inward http://cmuir.cmu.ac.th/jspui/handle/6653943832/55533
Appears in Collections:	CMUL: Journal Articles
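
The abstract names segmented least squares as the tool for estimating tree shape but does not say how it is applied to OPF trees. As background only, here is a minimal sketch of the textbook dynamic-programming formulation of segmented least squares on 2-D points. It is not the authors' method; the function names (`fit_line_error`, `segmented_least_squares`) and the per-segment penalty `segment_cost` are assumptions made for illustration.

```python
from math import inf
from typing import List, Sequence, Tuple


def fit_line_error(xs: Sequence[float], ys: Sequence[float], i: int, j: int) -> float:
    """Sum of squared residuals of the least-squares line through points i..j (inclusive)."""
    px, py = xs[i:j + 1], ys[i:j + 1]
    n = len(px)
    sx, sy = sum(px), sum(py)
    sxx = sum(x * x for x in px)
    sxy = sum(x * y for x, y in zip(px, py))
    denom = n * sxx - sx * sx
    # Degenerate case (one point, or all x equal): fall back to a horizontal line.
    slope = (n * sxy - sx * sy) / denom if denom != 0 else 0.0
    intercept = (sy - slope * sx) / n
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(px, py))


def segmented_least_squares(
    points: Sequence[Tuple[float, float]], segment_cost: float
) -> Tuple[float, List[Tuple[int, int]]]:
    """Return (total cost, segments) minimising fit error plus segment_cost per segment.

    Segments are (first_index, last_index) pairs into the x-sorted point list.
    Hypothetical illustration of the classic O(n^3) recurrence:
        opt[j] = min over i <= j of error(i, j) + segment_cost + opt[i - 1]
    """
    pts = sorted(points)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    n = len(pts)
    opt = [0.0] * (n + 1)   # opt[j]: best cost for the first j points
    start = [0] * (n + 1)   # start[j]: 1-based start of the last segment ending at j
    for j in range(1, n + 1):
        best = inf
        for i in range(1, j + 1):
            cost = fit_line_error(xs, ys, i - 1, j - 1) + segment_cost + opt[i - 1]
            if cost < best:
                best, start[j] = cost, i
        opt[j] = best
    # Walk back through the recorded segment starts to recover the partition.
    segments: List[Tuple[int, int]] = []
    j = n
    while j > 0:
        i = start[j]
        segments.append((i - 1, j - 1))
        j = i - 1
    segments.reverse()
    return opt[n], segments
```

A larger `segment_cost` favours fewer, longer segments, while a smaller one lets the fit follow the data more closely. How such a penalty would be chosen for OPF trees is not stated in the abstract.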

