A combined template-based and case-based metadata extraction for heterogeneous Thai documents

Krisda Khankasikam; Nopasit Chakpitak; Thana Udomsripaiboon

Please use this identifier to cite or link to this item: http://cmuir.cmu.ac.th/jspui/handle/6653943832/59515

Full metadata record

DC Field	Value	Language
dc.contributor.author	Krisda Khankasikam	en_US
dc.contributor.author	Nopasit Chakpitak	en_US
dc.contributor.author	Thana Udomsripaiboon	en_US
dc.date.accessioned	2018-09-10T03:16:30Z	-
dc.date.available	2018-09-10T03:16:30Z	-
dc.date.issued	2009-04-24	en_US
dc.identifier.other	2-s2.0-64949203967	en_US
dc.identifier.other	10.1109/ICACC.2009.88	en_US
dc.identifier.uri	https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=64949203967&origin=inward	en_US
dc.identifier.uri	http://cmuir.cmu.ac.th/jspui/handle/6653943832/59515	-
dc.description.abstract	A growing number of universities, laboratories, government agencies and companies are placing their documents online and making them searchable, because the Internet infrastructure for global data access is fully functional. However, a large number of organizations have documents that lack metadata. The lack of metadata hinders not only the discovery and dissemination of these documents over the Internet, but also their connectivity with other documents. Unfortunately, manual metadata extraction is expensive and time-consuming for large document collections, and most existing automated metadata extraction approaches have focused on specific domains and homogeneous documents. In this paper, we propose a combined case-based and template-based metadata extraction approach to solve these issues. The key idea for handling heterogeneity is to classify documents into groups such that each group contains only similar documents. Then, for each document group, a template derived from a previous case provides the process used to extract metadata from the documents in that group. © 2008 IEEE.	en_US
dc.subject	Computer Science	en_US
dc.subject	Engineering	en_US
dc.title	A combined template-based and case-based metadata extraction for heterogeneous Thai documents	en_US
dc.type	Conference Proceeding	en_US
article.title.sourcetitle	Proceedings - International Conference on Advanced Computer Control, ICACC 2009	en_US
article.stream.affiliations	Naresuan University	en_US
article.stream.affiliations	Chiang Mai University	en_US
Appears in Collections:	CMUL: Journal Articles

Files in This Item:

There are no files associated with this item.
