Please use this identifier to cite or link to this item: http://cmuir.cmu.ac.th/jspui/handle/6653943832/79213
Title: การพัฒนาออนโทโลยีโควิด-19 จากแหล่งข้อมูลที่หลากหลายโดยใช้การวิเคราะห์ข้อความ
Other Titles: Development of COVID-19 ontology from multiple data sources using text analytics
Authors: ปฏิพน เวียงนาค
Authors: อารีรัตน์ ตรงรัศมีทอง
Issue Date: Sep-2023
Publisher: Chiang Mai : Graduate School, Chiang Mai University
Abstract: Collecting, analyzing, and classifying information from multiple data sources takes a long time. Furthermore, if the data sources have different data structures, the merged data structure must be able to support that heterogeneity, and semantics must also be considered. This paper proposes automated knowledge integration from heterogeneous data sources, using ontology engineering combined with text analytics. Text stemming is used to preprocess the data. Part-of-speech (POS) tagging, Universal Dependencies (UD), and a text similarity measure, cosine similarity, are used to analyze and integrate the data. The knowledge scope covers five perspectives of COVID-19 information: COVID-19, Coronavirus, disease, pandemic, and vaccine. For evaluation, six ontologies were constructed using cosine similarity values ranging from 0.5 to 1.0. The data used to construct each ontology contain COVID-19-related and unrelated data in a ratio of 70 to 30. The six constructed ontologies were evaluated for consistency with the original data. With a cosine similarity of 0.6, precision, recall, and F1-score are 0.80, 0.70, and 0.75, respectively, and the resulting ontology is optimal for this case study, containing the highest amount of relevant COVID-19 information.
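The similarity filter and the F1-score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how texts are turned into vectors, so the whitespace tokenization and raw term counts below are assumptions, and the names `cosine_similarity`, `f1_score`, and `THRESHOLD` are hypothetical, not taken from the study.

```python
import math
from collections import Counter


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between term-count vectors of two texts.

    Assumption: lowercase whitespace tokens and raw counts; the study
    may use a different vectorization.
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical cutoff mirroring the 0.6 value reported in the abstract.
THRESHOLD = 0.6

if __name__ == "__main__":
    score = cosine_similarity("covid vaccine trial", "covid vaccine")
    print(score >= THRESHOLD)
    print(round(f1_score(0.80, 0.70), 2))
```

The reported precision and recall are consistent with the reported F1-score: 2 × 0.80 × 0.70 / (0.80 + 0.70) ≈ 0.75.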
URI: http://cmuir.cmu.ac.th/jspui/handle/6653943832/79213
Appears in Collections: SCIENCE: Independent Study (IS)

Files in This Item:
File: 630532005-ปฏิพน เวียงนาค.pdf
Size: 12.67 MB
Format: Adobe PDF


Items in CMUIR are protected by copyright, with all rights reserved, unless otherwise indicated.