Please use this identifier to cite or link to this item: http://cmuir.cmu.ac.th/jspui/handle/6653943832/63997
Full metadata record
DC Field | Value | Language
dc.contributor.author | Thawatchai Suwanapong | en_US
dc.contributor.author | Thanaruk Theeramunkong | en_US
dc.contributor.author | Ekawit Nantajeewarawat | en_US
dc.date.accessioned | 2019-05-07T09:59:42Z | -
dc.date.available | 2019-05-07T09:59:42Z | -
dc.date.issued | 2017 | en_US
dc.identifier.issn | 0125-2526 | en_US
dc.identifier.uri | http://it.science.cmu.ac.th/ejournal/dl.php?journal_id=8506 | en_US
dc.identifier.uri | http://cmuir.cmu.ac.th/jspui/handle/6653943832/63997 | -
dc.description.abstract | Named entity disambiguation is one of the most challenging tasks in natural language processing. In many Thai news categories, referential ambiguity is often found, i.e., in addition to its formal names, an entity is often referred to by other names, called name aliases. Name co-occurrence information is very useful for name-alias relationship identification, and it is usually represented by a co-occurrence matrix in the vector space model. Traditionally, a co-occurrence matrix is constructed by multiplying a weighted name-by-document matrix, possibly normalized, and its transpose. This paper proposes an alternative co-occurrence matrix construction method using association measures. The effects of association measures are investigated by comparing their use with the traditional co-occurrence matrix construction method. Various complementary factors are considered in the comparison, e.g., weighting schemes, a normalization process, and linkage functions for hierarchical clustering. Two collections of Thai news articles, 1,000 articles in the domain of football and 1,000 articles in the domain of politics, are used in experiments. The experimental results show that co-occurrence matrix construction using association measures yields the highest performance in both news domains. | en_US
dc.language | Eng | en_US
dc.publisher | Science Faculty of Chiang Mai University | en_US
dc.title | Name-alias Relationship Identification in Thai News Articles: A Comparison of Co-occurrence Matrix Construction Methods | en_US
dc.type | Journal Article | en_US
article.title.sourcetitle | Chiang Mai Journal of Science | en_US
article.volume | 44 | en_US
article.stream.affiliations | School of Information, Computer and Communication Technology, Sirindhorn International Institute of Technology, Thammasat University, Thailand. | en_US
Appears in Collections: CMUL: Journal Articles
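
The two construction strategies described in the abstract can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the association measures or weighting schemes used, so the Dice coefficient, the binary weighting, the toy matrix `A`, and the function names `cooccurrence_traditional` and `cooccurrence_dice` are all assumptions made for this example.

```python
import numpy as np

def cooccurrence_traditional(A, normalize=True):
    """Name-by-name matrix computed as A @ A.T (rows of A are names, columns are documents).

    With normalize=True each row is scaled to unit length first, so the
    entries become cosine similarities between name vectors.
    """
    A = np.asarray(A, dtype=float)
    if normalize:
        norms = np.linalg.norm(A, axis=1, keepdims=True)
        norms[norms == 0] = 1.0  # avoid dividing all-zero rows by zero
        A = A / norms
    return A @ A.T

def cooccurrence_dice(A):
    """Name-by-name matrix of Dice coefficients (an assumed association measure).

    dice(i, j) = 2 * |docs with both i and j| / (|docs with i| + |docs with j|)
    """
    B = (np.asarray(A) > 0).astype(float)  # presence/absence of each name per document
    joint = B @ B.T                        # documents containing both names
    df = B.sum(axis=1)                     # documents containing each name
    denom = df[:, None] + df[None, :]
    out = np.zeros_like(joint)
    np.divide(2.0 * joint, denom, out=out, where=denom > 0)
    return out

# Hypothetical toy data: 3 names x 4 documents, binary weights.
A = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 0, 0, 1]])

trad = cooccurrence_traditional(A)
dice = cooccurrence_dice(A)
print(np.round(trad, 3))
print(np.round(dice, 3))
```

In both outputs, entry (i, j) scores how strongly names i and j co-occur across documents; a later grouping step, such as the hierarchical clustering mentioned in the abstract, would then operate on the rows of either matrix.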
