|Title:||Digital library for Thai astronomical history study on French document resource|
|Abstract:||© 2019 Association for Computing Machinery. Thai history in the era of the Ayutthaya Kingdom was mostly documented by French missionaries during the 17th and 18th centuries. A large body of resources in the form of manuscripts, books, and microfilms is preserved and made available by several institutions, such as the Bibliothèque nationale de France. Advances in digital technology now make these resources publicly accessible, and many have been digitized as scanned images. This work aims to establish a dedicated digital library for the study of Thai astronomical history. We developed a document management system that covers data acquisition and collection management. To give access to the knowledge in the texts, the scanned images were converted into machine-readable form by optical character recognition (OCR). A search engine was implemented so that historians can find relevant passages by keyword. Because Thai historians may not read French, we integrated automatic French-to-English machine translation, so the system provides historians with e-books of the original French historical documents in English. To extract knowledge from the texts automatically, we apply natural language processing to identify named entities, such as the names of persons, places, and events. This lets historians explore meaningful concepts through indices of the texts. The indices are also automatically linked to Wikipedia as an existing knowledge pool. The project still has limitations in OCR, machine translation, and named-entity recognition, which remain challenging problems in computer science research.|
|Appears in Collections:||CMUL: Journal Articles|
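The keyword search step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the OCR output is already available as plain text per page, and the function names (`normalize`, `tokenize`, `build_index`, `search`) and the sample pages are hypothetical. It builds a simple inverted index and folds accents so that a query typed without French diacritics still matches.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(token):
    """Lowercase a token and strip accents (e.g. 'éclipse' -> 'eclipse')."""
    decomposed = unicodedata.normalize("NFKD", token.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Split text into normalized word tokens."""
    return [normalize(t) for t in re.findall(r"\w+", text)]


def build_index(pages):
    """Map each normalized token to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for token in tokenize(text):
            index[token].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every keyword in the query."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(hits)


# Hypothetical OCR output, invented for illustration only.
pages = {
    "doc1-p1": "Observation de l'éclipse de lune à Louvo, royaume de Siam.",
    "doc1-p2": "Les missionnaires arrivent à Siam en 1685.",
    "doc2-p1": "Une éclipse de soleil observée par les astronomes.",
}
index = build_index(pages)
print(search(index, "eclipse Siam"))
```

A production system would more likely rely on an existing full-text search engine; the sketch only shows the idea of mapping keywords to the pages that contain them.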
Items in CMUIR are protected by copyright, with all rights reserved, unless otherwise indicated.