Thai word segmentation for visualization of Thai Web sites
Conference paper   Open access

W. Thanadechteemapat and C.C. Fung
2011 International Conference on Machine Learning and Cybernetics, pp.1544-1549
IEEE
International Conference on Machine Learning and Cybernetics, ICMLC 2011 (Guilin, China, 10/07/2011–13/07/2011)
2011
Author’s Version Open Access

Abstract

Information overload is a problem in the Information Age, and information visualization is one approach to providing an overview of the content of a web site. A tag cloud is one way to represent information as an image of a group of words. However, tag cloud generation has limitations, one of which stems from the characteristics of the language. To extract tags or words for a tag cloud, word segmentation is required. This paper proposes a Thai word segmentation approach for the visualization of Thai Web sites. The proposed technique is based on longest matching together with a refined corpus. The segmentation results are comparable to the results from previous BEST contests in Thailand.
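
As an illustration of the general longest matching idea named in the abstract, the following is a minimal sketch in Python. It is not the authors' implementation: the function name `longest_match_segment`, the tiny word list, and the fallback of emitting an unknown character as its own token are all assumptions made for this example, and the refined corpus described in the paper is not reproduced here.

```python
def longest_match_segment(text, dictionary):
    """Greedy left-to-right longest matching against a word list.

    `dictionary` is a set of known words (a stand-in for a real corpus).
    Characters that start no known word are emitted as single-character
    tokens; this fallback is an assumption of the sketch.
    """
    max_len = max(map(len, dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then shrink one character at a time.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        if match is None:
            match = text[i]  # unknown character: emit it on its own
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical toy word list, for illustration only.
toy_words = {"ฉัน", "รัก", "ภาษา", "ไทย", "ภาษาไทย"}
print(longest_match_segment("ฉันรักภาษาไทย", toy_words))
```

Because the search always prefers the longest dictionary entry at the current position, "ภาษาไทย" is returned as one token rather than as "ภาษา" followed by "ไทย". The same greediness means the quality of the output depends heavily on the word list, which is consistent with the abstract's emphasis on a refined corpus.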
