Text dimensionality reduction for document clustering using hybrid memetic feature selection
Journal article   Peer reviewed

I. Al-Jadir, K.W. Wong, C.C. Fung and H. Xie
Multi-disciplinary Trends in Artificial Intelligence, Vol. 10607, pp. 281-289
2017

Abstract

This paper proposes a document clustering method built on a hybrid feature selection method. The hybrid method, named Memetic Algorithm-Feature Selection (MA-FS), integrates a genetic-based wrapper with a ranking filter. MA-FS is combined with the K-means and Spherical K-means (SK-means) clustering methods to perform document clustering. For comparison, another unsupervised feature selection method, Feature Selection Genetic Text Clustering (FSGATC), is used. The comparisons use two real-world criminal report document sets along with two popular benchmark datasets, Reuters and 20newsgroup. F-Micro, F-Macro and Average Distance of Document to Cluster (ADDC) measures were used for evaluation. The test results showed that MA-FS outperformed both FSGATC and clustering on the entire feature space (ALL).
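
The abstract names ADDC as an evaluation measure without defining it. The sketch below illustrates one common reading: for each cluster, average the distance from its documents to the cluster centroid, then average those values over all clusters. This definition, the choice of cosine distance, and the function names `cosine_distance`, `centroid` and `addc` are assumptions made for illustration; the paper's exact formulation may differ.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def addc(docs, labels, distance=cosine_distance):
    """Assumed ADDC: mean over clusters of the mean document-to-centroid distance."""
    clusters = {}
    for vec, label in zip(docs, labels):
        clusters.setdefault(label, []).append(vec)

    per_cluster = []
    for members in clusters.values():
        c = centroid(members)
        per_cluster.append(sum(distance(c, m) for m in members) / len(members))
    return sum(per_cluster) / len(per_cluster)


# Toy term-weight vectors (hypothetical data, not from the paper).
docs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
tight = addc(docs, [0, 0, 1, 1])   # identical documents grouped together
mixed = addc(docs, [0, 1, 0, 1])   # dissimilar documents grouped together
```

Under this assumed definition, lower values indicate more compact clusters, so `tight` comes out smaller than `mixed` on the toy data.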
