Book chapter
R U :-) or :-( ? Character- vs. Word-Gram Feature Selection for Sentiment Classification of OSN Corpora
Research and Development in Intelligent Systems XXIX, pp.207-212
Springer Verlag
2012
Abstract
Binary sentiment classification, or sentiment analysis, is the task of computing the sentiment of a document, i.e. whether it contains broadly positive or negative opinions. The topic is well studied, and the intuitive approach of using words as classification features is the basis of most techniques documented in the literature. The alternative character n-gram language model has been applied successfully to a range of NLP tasks, but its effectiveness at sentiment classification seems to be under-investigated, and results are mixed. We investigate the application of the character n-gram model to text classification of corpora from online social networks, where text is known to be rich in so-called unnatural language. This is the first such documented study, and it also introduces a novel corpus of Facebook photo comments. Although we hoped that the flexibility of the character n-gram approach would be well suited to unnatural language phenomena, we find little improvement over the baseline algorithms employing the word n-gram language model.
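To illustrate the distinction the abstract draws between word n-gram and character n-gram features, the following is a minimal sketch in Python. It is not the authors' implementation: the function names, the whitespace tokenisation, and the sample comment are all assumptions made for illustration only.

```python
def char_ngrams(text, n):
    """Return every overlapping run of n characters in text (spaces included)."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def word_ngrams(text, n):
    """Return every overlapping run of n whitespace-separated tokens in text."""
    tokens = text.split()  # assumption: naive whitespace tokenisation
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical comment in the style of OSN text; not taken from the paper's corpus.
sample = "gr8 pic :-)"

print(char_ngrams(sample, 3))  # character trigrams, including the emoticon ":-)"
print(word_ngrams(sample, 1))  # word unigrams
print(word_ngrams(sample, 2))  # word bigrams
```

Character n-grams need no tokeniser and capture fragments of emoticons and non-standard spellings directly, which is the flexibility the abstract refers to. Word n-grams depend on how the text is split into tokens.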
Details
- Title
- R U :-) or :-( ? Character- vs. Word-Gram Feature Selection for Sentiment Classification of OSN Corpora
- Authors/Creators
- B. Blamey (Author/Creator) - Cardiff Metropolitan University
- T. Crick (Author/Creator) - Cardiff Metropolitan University
- G. Oatley (Author/Creator) - Cardiff Metropolitan University
- Contributors
- M. Bramer (Editor)
- M. Petridis (Editor)
- Publication Details
- Research and Development in Intelligent Systems XXIX, pp.207-212
- Publisher
- Springer Verlag
- Identifiers
- 991005541566507891
- Copyright
- © 2012 Springer-Verlag London
- Murdoch Affiliation
- Murdoch University
- Language
- English
- Resource Type
- Book chapter