Attention-Based image captioning using DenseNet features

M.Z. Hossain; F. Sohel; M.F. Shiratuddin; H. Laga; M. Bennamoun

doi:10.1007/978-3-030-36802-9_13

Conference paper

Peer reviewed

M.Z. Hossain, F. Sohel, M.F. Shiratuddin, H. Laga and M. Bennamoun

Neural Information Processing, Vol.1143

26th International Conference, ICONIP 2019 (Sydney, NSW, 12/12/2019–15/12/2019)

2019

DOI: https://doi.org/10.1007/978-3-030-36802-9_13

Abstract

We present an attention-based image captioning method using DenseNet features. Conventional image captioning methods depend on visual information from the whole scene to generate image captions. Such a mechanism often fails to capture information about salient objects and cannot generate semantically correct captions. We consider an attention mechanism that can focus on relevant parts of the image to generate a fine-grained description of that image. We use image features from DenseNet. We conduct our experiments on the MSCOCO dataset. Our proposed method achieved scores of 53.6, 39.8, and 29.5 on the BLEU-2, BLEU-3, and BLEU-4 metrics, respectively, which are superior to those of the state-of-the-art methods.
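
The abstract does not specify how the attention weights are computed, so the following is only a minimal sketch of generic soft attention over image-region features, not the authors' method. The dot-product scoring, the function names `softmax` and `soft_attention`, and the toy two-dimensional features are all assumptions made for illustration. In a real system the region features would come from a DenseNet feature map and the query from the caption decoder's hidden state.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_attention(region_features, query):
    """Weight each image region by its dot-product similarity to the query.

    region_features: list of feature vectors, one per image region
                     (hypothetical stand-ins for CNN feature-map cells).
    query:           vector of the same dimension (hypothetical stand-in
                     for the decoder state at the current word).
    Returns (weights, context), where context is the weighted sum of regions.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context

# Toy example: three regions with 2-D features (assumed values).
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
weights, context = soft_attention(regions, query)
print(weights, context)
```

The region most similar to the query receives the largest weight, so the context vector passed to the caption generator is dominated by that region rather than by the whole scene.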

Details

Title: Attention-Based image captioning using DenseNet features
Authors/Creators: M.Z. Hossain (Author/Creator) - Murdoch University
F. Sohel (Author/Creator) - Murdoch University
M.F. Shiratuddin (Author/Creator) - Murdoch University
H. Laga (Author/Creator) - Murdoch University
M. Bennamoun (Author/Creator) - The University of Western Australia
Publication Details: Neural Information Processing, Vol.1143
Conference: 26th International Conference, ICONIP 2019 (Sydney, NSW, 12/12/2019–15/12/2019)
Identifiers: 991005544524107891
Murdoch Affiliation: Information Technology, Mathematics and Statistics
Language: English
Resource Type: Conference paper