Text to image synthesis for improved image captioning
Journal article   Open access   Peer reviewed

Md.Z. Hossain, F. Sohel, M.F. Shiratuddin, H. Laga and M. Bennamoun
IEEE Access, Vol.9, pp.64918-64928
2021
Published (Version of Record) Open Access

Abstract

Generating textual descriptions of images has been an important topic in computer vision and natural language processing. A number of deep learning techniques have been proposed for this task. These techniques use human-annotated images for training and testing, and they require a large amount of training data to perform at their full potential. Collecting human-annotated images with associated captions is expensive and time-consuming. In this paper, we propose an image captioning method that uses both real and synthetic data for training and testing the model. We use a Generative Adversarial Network (GAN) based text-to-image generator to generate synthetic images. We use an attention-based image captioning method trained on both real and synthetic images to generate the captions. We evaluate our models both qualitatively and quantitatively using commonly used evaluation metrics. Our experimental results show a twofold benefit of the proposed work: i) it demonstrates the effectiveness of image captioning for synthetic images, and ii) it further improves the quality of the generated captions for real images, understandably because we use additional images for training.
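
The data-augmentation idea in the abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: `build_training_set`, `generate_image`, and `placeholder_generator` are hypothetical names, and the placeholder stands in for a trained GAN-based text-to-image generator, which the record does not specify further.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: an image is any object (for example an array
# or a file path) and a caption is a plain string.
Pair = Tuple[object, str]


def build_training_set(
    real_pairs: List[Pair],
    generate_image: Callable[[str], object],
    synthetic_per_caption: int = 1,
) -> List[Pair]:
    """Return the real pairs followed by synthetic pairs made from the same captions.

    `generate_image` is an assumed stand-in for a text-to-image generator:
    it takes a caption and returns an image.
    """
    synthetic_pairs = [
        (generate_image(caption), caption)
        for _, caption in real_pairs
        for _ in range(synthetic_per_caption)
    ]
    return list(real_pairs) + synthetic_pairs


def placeholder_generator(caption: str) -> str:
    """Toy generator used only for this sketch; it returns a label, not an image."""
    return "synthetic:" + caption


real = [("img_a", "a dog on grass"), ("img_b", "a cat on a sofa")]
combined = build_training_set(real, placeholder_generator)
```

A captioning model would then be trained on `combined` instead of `real` alone, which is the sense in which the abstract speaks of using additional images for training.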

Details

UN Sustainable Development Goals (SDGs)

This output has contributed to the advancement of the following goals:

#3 Good Health and Well-Being

Source: InCites

InCites Highlights

These are selected metrics from InCites Benchmarking & Analytics tool, related to this output

Collaboration types
Domestic collaboration
Citation topics
4 Electrical Engineering, Electronics & Computer Science
4.17 Computer Vision & Graphics
4.17.128 Deep Visual Recognition
Web Of Science research areas
Computer Science, Information Systems
Engineering, Electrical & Electronic
Telecommunications
ESI research areas
Engineering