NNEval: Neural network based evaluation metric for image captioning

N. Sharif; L. White; M. Bennamoun; S.A.A. Shah

doi:10.1007/978-3-030-01237-3_3

Book chapter

Peer reviewed

Computer Vision – ECCV 2018, Vol.11212, pp.39-55

Springer, Cham

2018

DOI: https://doi.org/10.1007/978-3-030-01237-3_3

Abstract

The automatic evaluation of image descriptions is an intricate task, and it is highly important in the development and fine-grained analysis of captioning systems. Existing metrics for automatically evaluating image captioning systems fail to achieve a satisfactory level of correlation with human judgements at the sentence level. Moreover, these metrics, unlike humans, tend to focus on specific aspects of quality, such as n-gram overlap or semantic meaning. In this paper, we present the first learning-based metric to evaluate image captions. Our proposed framework enables us to incorporate both lexical and semantic information into a single learned metric. This results in an evaluator that takes various linguistic features into account when assessing caption quality. The experiments we performed to assess the proposed metric show improvements over the state of the art in terms of correlation with human judgements and demonstrate its superior robustness to distractions.
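
The idea of folding lexical and semantic signals into a single learned score can be illustrated with a minimal sketch. It is not the chapter's implementation: it substitutes a plain logistic regression for whatever network the authors use, and the two feature names, the toy values, and the labels are all invented for illustration.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_learned_metric(features, labels, epochs=2000, lr=0.5):
    """Fit logistic-regression weights by batch gradient descent.

    features: one vector per caption, e.g. [lexical_overlap, semantic_similarity]
    labels:   1 for a caption judged good, 0 for a caption judged poor
    (both the feature choice and the labelling scheme are assumptions)
    """
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = pred - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def score_caption(x, weights, bias):
    """Return a single quality score in (0, 1) for one feature vector."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Toy data, invented for illustration: [lexical_overlap, semantic_similarity]
TOY_FEATURES = [[0.9, 0.8], [0.7, 0.9], [0.8, 0.6],
                [0.2, 0.3], [0.1, 0.4], [0.3, 0.1]]
TOY_LABELS = [1, 1, 1, 0, 0, 0]

weights, bias = train_learned_metric(TOY_FEATURES, TOY_LABELS)
good = score_caption([0.85, 0.75], weights, bias)
bad = score_caption([0.15, 0.20], weights, bias)
```

In this sketch the weighting between the lexical and the semantic feature is learned from labelled examples rather than fixed by hand, which is the general property the abstract attributes to a learned metric.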

Details

Title: NNEval: Neural network based evaluation metric for image captioning
Authors/Creators:
  N. Sharif (Author/Creator) - The University of Western Australia
  L. White (Author/Creator) - The University of Western Australia
  M. Bennamoun (Author/Creator) - The University of Western Australia
  S.A.A. Shah (Author/Creator) - The University of Western Australia
Contributors:
  V. Ferrari (Editor)
  M. Hebert (Editor)
  C. Sminchisescu (Editor)
  Y. Weiss (Editor)
Publication Details: Computer Vision – ECCV 2018, Vol.11212, pp.39-55
Publisher: Springer, Cham
Identifiers: 991005542720307891
Murdoch Affiliation: Murdoch University
Language: English
Resource Type: Book chapter
Additional Information: Part of the Lecture Notes in Computer Science book series (LNCS, volume 11212)
