Deep Boltzmann machines for i-Vector based audio-visual person identification

M. Alam; M. Bennamoun; R. Togneri; F. Sohel

doi:10.1007/978-3-319-29451-3_50

Journal article

Peer reviewed

Lecture Notes in Computer Science, Vol.9431, pp.631-641

2015

DOI: https://doi.org/10.1007/978-3-319-29451-3_50

Abstract

We propose an approach using DBM-DNNs for i-vector based audio-visual person identification. Two Deep Boltzmann Machines, DBMspeech and DBMface, are trained without supervision using unlabeled audio and visual data from a set of background subjects. The DBMs are then used to initialize two corresponding DNNs for classification, referred to in this paper as DBM-DNNspeech and DBM-DNNface. The DBM-DNNs are discriminatively fine-tuned using back-propagation on a set of training data and evaluated on a set of test data from the target subjects. We compared their performance with cosine distance (cosDist) scoring and the state-of-the-art DBN-DNN classifier. We also tested three different configurations of the DBM-DNNs. We show that DBM-DNNs with two hidden layers and 800 units in each hidden layer achieved the best identification performance with 400-dimensional i-vectors as input. Our experiments were carried out on the challenging MOBIO dataset.
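
The cosine distance (cosDist) baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cosine_score` and `identify`, the subject labels, and the toy 3-dimensional vectors are all assumptions made for illustration (the paper uses 400-dimensional i-vectors), and the abstract does not say how enrolment vectors are formed or normalised.

```python
import math

def cosine_score(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(test_ivector, enrolled):
    """Return the enrolled identity with the highest cosine score.

    `enrolled` maps a (hypothetical) subject label to one i-vector.
    """
    return max(enrolled, key=lambda name: cosine_score(test_ivector, enrolled[name]))

# Toy stand-ins for i-vectors; real ones would have 400 dimensions.
enrolled = {
    "subject_a": [1.0, 0.0, 0.0],
    "subject_b": [0.0, 1.0, 0.0],
}
best = identify([0.9, 0.1, 0.0], enrolled)
```

Picking the highest cosine similarity is equivalent to picking the smallest cosine distance, since the distance is one minus the similarity.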

Details

Title: Deep Boltzmann machines for i-Vector based audio-visual person identification
Authors/Creators: M. Alam; M. Bennamoun; R. Togneri; F. Sohel
Publication Details: Lecture Notes in Computer Science, Vol.9431, pp.631-641
Publisher: Springer Verlag
Number of pages: 11
Identifiers: 991005541534007891
Murdoch Affiliation: School of Engineering and Information Technology
Language: English
Resource Type: Journal article
Additional Information: Book Title: Image and Video Technology: 7th Pacific Rim Symposium on Image and Video Technology (PSIVT) 2015, Auckland, New Zealand, 23-27 November 2015, Revised Selected Papers
