Journal article
Multi-Task learning for acoustic event detection using event and frame position information
IEEE Transactions on Multimedia, Vol.22(3), pp.569-578
2020
Abstract
Acoustic event detection (AED) analyzes acoustic signals to determine the sound type and to estimate the boundaries of each audio event. Approaches based on multi-label classification are commonly used to detect frame-wise event types, with a median filter applied to determine which acoustic events are occurring. However, these multi-label classifiers are trained only on the acoustic event types and ignore the position of each frame within its audio event. To address this, this paper proposes a multi-task system based on joint learning. The first task performs acoustic event type detection and the second task predicts the frame position information. By sharing representations between the two tasks, the acoustic models can generalize better than the original classifier, because averaging the noise patterns of the two tasks acts as an implicit regularizer. Experimental results on the monophonic UPC-TALP and the polyphonic TUT Sound Event datasets show that the joint learning method achieves a lower error rate and a higher F-score than the baseline AED system.
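The pipeline outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how frame position is encoded or how the two task losses are combined, so the relative-position target in `frame_positions`, the weighted sum in `joint_loss`, and every function name and default value here are assumptions chosen for illustration only.

```python
from statistics import median


def median_filter(frames, width=3):
    """Smooth a 0/1 frame-wise decision sequence with a sliding median.

    `width` is assumed to be odd; the sequence is edge-padded so the
    output has the same length as the input.
    """
    if not frames:
        return []
    half = width // 2
    padded = [frames[0]] * half + list(frames) + [frames[-1]] * half
    return [int(median(padded[i:i + width])) for i in range(len(frames))]


def events_from_frames(frames):
    """Return (onset, offset) frame indices, offset exclusive, for runs of 1s."""
    events, start = [], None
    for i, active in enumerate(frames):
        if active and start is None:
            start = i
        elif not active and start is not None:
            events.append((start, i))
            start = None
    if start is not None:
        events.append((start, len(frames)))
    return events


def frame_positions(frames):
    """Hypothetical auxiliary target for the second task.

    Each active frame gets its relative position in (0, 1] within its
    event; inactive frames get 0.0. The encoding is an assumption, not
    taken from the paper.
    """
    targets = [0.0] * len(frames)
    for onset, offset in events_from_frames(frames):
        length = offset - onset
        for i in range(onset, offset):
            targets[i] = (i - onset + 1) / length
    return targets


def joint_loss(type_loss, position_loss, weight=0.5):
    """Assumed multi-task objective: event-type loss plus weighted position loss."""
    return type_loss + weight * position_loss


# Usage: smooth noisy frame decisions for one event class, then read off boundaries.
raw = [0, 1, 0, 0, 1, 1, 0, 1, 1, 0, 0]
smoothed = median_filter(raw, width=3)
print(smoothed)
print(events_from_frames(smoothed))
print(frame_positions(smoothed))
```

In this sketch the median filter removes the isolated detection at frame 1 and fills the one-frame gap at frame 6, so a single event remains. In a real multi-task model the two losses would be computed from two output heads on a shared network, and the weight between them would need tuning.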
Details
- Title
- Multi-Task learning for acoustic event detection using event and frame position information
- Authors/Creators
- X. Xia (Author/Creator) - The University of Western Australia
- R. Togneri (Author/Creator) - The University of Western Australia
- F. Sohel (Author/Creator) - Murdoch University
- Y. Zhao (Author/Creator) - The University of Western Australia
- D. Huang (Author/Creator) - The University of Western Australia
- Publication Details
- IEEE Transactions on Multimedia, Vol.22(3), pp.569-578
- Publisher
- IEEE
- Identifiers
- 991005541658407891
- Copyright
- © 2020 IEEE.
- Murdoch Affiliation
- College of Science, Health, Engineering and Education
- Language
- English
- Resource Type
- Journal article
Metrics
InCites Highlights
These are selected metrics from the InCites Benchmarking & Analytics tool, related to this output
- Collaboration types
- Domestic collaboration
- Citation topics
- 4 Electrical Engineering, Electronics & Computer Science
- 4.174 Digital Signal Processing
- 4.174.152 Speech Recognition
- Web Of Science research areas
- Computer Science, Information Systems
- Computer Science, Software Engineering
- Telecommunications
- ESI research areas
- Computer Science