Redefining Vision Tasks: The Power of Transformers in Classification, Detection, and Segmentation

Sushma Hans; Pallavi Ranjan; Salih Ismail

doi:10.1007/978-3-031-91340-2_4

Conference proceeding

Peer reviewed

Artificial Intelligence and Speech Technology (AIST 2024), pp.42-53

Communications in Computer and Information Science

6th International Conference on Artificial Intelligence and Speech Technology (AIST2024) (Delhi, India, 13/11/2024–14/11/2024)

2025

DOI: https://doi.org/10.1007/978-3-031-91340-2_4

Keywords: Classification; Detection; Review; Segmentation; Survey; Transformers

Abstract

Advances in deep learning architectures have driven rapid progress in vision transformers. Inspired by these accomplishments, a large body of recent research employs Transformer-based frameworks in computer vision (CV). These models have proved effective in three fundamental vision tasks: image classification, object detection, and segmentation of different sensory data streams. Visual transformers have demonstrated strong performance across various benchmarks compared with state-of-the-art convolutional neural networks. In this survey, we comprehensively review recently published works, organized by these three central CV tasks. We assess and compare these transformers using diverse metrics. Additionally, we discuss open issues and challenges, along with some unexplored aspects that could strengthen visual transformer architectures.

Details

Title: Redefining Vision Tasks: The Power of Transformers in Classification, Detection, and Segmentation
Authors/Creators: Sushma Hans; Pallavi Ranjan (Murdoch University); Salih Ismail
Publication Details: Artificial Intelligence and Speech Technology (AIST 2024), pp.42-53
Conference: 6th International Conference on Artificial Intelligence and Speech Technology (AIST2024) (Delhi, India, 13/11/2024–14/11/2024)
Series: Communications in Computer and Information Science
Publisher: Springer Nature Switzerland; Cham
Identifiers: 991005788082707891
Murdoch Affiliation: Murdoch University
Language: English
Resource Type: Conference proceeding
