ETAM: Ensemble transformer with attention modules for detection of small objects

Jiangnan Zhang; Kewen Xia; Zhiyi Huang; Sijie Wang; Romoke Grace Akindele

doi:10.1016/j.eswa.2023.119997

Journal article

Peer reviewed

Expert systems with applications, Vol.224, p.119997

15/08/2023

DOI: https://doi.org/10.1016/j.eswa.2023.119997

Abstract

Ensemble learning

Small object detection

Transformer

Visual attention module

Detecting small objects is critical to many applications, such as autonomous driving and lung nodule detection. However, small object detection is challenging with low-resolution features. Therefore, the linchpin of small object detection is to design an effective encoder that can extract subtle features. In this paper, we present a powerful encoder, called the Ensemble Transformer with Attention Modules (ETAM) encoder, for extracting subtle small object features without sacrificing the capability of larger object detection. In ETAM, a Magnifying Glass (MG) module is proposed to focus on representative features of small objects. Then, the Quadruple Attention (QA) is designed to enrich the small object features with width and height in addition to channel and position. To accommodate both small and large objects, we use ensemble learning in our ETAM encoder, which has two branches. Experimental results on PASCAL VOC, MS-COCO, VisDrone2019, and LIDC-IDRI show that ETAM significantly improves small object detection. With ETAM, the mAP for small objects is improved up to 91.7% across the four datasets.

• Transformer's potential for small object detection is demonstrated.
• The MG can forecast the broad positions of small objects on shallow features.
• The QA extends the attention to two extra dimensions, height and width.
• ETAM has two branches to adapt to both small and larger object detection.
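The abstract reports mAP for small objects but this record does not say how "small" is defined on each dataset. As a point of reference only, the MS-COCO evaluation convention buckets objects by pixel area, with thresholds at 32² and 96². The sketch below applies those thresholds to a bounding box; the function name `coco_size_bucket`, the use of box area instead of segmentation area, and the handling of values exactly on a threshold are assumptions made for illustration, not details taken from the paper.

```python
# Minimal sketch of MS-COCO-style size bucketing (not the paper's code).
# Assumption: box area stands in for the object area that COCO measures.

SMALL_MAX_AREA = 32 ** 2   # 1024 px^2, COCO "small" upper threshold
MEDIUM_MAX_AREA = 96 ** 2  # 9216 px^2, COCO "medium" upper threshold


def coco_size_bucket(box):
    """Return 'small', 'medium', or 'large' for box = (x, y, width, height)."""
    _, _, width, height = box
    area = width * height
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


if __name__ == "__main__":
    for box in [(0, 0, 10, 10), (0, 0, 50, 50), (0, 0, 100, 100)]:
        print(box, coco_size_bucket(box))
```

Under this convention, a 10×10 box counts as small and a 100×100 box as large. Other datasets named in the abstract may use different criteria.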

Details

Record Identifier: 9926495349001891
Title: ETAM: Ensemble transformer with attention modules for detection of small objects
Creators: Jiangnan Zhang; Kewen Xia; Zhiyi Huang; Sijie Wang; Romoke Grace Akindele
Publication Details: Expert systems with applications, Vol.224, p.119997
Academic Unit: Computer Science
Publisher: Elsevier Ltd
Date published ; e-published: 15/08/2023
Language: English
Resource Type: Journal article
