Citation: Xi, X.; Wang, J.; Li, F.; Li, D. IRSDet: Infrared Small-Object Detection Network Based on Sparse-Skip Connection and Guide Maps. Electronics 2022, 11, 2154. https://doi.org/10.3390/electronics11142154
Academic Editor: Dah-Jye Lee
Received: 9 June 2022
Accepted: 8 July 2022
Published: 9 July 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
IRSDet: Infrared Small-Object Detection Network Based on Sparse-Skip Connection and Guide Maps
Xiaoli Xi 1,2, Jinxin Wang 1,2, Fang Li 1,2 and Dongmei Li 1,2,*
1 Laboratory of Optoelectronic System, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China; xixiaoli@semi.ac.cn (X.X.); wangjx@semi.ac.cn (J.W.); lifang@semi.ac.cn (F.L.)
2 College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 100049, China
* Correspondence: lidongmei@semi.ac.cn
Abstract: Detecting small objects in infrared images remains a challenge because most of them lack shape and texture. In this study, we propose an infrared small-object detection method to improve the detection of thermal objects in complex scenarios. First, a sparse-skip connection block is proposed to enhance the response of small infrared objects and suppress the background response; this block is used to construct the backbone of the detection model. Second, a region attention module is designed to emphasize the features of small infrared objects and suppress background regions. Finally, a batch-averaged biased classification loss function is designed to improve the accuracy of the detection model. Experimental results show that the proposed framework significantly increases precision, recall, and F1-score and outperforms current advanced small-object detection models on infrared small-object detection under complex backgrounds. The insights gained from this study may provide new ideas for infrared small-object detection and tracking.
Keywords: infrared image; small object; object detection; SSD
1. Introduction
With the development of infrared image-sensor technology, infrared spectral imaging has provided new information for object-detection tasks [1,2]. Currently, object detection based on infrared images is one of the best methods for detecting remote thermal objects because the infrared features of objects are more noticeable than their visible features [3]. In remote detection tasks, most infrared objects are considered small objects because they occupy few pixels and have a low signal-to-clutter ratio (SCR), unclear contours, and sparse texture features. Because of these characteristics, infrared small-object detection remains a significant challenge.
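The text does not define the SCR. As a rough illustration, the sketch below assumes one commonly used definition, SCR = |mu_t - mu_b| / sigma_b, where mu_t is the mean target intensity and mu_b and sigma_b are the mean and standard deviation of a local background ring. The function name, the box convention, and the `margin` parameter are hypothetical and not taken from the paper.

```python
from statistics import mean, pstdev


def signal_to_clutter_ratio(image, target_box, margin=2):
    """Return |mu_t - mu_b| / sigma_b for a target and its background ring.

    image: list of rows of pixel intensities (assumed grayscale).
    target_box: (top, left, bottom, right), with bottom/right exclusive.
    margin: ring width in pixels around the box (hypothetical parameter).
    """
    top, left, bottom, right = target_box
    height, width = len(image), len(image[0])
    target, background = [], []
    # Visit the box plus its ring, clipped to the image borders.
    for r in range(max(0, top - margin), min(height, bottom + margin)):
        for c in range(max(0, left - margin), min(width, right + margin)):
            inside = top <= r < bottom and left <= c < right
            (target if inside else background).append(image[r][c])
    if not target or not background:
        raise ValueError("target box or background ring is empty")
    contrast = abs(mean(target) - mean(background))
    sigma_b = pstdev(background)
    if sigma_b == 0:
        # Perfectly flat background: no clutter to divide by.
        return float("inf") if contrast > 0 else 0.0
    return contrast / sigma_b
```

Under this definition, a target spanning only a few pixels with little contrast against a textured background yields a small SCR, which matches the difficulty described above.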
Convolutional neural networks (CNNs) provide a broader perspective on object detection. Compared to traditional methods, CNN-based object detection methods can adaptively learn object locations and semantic information from sample images, resulting in higher accuracy and robustness. CNN-based object detection models include two-stage and one-stage models. The former, such as RCNN [4], divide positioning and classification into two steps; the resulting slow inference speed makes them unsuitable for detection with strict real-time requirements. The latter, such as YOLO [5] and SSD [6], have a fast inference speed and good accuracy.
Some optimized CNN-based models have a good detection capacity for small objects. ResNet [7], DenseNet [8], and ResNeXt [9] use shortcut connections that transfer information by skipping one or more layers to address the degradation problem. This helps reduce the feature loss of small objects during information transmission. In DetNet [10], downsampling blocks in deep layers are eliminated to preserve the resolution of high-level feature maps, which can improve the positioning accuracy of small objects. DSSD [11], RSSD [12], and FSSD [13] propose specific multiscale feature fusion methods to