Citation: Shan, D.; Xu, Y.; Zhang, P.; Wang, X.; He, D.; Zhang, C.; Zhou, M.; Yu, G. DPSSD: Dual-Path Single-Shot Detector. Sensors 2022, 22, 4616. https://doi.org/10.3390/s22124616
Academic Editor: Marcin Kowalski
Received: 9 May 2022
Accepted: 13 June 2022
Published: 18 June 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
DPSSD: Dual-Path Single-Shot Detector
Dongri Shan 1,*, Yalu Xu 1, Peng Zhang 2, Xiaofang Wang 2, Dongmei He 2, Chenglong Zhang 1, Maohui Zhou 1 and Guoqi Yu 1
1 School of Mechanical Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250300, China; 1043119034@stu.qlu.edu.cn (Y.X.); 1043119003@stu.qlu.edu.cn (C.Z.); 1043119049@stu.qlu.edu.cn (M.Z.); 1043119037@stu.qlu.edu.cn (G.Y.)
2 School of Information and Automation Engineering, Qilu University of Technology (Shandong Academy of Sciences), Jinan 250300, China; zp@qlu.edu.cn (P.Z.); wxf2012@stu.xjtu.edu.cn (X.W.); hedm@sdas.org (D.H.)
* Correspondence: shandongri@qlu.edu.cn; Tel.: +86-138-6406-5008
Abstract: Object detection is one of the most important and challenging branches of computer vision and is widely used in everyday applications such as security surveillance and autonomous driving. To extract richer feature information for the object detection task and to better handle multi-scale objects, we propose a novel dual-path multi-scale object detection paradigm and, based on it, design a single-stage general object detection algorithm called the Dual-Path Single-Shot Detector (DPSSD). The dual path, i.e., the residual path and the concatenation path, ensures that shallow features can be more easily utilized to improve detection accuracy. Our improved dual-path network is more adaptable to multi-scale object detection tasks, and we combine it with a feature fusion module to form a multi-scale feature learning paradigm called the “Dual-Path Feature Pyramid”. We trained the models on the PASCAL VOC and COCO datasets with 320-pixel and 512-pixel inputs, respectively, and performed inference experiments to validate the structures in the neural network. The experimental results show that our algorithm has an advantage over anchor-based single-stage object detection algorithms and achieves an advanced level of average accuracy. Researchers can replicate the results reported in this paper.
Keywords: convolution neural networks; object detection; single-stage; multi-scale
1. Introduction
After the success of deep convolutional neural networks (DCNNs) [1] in image classification, object detection algorithms also adopted deep-learning technology and have achieved significant progress [2,3]. These deep-learning-based algorithms perform much better than traditional algorithms because manually designed features are replaced with feature representations computed by convolutional neural networks. However, multi-scale feature learning remains a critical problem for deep-learning-based detection algorithms. To address this problem and improve the detection performance of anchor-based single-stage multi-scale detectors, we conducted a relevant literature search and experiments.
In general, objects appear in complex environments and vary widely in scale; for example, in applications such as pedestrian detection, face detection and autonomous driving, the algorithm has to be robust to changes in the scale of the object [4]. It is critical to learn robust and discriminative features to obtain good detection performance. There are four main paradigms for addressing the multi-scale feature learning problem: the image pyramid, the prediction pyramid, integrated features and the feature pyramid (Figure 1). SNIP [5] uses the image pyramid to solve the multi-scale problem, where each layer is responsible for a certain range of scales (Figure 1a). In this approach, the same sample must be rescaled to different sizes and fed to the network repeatedly during training, which results in many redundant calculations. By fusing the shallow features rich