Citation: Zhang, M.; Gao, F.; Yang, W.; Zhang, H. Wildlife Object Detection Method Applying Segmentation Gradient Flow and Feature Dimensionality Reduction. Electronics 2023, 12, 377. https://doi.org/10.3390/electronics12020377
Academic Editor: Silvia Liberata Ullo
Received: 26 November 2022
Revised: 5 January 2023
Accepted: 6 January 2023
Published: 11 January 2023
Copyright: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Wildlife Object Detection Method Applying Segmentation Gradient Flow and Feature Dimensionality Reduction
Mingyu Zhang, Fei Gao *, Wuping Yang and Haoran Zhang
School of Science, Wuhan University of Technology, Wuhan 430070, China
* Correspondence: gaof@whut.edu.cn; Tel.: +86-18971097697
Abstract:
This work proposes an improved algorithm, based on YOLOv5s, for detecting animals in natural environments. It addresses the low detection accuracy and slow detection speed that arise when large animals are detected and classified automatically in such settings. First, to increase the detection speed of the model, the algorithm modifies the spatial pyramid pooling (SPP) module by replacing the parallel connection of the original max pooling layers with a series connection. It then expands the model's receptive field on the dataset used in this paper and strengthens the feature fusion network by stacking the feature pyramid network structure as a whole. Second, it introduces the GSConv module, which combines standard convolution, depthwise separable convolution, and channel shuffling to reduce network parameters and computation, making the model lightweight and easier to deploy to endpoint devices. In addition, a GS bottleneck replaces the Bottleneck module in C3; it divides the input feature map into two branches and assigns different weights to them. The two branches are then concatenated along the channel dimension, which enhances the model's ability to express non-linear functions and addresses the vanishing gradient problem. Wildlife images are obtained from the OpenImages public dataset and from real-life photographs. The experimental results show that, compared to the original algorithm, the improved YOLOv5s algorithm proposed in this paper reduces the computational cost of the model while improving both detection accuracy and speed, and that it is well suited to the real-time detection of animals in natural environments.
Keywords:
animal recognition; feature fusion networks; YOLOv5s; segmentation gradient flow; GSConv
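The SPP change described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the kernel sizes 5, 9, and 13 used by the stock YOLOv5 SPP module (the abstract does not state them), stride-1 pooling, and padding that ignores out-of-bounds cells. The helper names `max_pool_same`, `spp_parallel`, and `spp_serial` are hypothetical. Under these assumptions, three 5 × 5 max pooling layers connected in series produce the same feature maps as the three parallel layers.

```python
def max_pool_same(grid, k):
    """Stride-1 max pooling with a k x k window on a 2D list.

    Out-of-bounds cells are ignored (equivalent to -inf padding),
    so the output has the same height and width as the input.
    """
    r = k // 2
    h, w = len(grid), len(grid[0])
    return [
        [
            max(
                grid[y][x]
                for y in range(max(0, i - r), min(h, i + r + 1))
                for x in range(max(0, j - r), min(w, j + r + 1))
            )
            for j in range(w)
        ]
        for i in range(h)
    ]


def spp_parallel(grid):
    # Parallel connection: three independent pools applied to the same input
    # (kernel sizes 5, 9, 13 are an assumption taken from stock YOLOv5).
    return [grid] + [max_pool_same(grid, k) for k in (5, 9, 13)]


def spp_serial(grid):
    # Series connection: each 5 x 5 pool consumes the previous pool's output.
    p1 = max_pool_same(grid, 5)
    p2 = max_pool_same(p1, 5)
    p3 = max_pool_same(p2, 5)
    return [grid, p1, p2, p3]


# Small deterministic single-channel "feature map" used only for illustration.
grid = [[(i * 7 + j * 13) % 17 for j in range(12)] for i in range(10)]
print(spp_parallel(grid) == spp_serial(grid))
```

With a naive pooling implementation, the parallel form reads 25 + 81 + 169 = 275 window cells per interior output position, whereas the serial form reads 3 × 25 = 75, which is consistent with the speed motivation given in the abstract.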
1. Introduction
Target detection and recognition of animals have grown in importance as computer vision technology has progressed. However, conventional approaches to these problems do not yet produce satisfactory results, and deep learning has emerged as a breakthrough technology in this area. In recent centuries, the expansion of human society into the natural environment has destroyed wildlife habitats, and the advent of the industrial age and rapid population growth have severely damaged the environment. Some species have already become extinct as a result. Applying deep learning target detection algorithms to detect and identify animals therefore provides a new method for wildlife conservation and ecological study [1].
Convolutional neural networks (CNNs) are a class of feedforward neural networks (FNNs) with convolutional computation and a deep structure, and they are among the representative algorithms of deep learning [2]. With the development of artificial intelligence and deep learning, applying convolutional neural networks to wildlife detection and identification is of great significance for wildlife conservation, as they extract the features of surrounding targets in real time. Among the algorithms for target feature extraction, the faster region-based convolutional neural network (Faster R-CNN) algorithm [3], the single shot multibox detector (SSD) algorithm [4,5], and the you only look once (YOLO) algorithm [6–8] have successfully applied deep learning to target extraction and target detection; the YOLO algorithm is trained and detected in a separate network, and regression and classification