Article
The Whale Optimization Algorithm Approach for Deep
Neural Networks
Andrzej Brodzicki, Michał Piekarski * and Joanna Jaworek-Korjakowska
Citation: Brodzicki, A.; Piekarski, M.; Jaworek-Korjakowska, J. The Whale Optimization Algorithm Approach for Deep Neural Networks. Sensors 2021, 21, 8003. https://doi.org/10.3390/s21238003
Academic Editor: Biswanath Samanta
Received: 7 November 2021
Accepted: 28 November 2021
Published: 30 November 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Department of Automatic Control and Robotics, AGH University of Science and Technology,
30-059 Cracow, Poland; brodzicki@agh.edu.pl (A.B.); jaworek@agh.edu.pl (J.J.-K.)
* Correspondence: piekarski@agh.edu.pl; Tel.: +48-12-617-5065
Abstract:
One of the biggest challenges in the field of deep learning is the parameter selection and optimization process. In recent years, different algorithms, including bio-inspired solutions, have been proposed to solve this problem; however, many challenges remain, including local minima, saddle points, and vanishing gradients. In this paper, we introduce the Whale Optimization Algorithm (WOA), based on the swarm foraging behavior of humpback whales, to optimize neural network hyperparameters. To the best of our knowledge, this is the first attempt to use the Whale Optimization Algorithm for hyperparameter optimization. After a detailed description of the WOA, we formulate and explain its application in deep learning, present the implementation, and compare the proposed algorithm with other well-known algorithms, including the widely used Grid and Random Search methods. Additionally, we have added a third-dimension feature analysis to the original WOA so that it can utilize a 3D search space (3D-WOA). Simulations show that the proposed algorithm can be successfully used for hyperparameter optimization, achieving accuracies of 89.85% and 80.60% on the Fashion MNIST and Reuters datasets, respectively.
Keywords:
whale optimization algorithm; optimization; deep learning; neural networks;
hyperparameters
1. Introduction
Deep learning is currently one of the most popular and rapidly developing branches of artificial intelligence. It is mostly based on advanced and sophisticated neural network architectures, which are widely used for tasks including image segmentation, classification, signal analysis, data investigation and modelling [1,2]. One of the most challenging parts of deploying a deep neural network architecture is the training process, which determines the final score of the model but is inefficient because of the very long training time required. Obtaining the most accurate deep neural network (DNN) within a reasonable run-time is still a huge challenge. Furthermore, training the network requires setting several hyperparameters, such as the number of epochs, batch size, learning rate or optimizer, which creates another non-trivial optimization problem, as it is essentially an optimization of an optimization.
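The nested structure of this problem can be illustrated with a minimal sketch. The search space, the value ranges, and the `validation_score` function below are hypothetical and chosen purely for illustration; in practice the inner call would train a network and return its validation accuracy. The two outer loops correspond to the Grid Search and Random Search baselines in their simplest form.

```python
import itertools
import math
import random

# Hypothetical search space; the values are illustrative assumptions,
# not the ranges used in the paper.
SEARCH_SPACE = {
    "epochs": [5, 10, 20],
    "batch_size": [32, 64, 128],
    "learning_rate": [1e-4, 1e-3, 1e-2],
}


def validation_score(epochs, batch_size, learning_rate):
    """Toy stand-in for the inner optimization (training a network).

    A real implementation would train a model with these hyperparameters
    and return its validation accuracy. This synthetic function only
    mimics a smooth response surface so the sketch runs instantly.
    """
    lr_term = math.exp(-(math.log10(learning_rate) + 3.0) ** 2)
    epoch_term = 1.0 - math.exp(-epochs / 8.0)
    batch_term = 1.0 - abs(batch_size - 64) / 256.0
    return lr_term * epoch_term * batch_term


def grid_search(space):
    """Outer optimization: evaluate every combination in the grid."""
    names = list(space)
    best = None
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = validation_score(**params)
        if best is None or score > best[1]:
            best = (params, score)
    return best


def random_search(space, n_trials, seed=0):
    """Outer optimization: evaluate n_trials randomly sampled combinations."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = validation_score(**params)
        if best is None or score > best[1]:
            best = (params, score)
    return best


if __name__ == "__main__":
    print("grid search:  ", grid_search(SEARCH_SPACE))
    print("random search:", random_search(SEARCH_SPACE, n_trials=10))
```

Each call to `validation_score` stands for a full training run, so the cost of the outer search grows with the number of evaluated combinations. This is why search strategies that need fewer evaluations than an exhaustive grid are of interest.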
Meta-heuristic algorithms such as artificial bee colony, particle swarm optimization, genetic algorithm and differential evolution have great potential for optimizing both network architectures and training parameters [3]. They have already been applied in many fields where finding an optimal solution is beneficial, such as power systems, applied mathematics, IoT, cryptography, cloud computing and automatic control (e.g., tuning controllers) [4]. Therefore, we decided to use this approach in deep learning to optimize neural network hyperparameters. In particular, artificial intelligence (deep neural networks first and foremost), and thus optimization methods, is used in many sensor fusion algorithms for object detection and classification in fields such as Autonomous Vehicles (AV) or Unmanned Aerial Vehicles (UAVs) [5–7]. The key task in such systems is to train the deep architecture in such a way that from the data from many sensors (such as cameras,