Citation: Junaid, M.; Arslan, S.; Lee, T.; Kim, H. Optimal Architecture of Floating-Point Arithmetic for Neural Network Training Processors. Sensors 2022, 22, 1230. https://doi.org/10.3390/s22031230
Academic Editors: Yangquan Chen, Subhas Mukhopadhyay, Nunzio Cennamo, M. Jamal Deen, Junseop Lee and Simone Morais
Received: 24 December 2021
Accepted: 3 February 2022
Published: 6 February 2022
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Optimal Architecture of Floating-Point Arithmetic for Neural Network Training Processors
Muhammad Junaid 1, Saad Arslan 2, TaeGeon Lee 1 and HyungWon Kim 1,*
1 Department of Electronics, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju 28644, Korea; junaid@chungbuk.ac.kr (M.J.); tglee2@chungbuk.ac.kr (T.L.)
2 Department of Electrical and Computer Engineering, COMSATS University Islamabad, Park Road, Tarlai Kalan, Islamabad 45550, Pakistan; saad.arslan@comsats.edu.pk
* Correspondence: hwkim@chungbuk.ac.kr
Abstract: The convergence of artificial intelligence (AI) is one of the critical technologies in the recent fourth industrial revolution. AIoT (Artificial Intelligence Internet of Things) is expected to be a solution that aids rapid and secure data processing. While the success of AIoT demands low-power neural network processors, most recent research has focused on accelerator designs for inference only. The growing interest in self-supervised and semi-supervised learning now calls for processors that offload the training process in addition to inference. Training with high accuracy goals requires floating-point operators, yet higher-precision floating-point arithmetic architectures in neural networks tend to consume a large area and much energy. Consequently, an energy-efficient, compact accelerator is required. The proposed architecture incorporates training in 32-bit, 24-bit, 16-bit, and mixed precisions to find the optimal floating-point format for low-power, small-area edge devices. The proposed accelerator engines have been verified on an FPGA for both inference and training on the MNIST image dataset. The combination of a 24-bit custom FP format with 16-bit Brain FP achieves an accuracy of more than 93%. An ASIC implementation of this optimized mixed-precision accelerator in TSMC 65 nm technology occupies an active area of 1.036 × 1.036 mm² and consumes 4.445 µJ per training of one image. Compared with the 32-bit architecture, the size and energy are reduced by 4.7 and 3.91 times, respectively. Therefore, a CNN structure using floating-point numbers with an optimized data path will significantly contribute to the AIoT field, which requires small area, low energy, and high accuracy.
Keywords: floating-points; IEEE 754; convolutional neural network (CNN); MNIST dataset
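The precision formats named in the abstract can be illustrated in software. The sketch below, a simplification assuming truncation-based conversion (real hardware typically rounds to nearest even), shows how a float32 value maps to 16-bit Brain FP (bfloat16: 1 sign, 8 exponent, 7 mantissa bits) and to a hypothetical 24-bit format that keeps float32's 8-bit exponent with a shortened 15-bit mantissa. The function names and the exact bit split of the 24-bit format are illustrative assumptions, not the paper's precise design.

```python
import struct

def to_bfloat16(x: float) -> float:
    """Truncate an IEEE 754 float32 to Brain FP16 (bfloat16).

    bfloat16 keeps float32's sign bit and all 8 exponent bits but
    only 7 mantissa bits, so conversion amounts to zeroing the low
    16 bits of the float32 encoding (truncation; hardware usually
    rounds to nearest even instead).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def to_custom_fp24(x: float) -> float:
    """Hypothetical 24-bit format (illustrative assumption): zero the
    low 8 bits of float32, keeping 1 sign, 8 exponent, and 15 mantissa
    bits -- the same dynamic range as float32 with reduced precision.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFF00))[0]

if __name__ == "__main__":
    x = 3.14159
    print(to_bfloat16(x))    # coarse: only 7 mantissa bits survive
    print(to_custom_fp24(x)) # much closer to the float32 value
```

Because both formats retain the full 8-bit exponent, they preserve float32's dynamic range, which is why such formats are attractive for training, where gradient magnitudes vary widely.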
1. Introduction
The Internet of Things (IoT) is a core technology leading the fourth industrial revolution through the convergence and integration of various advanced technologies. Recently, the convergence of artificial intelligence (AI) has been expected to provide a solution that helps IoT devices process data quickly and safely. The development of AIoT (Artificial Intelligence Internet of Things), the combination of AI and IoT, is expected to improve and broaden the performance of IoT products [1–3].
AIoT is the latest research topic among AI semiconductors [4,5]. Before AIoT emerged as a topic, a wide range of research had been conducted on implementing AI, that is, neural networks that mimic human neurons. As research on AIoT advances, the challenges of resource-constrained IoT devices are also emerging. A survey of such challenges is presented in [6], which summarizes potential solutions to the challenges of communication overhead, convergence guarantees, and energy reduction. Most studies on neural network accelerators have focused on the architecture and circuit structure of the forward direction, which determines the accuracy on input data such as images [7,8]. However, to mimic the neural network of humans or animals as much as