Citation: Junaid, M.; Arslan, S.; Lee, T.; Kim, H. Optimal Architecture of Floating-Point Arithmetic for Neural Network Training Processors. Sensors 2022, 22, 1230. https://doi.org/10.3390/s22031230
Academic Editors: Yangquan Chen, Subhas Mukhopadhyay, Nunzio Cennamo, M. Jamal Deen, Junseop Lee and Simone Morais
Received: 24 December 2021
Accepted: 3 February 2022
Published: 6 February 2022
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Optimal Architecture of Floating-Point Arithmetic for Neural Network Training Processors
Muhammad Junaid 1, Saad Arslan 2, TaeGeon Lee 1 and HyungWon Kim 1,*
1 Department of Electronics, College of Electrical and Computer Engineering, Chungbuk National University, Cheongju 28644, Korea; junaid@chungbuk.ac.kr (M.J.); tglee2@chungbuk.ac.kr (T.L.)
2 Department of Electrical and Computer Engineering, COMSATS University Islamabad, Park Road, Tarlai Kalan, Islamabad 45550, Pakistan; saad.arslan@comsats.edu.pk
* Correspondence: hwkim@chungbuk.ac.kr
Abstract: The convergence of artificial intelligence (AI) is one of the critical technologies in the recent fourth industrial revolution. AIoT (Artificial Intelligence Internet of Things) is expected to be a solution that aids rapid and secure data processing. While the success of AIoT demands low-power neural network processors, most recent research has focused on accelerator designs for inference only. The growing interest in self-supervised and semi-supervised learning now calls for processors that offload the training process in addition to inference. Training with high accuracy goals requires floating-point operators, yet higher-precision floating-point arithmetic architectures in neural networks tend to consume a large area and much energy. Consequently, an energy-efficient, compact accelerator is required. The proposed architecture incorporates training in 32-bit, 24-bit, 16-bit, and mixed precisions to find the optimal floating-point format for low-power, small-area edge devices. The proposed accelerator engines have been verified on an FPGA for both inference and training on the MNIST image dataset. The combination of a 24-bit custom FP format with 16-bit Brain FP achieves an accuracy of more than 93%. An ASIC implementation of this optimized mixed-precision accelerator in TSMC 65 nm technology occupies an active area of 1.036 × 1.036 mm² and consumes 4.445 µJ per training of one image. Compared with the 32-bit architecture, the size and energy are reduced by 4.7 and 3.91 times, respectively. Therefore, a CNN structure using floating-point numbers with an optimized data path will significantly contribute to the AIoT field, which requires small area, low energy, and high accuracy.
Keywords: floating-points; IEEE 754; convolutional neural network (CNN); MNIST dataset
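The precision formats named in the abstract can be illustrated in software. The sketch below, a simplification assuming truncation-based conversion (real hardware typically rounds to nearest even), shows how a float32 value maps to 16-bit Brain FP (bfloat16: 1 sign, 8 exponent, 7 mantissa bits) and to a hypothetical 24-bit format that keeps float32's 8-bit exponent with a shortened 15-bit mantissa. The function names and the exact bit split of the 24-bit format are illustrative assumptions, not the paper's precise design.

```python
import struct

def to_bfloat16(x: float) -> float:
    """Truncate an IEEE 754 float32 to Brain FP16 (bfloat16).

    bfloat16 keeps float32's sign bit and all 8 exponent bits but
    only 7 mantissa bits, so conversion amounts to zeroing the low
    16 bits of the float32 encoding (truncation; hardware usually
    rounds to nearest even instead).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

def to_custom_fp24(x: float) -> float:
    """Hypothetical 24-bit format (illustrative assumption): zero the
    low 8 bits of float32, keeping 1 sign, 8 exponent, and 15 mantissa
    bits -- the same dynamic range as float32 with reduced precision.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFFFF00))[0]

if __name__ == "__main__":
    x = 3.14159
    print(to_bfloat16(x))    # coarse: only 7 mantissa bits survive
    print(to_custom_fp24(x)) # much closer to the float32 value
```

Because both formats retain the full 8-bit exponent, they preserve float32's dynamic range, which is why such formats are attractive for training, where gradient magnitudes vary widely.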
1. Introduction
The Internet of Things (IoT) is a core technology leading the fourth industrial revolution through the convergence and integration of various advanced technologies. Recently, the convergence of artificial intelligence (AI) has been expected to provide a solution that helps IoT devices process data quickly and safely. The development of AIoT (Artificial Intelligence Internet of Things), the combination of AI and IoT, is expected to improve and broaden the performance of IoT products [1–3].
AIoT is the latest research topic among AI semiconductors [4,5]. Before AIoT emerged as a topic, a wide range of research had been conducted on implementing AI, that is, neural networks that mimic human neurons. As research on AIoT advances, the challenges of resource-constrained IoT devices are also emerging. A survey of such challenges is presented in [6], which summarizes potential solutions to the challenges of communication overhead, convergence guarantees, and energy reduction. Most studies on neural network accelerators have focused on the architecture and circuit structure of the forward direction, which determines the accuracy on input data such as images [7,8]. However, to mimic the neural network of humans or animals as much as