electronics

Article

A Hybrid Vision Processing Unit with a Pipelined Workflow for Convolutional Neural Network Accelerating and Image Signal Processing

Peng Liu and Yan Song

Citation: Liu, P.; Song, Y. A Hybrid Vision Processing Unit with a Pipelined Workflow for Convolutional Neural Network Accelerating and Image Signal Processing. Electronics 2021, 10, 2989. https://doi.org/10.3390/electronics10232989

Academic Editors: Nunzio Cennamo, YangQuan Chen, Subhas Mukhopadhyay and Simone Morais

Received: 11 November 2021

Accepted: 30 November 2021

Published: 1 December 2021

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

School of Microelectronics, Tianjin University, Tianjin 300072, China; zationlue@tju.edu.cn

Institute of Marine Science and Technology, Shandong University, Qingdao 266237, China

* Correspondence: ysong@sdu.edu.cn

Abstract: Vision processing chips have been widely used in image processing and recognition tasks. They are conventionally designed around image signal processing (ISP) units connected directly to the sensors. In recent years, convolutional neural networks (CNNs) have become the dominant tools for many state-of-the-art vision processing tasks. However, a conventional vision processing unit (VPU) cannot process CNNs at high speed. Conversely, CNN processing units cannot process the RAW images from the sensors directly, so a separate ISP unit is required. This makes a vision system inefficient, with a lot of data transmission and redundant hardware resources. Additionally, many CNN processing units offer little flexibility across different CNN operations. To solve these problems, this paper proposes an efficient vision processing unit based on a hybrid processing element array for both CNN acceleration and ISP. Resources are highly shared in this VPU, and a pipelined workflow is introduced to accelerate vision tasks. We implement the proposed VPU on a Field-Programmable Gate Array (FPGA) platform and test various vision tasks on it. The results show that this VPU achieves high efficiency for both CNN processing and ISP and significantly reduces energy consumption for vision tasks consisting of CNNs and ISP. For various CNN tasks, it maintains an average multiply accumulator utilization of over 94% and achieves a performance of 163.2 GOPS at a frequency of 200 MHz.
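As a rough reading aid for the throughput figure above, the following Python sketch converts 163.2 GOPS at 200 MHz into operations per clock cycle. The abstract does not say how operations are counted or whether the figure is a peak or a sustained value, so the convention of two operations per multiply-accumulate (one multiply plus one add) and the helper names `ops_per_cycle` and `implied_mac_count` are assumptions made here for illustration only.

```python
def ops_per_cycle(throughput_gops: float, clock_mhz: float) -> float:
    """Operations completed per clock cycle implied by a throughput figure."""
    # GOPS -> operations per second, MHz -> cycles per second.
    return (throughput_gops * 1e9) / (clock_mhz * 1e6)


def implied_mac_count(throughput_gops: float, clock_mhz: float,
                      ops_per_mac: int = 2) -> float:
    """MAC-equivalents per cycle, ASSUMING each MAC counts as `ops_per_mac` ops.

    The 2-ops-per-MAC convention is a common one, not something the
    abstract states.
    """
    return ops_per_cycle(throughput_gops, clock_mhz) / ops_per_mac


# Figures quoted in the abstract.
reported_gops = 163.2
clock_mhz = 200.0

per_cycle = ops_per_cycle(reported_gops, clock_mhz)
macs = implied_mac_count(reported_gops, clock_mhz)

print(f"operations per cycle: {per_cycle:.1f}")
print(f"MAC-equivalents per cycle (assumed 2 ops/MAC): {macs:.1f}")
```

Under these assumptions the figure works out to roughly 816 operations per cycle. How that relates to the physical size of the processing element array depends on details the abstract does not give.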

Keywords: vision processing unit; neural network processing unit; image signal processing unit; image recognition

1. Introduction

Vision processing chips have proven to be highly efficient for computer vision tasks by integrating the image sensor and vision processing unit (VPU) together in recent works. Most of them utilize a Single-Instruction-Multiple-Data (SIMD) array of processing elements (PEs) connected directly to the sensor. Consequently, they can eliminate the pixel transmission bottleneck and execute vision tasks in parallel.

The vision tasks mainly consist of image signal processing (ISP) algorithms and recognition algorithms, as illustrated in Figure 1. All the algorithms are performed on the PE array in the VPU. On conventional vision chips, recognition algorithms including Speeded-Up Robust Features (SURF), Scale-Invariant Feature Transform (SIFT) and Features from Accelerated Segment Test (FAST) are usually applied. Recently, artificial neural networks have shown great performance on computer vision tasks [7–10]. Therefore, works [1,11] proposed VPUs that try to exploit the conventional PE array for self-organizing map (SOM) neural networks. However, these conventional architectures are not efficient for modern neural networks. They do not contain the multiply accumulators (MACs), which are essential to accelerate neural network processing [12–14]. For instance, the convolutional neural networks (CNNs) are very im-

