SiamMixer: A Lightweight and Hardware-Friendly Visual Object-Tracking Network (2022)

Citation: Cheng, L.; Zheng, X.; Zhao, M.; Dou, R.; Yu, S.; Wu, N.; Liu, L. SiamMixer: A Lightweight and Hardware-Friendly Visual Object-Tracking Network. Sensors 2022, 22, 1585. https://doi.org/10.3390/s22041585

Academic Editors: Yangquan Chen, Subhas Mukhopadhyay, Nunzio Cennamo, M. Jamal Deen, Junseop Lee and Simone Morais

Received: 24 January 2022

Accepted: 14 February 2022

Published: 18 February 2022

Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

sensors

Article

SiamMixer: A Lightweight and Hardware-Friendly Visual Object-Tracking Network

Li Cheng 1,2, Xuemin Zheng 1,2, Mingxin Zhao 1,2, Runjiang Dou *, Shuangming Yu, Nanjian Wu 1,2,3 and Liyuan Liu

State Key Laboratory of Superlattices and Microstructures, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China; chengli17@semi.ac.cn (L.C.); zxm16@semi.ac.cn (X.Z.); zhaomingxin17@semi.ac.cn (M.Z.); yushuangming@semi.ac.cn (S.Y.); nanjian@red.semi.ac.cn (N.W.); liuly@semi.ac.cn (L.L.)

Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, Beijing 100049, China

The Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Beijing 100083, China

* Correspondence: dourj@semi.ac.cn

Abstract: Siamese networks have been extensively studied in recent years. Most of the previous research focuses on improving accuracy, while merely a few recognize the necessity of reducing parameter redundancy and computation load. Even less work has been done to optimize the runtime memory cost when designing networks, making the Siamese-network-based tracker difficult to deploy on edge devices. In this paper, we present SiamMixer, a lightweight and hardware-friendly visual object-tracking network. It uses patch-by-patch inference to reduce memory use in shallow layers, where each small image region is processed individually. It merges and globally encodes feature maps in deep layers to enhance accuracy. Benefiting from these techniques, SiamMixer demonstrates a comparable accuracy to other large trackers with only 286 kB parameters and 196 kB extra memory use for feature maps. Additionally, we verify the impact of various activation functions and replace all activation functions with ReLU in SiamMixer. This reduces the cost when deploying on mobile devices.

Keywords: visual object-tracking; deep features; siamese network; lightweight neural network; edge computing devices

1. Introduction

Visual object-tracking is a fundamental problem in computer vision, whose goal is to locate the target in subsequent video frames based on its position in the initial frame. Visual object-tracking plays an essential role in many fields such as surveillance, machine vision, and human–computer interaction [1].

Discriminative Correlation Filters (DCFs) and Siamese networks are currently the dominant tracking models. DCF trackers emerged much earlier than Siamese network trackers. They use cyclically shifted training samples to achieve dense sampling and a fast Fourier transform to accelerate both learning and applying the correlation filters, which makes them computationally efficient. However, the design of the feature descriptors requires expert intervention, and the circular sampling produces artifacts at the search boundary that can affect the tracking results. The emergence of Siamese networks provides an end-to-end solution and eliminates the tediousness of manually designing feature descriptors while exhibiting decent tracking performance.
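The role of the fast Fourier transform in DCF trackers can be illustrated with a minimal single-channel sketch. It assumes a MOSSE-style closed-form filter, chosen here only as a simple stand-in and not taken from this paper. The function names, the regularization weight `lam`, and the random test patch are all hypothetical. Training in the Fourier domain implicitly uses every cyclic shift of the template as a sample, and applying the filter costs one FFT and one inverse FFT.

```python
import numpy as np

def train_filter(patch, target_response, lam=1e-2):
    """Learn a correlation filter in the Fourier domain (MOSSE-style closed form).

    All cyclic shifts of `patch` act as implicit training samples.
    `lam` is an assumed regularization weight, not a value from the paper.
    """
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target_response)
    # Element-wise regularized least-squares solution for the conjugate filter H*.
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def apply_filter(H_conj, patch):
    """Correlate the learned filter with a search patch using one FFT pair."""
    return np.real(np.fft.ifft2(np.fft.fft2(patch) * H_conj))

def gaussian_peak(shape, center, sigma=2.0):
    """Desired response: a Gaussian centred on the target position."""
    ys, xs = np.mgrid[:shape[0], :shape[1]]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

rng = np.random.default_rng(0)
template = rng.standard_normal((32, 32))  # synthetic stand-in for image features
H_conj = train_filter(template, gaussian_peak(template.shape, (16, 16)))

# Move the content cyclically by (3, -5) pixels; the response peak should follow.
search = np.roll(template, shift=(3, -5), axis=(0, 1))
response = apply_filter(H_conj, search)
peak = np.unravel_index(np.argmax(response), response.shape)
print(peak)
```

In this toy setting the response peak moves from (16, 16) to (19, 11), matching the applied shift. The shift is cyclic, so content that leaves one edge re-enters at the opposite edge, which is the source of the boundary artifacts mentioned above.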

The Siamese network tracker treats visual target tracking as a similarity learning problem. The neural network is used to learn the similarity descriptor function between the target and the search region. The Siamese network consists of two branches. The input

