Deep Deterministic Policy Gradient with Reward Function Based on Fuzzy Logic for Robotic Peg-in-Hole Assembly Tasks

Citation: Wang, Z.; Li, F.; Men, Y.; Fu, T.; Yang, X.; Song, R. Deep Deterministic Policy Gradient with Reward Function Based on Fuzzy Logic for Robotic Peg-in-Hole Assembly Tasks. Appl. Sci. 2022, 12, 3181. https://doi.org/10.3390/app12063181
Academic Editors: Giovanni Boschetti and João Miguel da Costa Sousa
Received: 10 February 2022
Accepted: 18 March 2022
Published: 21 March 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Applied Sciences
Article
Deep Deterministic Policy Gradient with Reward Function
Based on Fuzzy Logic for Robotic Peg-in-Hole Assembly Tasks
Ziyue Wang 1, Fengming Li 2,3,*,†, Yu Men 3, Tianyu Fu 3, Xuting Yang 3 and Rui Song 3
1 College of Science, Guilin University of Technology, Guilin 541006, China; wziyins27@163.com
2 School of Information and Electrical Engineering, Shandong Jianzhu University, Jinan 250101, China
3 School of Control Science and Engineering, Shandong University, Jinan 250061, China; 202114785@mail.sdu.edu.cn (Y.M.); futy0807@gmail.com (T.F.); 201914426@mail.sdu.edu.cn (X.Y.); rsong@sdu.edu.cn (R.S.)
* Correspondence: lifengming@sucro.org or lifengming21@sdjzu.edu.cn; Tel.: +86-186-6016-8885
† Current address: School of Control Science and Engineering, Shandong University, Jingshi Road 17923, Jinan 250061, China.
Abstract: Automatic robotic assembly of weak-stiffness parts is difficult because the parts may deform during assembly, and the robot manipulation cannot adapt to the dynamic contact changes that occur in the process. A robot assembly skill learning system is designed by combining compliance control with deep reinforcement learning, which allows the robot to acquire a better assembly strategy. In this paper, a robot assembly strategy learning method based on variable impedance control is proposed to solve robot assembly contact tasks. During the assembly process, the assembly quality is evaluated with fuzzy logic, and the impedance parameters are learned with a deep deterministic policy gradient. Finally, the effectiveness of the method is verified using a KUKA iiwa robot in weak-stiffness peg-in-hole assembly. Experimental results show that the robot acquires a variable-compliance assembly strategy for weak-stiffness peg-in-hole assembly. Compared with previous methods, the proposed method reaches an assembly success rate of 100%.
Keywords: robot assembly; deep reinforcement learning; fuzzy reward; compliant control
1. Introduction
The contact environment in which a robot operates is changeable and unpredictable. It is challenging for a robot to quickly perform new tasks and precisely control the contact force in different environments. High-precision assembly is a typical contact operation [1,2], and the assembly process must overcome errors in the environmental model and the controller. The peg-in-hole assembly process is usually divided into a search phase and an insertion phase [3], which draw on visual and tactile information. In the insertion phase, the peg is inserted along the center axis of the hole to the bottom. When the axis deviation or the force/torque is not appropriate, jamming or wedging can occur. Because of deformation errors, friction, and robot positioning errors between the assembly objects, it is difficult to establish an accurate physical model and to derive the optimal assembly strategy from model analysis.
Robot assembly control strategies can be designed from mathematical models of the forces and torques that arise during robot assembly. Compared with a high-gain position feedback controller, impedance control ensures that the robot and environment are fully controllable. A natural mass-spring-damper relationship is maintained between the contact force and the position offset, and its force control characteristics depend on the inertia, stiffness, and damping parameters [4]. Traditionally, the control parameters are tuned manually according to the characteristics of the task. For a task as complex as assembly, it is difficult for impedance control with fixed parameters to achieve the target task. If the parameters of impedance control could be
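As a rough illustration of the mass-spring-damper relationship between contact force and position offset mentioned in the introduction, the following Python sketch simulates a one-dimensional impedance model, M·ë + D·ė + K·e = F, under a constant contact force F. The one-dimensional simplification, the names `ImpedanceParams` and `simulate_offset`, the integration scheme, and all numeric values are assumptions made for illustration; they are not taken from the paper's controller.

```python
from dataclasses import dataclass


@dataclass
class ImpedanceParams:
    """Hypothetical 1-D impedance parameters (illustrative values only)."""
    mass: float       # virtual inertia M [kg]
    damping: float    # virtual damping D [N*s/m]
    stiffness: float  # virtual stiffness K [N/m]


def simulate_offset(params: ImpedanceParams, contact_force: float,
                    duration: float = 5.0, dt: float = 1e-3) -> float:
    """Integrate M*a + D*v + K*e = F for a constant force F.

    Returns the position offset e at the end of the simulation.
    """
    offset, velocity = 0.0, 0.0
    for _ in range(int(duration / dt)):
        accel = (contact_force
                 - params.damping * velocity
                 - params.stiffness * offset) / params.mass
        # Semi-implicit Euler: update velocity first, then position.
        velocity += accel * dt
        offset += velocity * dt
    return offset


# Two assumed parameter sets: a stiff one and a more compliant one.
stiff = ImpedanceParams(mass=1.0, damping=40.0, stiffness=400.0)
soft = ImpedanceParams(mass=1.0, damping=20.0, stiffness=100.0)

stiff_offset = simulate_offset(stiff, contact_force=10.0)
soft_offset = simulate_offset(soft, contact_force=10.0)
print(f"stiff: {stiff_offset:.4f} m, soft: {soft_offset:.4f} m")
```

In steady state the offset settles at F/K, so under the same 10 N contact force the compliant setting yields four times the offset of the stiff one (0.1 m versus 0.025 m). This is the kind of trade-off that fixed impedance parameters cannot adjust during a task.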