
Citation: Zhu, W.; Rosendo, A. PSTO: Learning Energy-Efficient Locomotion for Quadruped Robots. Machines 2022, 10, 185. https://doi.org/10.3390/machines10030185
Academic Editors: Dan Zhang and Marco Ceccarelli
Received: 6 January 2022; Accepted: 18 February 2022; Published: 4 March 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
PSTO: Learning Energy-Efficient Locomotion for
Quadruped Robots
Wangshu Zhu and Andre Rosendo *
Living Machines Laboratory, School of Information Science and Technology, ShanghaiTech University,
Shanghai 201210, China; zhuwsh@shanghaitech.edu.cn
* Correspondence: arosendo@shanghaitech.edu.cn
Abstract: Energy efficiency is critical for the locomotion of quadruped robots. However, energy-efficiency values found in simulations do not transfer adequately to the real world. To address this issue, we present a novel method, named Policy Search Transfer Optimization (PSTO), which combines deep reinforcement learning and optimization to create energy-efficient locomotion for quadruped robots in the real world. The deep reinforcement learning and policy search are performed by the TD3 algorithm; the learned policy is then transferred to an open-loop control trajectory, which is further optimized by numerical methods and executed on the robot in the real world. To ensure high consistency between the simulation results and the behavior of the hardware platform, we introduce and validate an accurate simulation model with consistent dimensions and fine-tuned parameters. We then validate those results with real-world experiments on the quadruped robot Ant by executing dynamic walking gaits with different leg lengths and numbers of amplifications. We analyze the results and show that our method outperforms the control methods provided by the state-of-the-art policy search algorithm TD3 and by a sinusoid function in both energy efficiency and speed.
Keywords: machine learning; robot locomotion; energy efficiency; deep reinforcement learning
1. Introduction
Legged locomotion [1] is essential for robots to traverse difficult environments with agility and grace. However, the energy efficiency of mobile robots still has room for improvement when performing dynamic locomotion. Classical approaches often require extensive knowledge of the robot's structure and massive manual tuning of parametric choices [2,3].
Recently, learning-based approaches, especially deep reinforcement learning methods, have achieved tremendous progress in controlling robots [4–7]. Policy search [8], a subfield of deep reinforcement learning, has been widely studied in recent years. A number of policy search algorithms have been proposed to improve performance and sample efficiency while reducing the entropy of the learning process, e.g., DDPG [4], TRPO [5], PPO [9], SAC [10], and TD3 [11]. These algorithms automate the training process and produce feasible locomotion for robots without much human interference.
While these methods have demonstrated promising results in simulation, the policies they learn often perform poorly when transferred to the real world, with substandard energy efficiency and low speed, which is mainly caused by the reality gap [12]. Model discrepancies between the simulated and the real physical system, unmodeled dynamics, wrong simulation parameters, and numerical errors all contribute to this gap. In three-dimensional locomotion, the gap is amplified further, because subtle differences in contact situations between the simulation and the real world can be magnified and lead to unexpected consequences. With this gap, robots may perform poorly, consume more energy, and even damage themselves. Work on narrowing the reality gap is therefore essential for machine learning on robots.
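The refinement stage sketched in the abstract, taking an open-loop sinusoidal trajectory and numerically optimizing its parameters for energy efficiency, can be illustrated with a toy example. The dynamics, torque and displacement models, cost values, and initial parameters below are invented for illustration only and are not the authors' implementation; the actual method trains with TD3 in a physics simulator.

```python
import numpy as np

def rollout(amplitude, frequency, phase, duration=5.0, dt=0.01):
    """Return (distance, energy) for a sinusoidal joint trajectory on a toy 1-DoF model."""
    t = np.arange(0.0, duration, dt)
    q = amplitude * np.sin(2.0 * np.pi * frequency * t + phase)  # joint angle [rad]
    dq = np.gradient(q, dt)                                      # joint velocity [rad/s]
    tau = 2.0 * dq + 0.5 * q                                     # invented torque model
    energy = np.sum(np.abs(tau * dq)) * dt                       # mechanical-work proxy [J]
    distance = 0.3 * amplitude * frequency * duration            # invented displacement model [m]
    return distance, energy

def cost_of_transport(params):
    """Energy per meter traveled; gaits that travel too little are heavily penalized."""
    distance, energy = rollout(*params)
    if distance < 1.0:          # require at least 1 m of travel to stay feasible
        return 1e3
    return energy / distance

# Start from a feasible initial gait (standing in for a policy-derived trajectory)
# and refine it with an accept-if-better random perturbation search.
rng = np.random.default_rng(0)
best = np.array([0.8, 1.0, 0.0])            # amplitude [rad], frequency [Hz], phase [rad]
best_cost = cost_of_transport(best)
for _ in range(200):
    candidate = best + rng.normal(scale=[0.05, 0.05, 0.1])
    c = cost_of_transport(candidate)
    if c < best_cost:                       # keep only improving perturbations
        best, best_cost = candidate, c

print(f"refined gait parameters: {np.round(best, 3)}, cost of transport: {best_cost:.3f}")
```

Any derivative-free optimizer (e.g., CMA-ES or Nelder–Mead) could replace the random search here; the point is only that the open-loop parametrization makes the energy objective directly optimizable on a fixed trajectory.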