GARY J. BRIGGS
Harnessing Constructive Simulations for Reinforcement Learning
Reinforcement learning (RL) is a powerful artificial intelligence technique for the development of software agents that make intelligent decisions and exhibit complex behaviors. RL works by applying feedback from the environment, in the form of rewards and penalties, to induce agents to learn how to succeed in that environment. It has famously been used to train agents to defeat human players in classic games of strategy, such as Go and chess.¹
RL training usually takes place within an RL gym, an artificial environment optimized for such training, in which the agent can be run rapidly through the same scenario many times.² It can take millions of iterations to train, test, and refine software agents using these methods, so a fast and efficient RL gym is essential to meeting development timelines. Interacting with unoptimized simulations, or with the real world, would be far slower and more expensive, possibly infeasibly so.
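
To make the reward-and-penalty training loop concrete, the following is a minimal sketch written against the open-source Gymnasium API and its FrozenLake toy task. The package, task, and hyperparameters here are illustrative assumptions for exposition only; they are not part of the RAND harness or AFSIM.

    # Minimal sketch of an RL gym training loop (tabular Q-learning).
    # Gymnasium and FrozenLake are assumed for illustration; any gym-style
    # environment that emits rewards would serve the same purpose.
    import gymnasium as gym
    import numpy as np

    env = gym.make("FrozenLake-v1", is_slippery=False)
    q_table = np.zeros((env.observation_space.n, env.action_space.n))
    alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount, exploration

    for episode in range(10_000):          # many rapid passes through one scenario
        state, _ = env.reset()
        done = False
        while not done:
            # Epsilon-greedy: mostly exploit learned values, sometimes explore.
            if np.random.random() < epsilon:
                action = env.action_space.sample()
            else:
                action = int(np.argmax(q_table[state]))
            next_state, reward, terminated, truncated, _ = env.step(action)
            # The environment's reward signal drives the value update.
            q_table[state, action] += alpha * (
                reward + gamma * np.max(q_table[next_state]) - q_table[state, action]
            )
            state = next_state
            done = terminated or truncated

Because the loop above must execute tens of thousands of episodes, the speed of the environment's step function dominates total training time, which is why a purpose-built gym matters.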
Combining the intelligence of modern RL-trained agents with the depth of established constructive simulations could greatly improve the analytic power of these simulations, enabling researchers to represent more-complicated interactions and more-sophisticated environments. However, although most state-of-the-art RL gyms are written in the Python programming language, most constructive simulations are not. In particular, many military researchers would like to be able to use Python-defined RL agents within the Air Force Research Laboratory's Advanced Framework for Simulation, Integration, and Modeling (AFSIM), which is written in C++.
KEY TAKEAWAYS
■ RAND researchers have developed a flexible software harness that enables the use of state-of-the-art reinforcement learning (RL) methods in many existing constructive simulations without requiring significant additional coding.
■ RL is a powerful artificial intelligence technique that can be used to train software agents in constructive simulations to make decisions that an operator desires or to behave more realistically.
■ Most modern RL gyms (for training software agents) are written in Python, whereas some of the most widely used constructive simulations, such as the Air Force Research Laboratory's Advanced Framework for Simulation, Integration, and Modeling (AFSIM), are written in other programming languages.
■ The RAND RL software harness isolates agent training from agent employment, allowing researchers to use agents trained in modern RL gyms within existing constructive simulations written in C++ (a sketch of this separation follows the list).
■ Researchers at RAND have demonstrated the harness in AFSIM for the case of an aircraft attempting to penetrate an adversary's integrated air defense system.
■ RAND's RL software harness has been made available to all authorized users on the Air Force Research Laboratory's AFSIM portal.
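
The separation of training from employment noted above can be illustrated with a short sketch. One plausible pattern, offered here only as an assumption and not as the RAND harness's actual interface, is to load a policy trained in a Python RL gym and serve it over a local socket so that a C++ simulation can query it for actions at run time. The file name, port, and helper functions below are hypothetical.

    # Hedged sketch of employing a previously trained policy from an
    # external simulation. This illustrates the training/employment split
    # in principle; it is not the RAND harness's documented design.
    import json
    import socket
    import numpy as np

    def load_policy(path: str):
        """Hypothetical loader for a policy trained in a Python RL gym."""
        q_table = np.load(path)                  # e.g., the table trained earlier
        return lambda state: int(np.argmax(q_table[state]))

    def serve_policy(policy, host: str = "127.0.0.1", port: int = 9000) -> None:
        """Answer observation-to-action queries from an external (e.g., C++) sim."""
        with socket.create_server((host, port)) as server:
            while True:
                conn, _ = server.accept()
                with conn:
                    # A single small JSON message per query is assumed here.
                    request = json.loads(conn.recv(4096))
                    action = policy(request["observation"])
                    conn.sendall(json.dumps({"action": action}).encode())

    if __name__ == "__main__":
        serve_policy(load_policy("trained_q_table.npy"))  # hypothetical file

In a pattern like this, the simulation embeds only a thin client that sends an observation and receives an action, so a retrained agent can be swapped in without recompiling the simulation itself.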