Citation: Chang, J.; Yu, D.; Hu, Y.; He, W.; Yu, H. Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival. Processes 2022, 10, 760. https://doi.org/10.3390/pr10040760
Academic Editors: Kelvin K.L. Wong, Dhanjoo N. Ghista, Andrew W.H. Ip and Wenjun (Chris) Zhang
Received: 16 March 2022
Accepted: 11 April 2022
Published: 13 April 2022
Publisher's Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
processes
Article
Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival
Jingru Chang 1,2,3, Dong Yu 2,*, Yi Hu 2,4, Wuwei He 1,2 and Haoyu Yu 1,2
1 University of Chinese Academy of Sciences, Beijing 100049, China; changjingru@neusoft.edu.cn (J.C.); wuhewei2021@163.com (W.H.); yuhaoyu2021@sina.com (H.Y.)
2 Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China; huyi@sict.ac.cn
3 Department of Software Engineering, Dalian Neusoft University of Information, Dalian 116023, China
4 Shenyang Zhongke CNC Technology Co., Ltd., Shenyang 110168, China
* Correspondence: yudong@sict.ac.cn
Abstract: The production process of a smart factory is complex and dynamic. As the core of manufacturing management, the research into the flexible job shop scheduling problem (FJSP) focuses on optimizing scheduling decisions in real time, according to the changes in the production environment. In this paper, deep reinforcement learning (DRL) is proposed to solve the dynamic FJSP (DFJSP) with random job arrival, with the goal of minimizing penalties for earliness and tardiness. A double deep Q-networks (DDQN) architecture is proposed and state features, actions and rewards are designed. A soft ε-greedy behavior policy is designed according to the scale of the problem. The experimental results show that the proposed DRL is better than other reinforcement learning (RL) algorithms, heuristics and metaheuristics in terms of solution quality and generalization. In addition, the soft ε-greedy strategy reasonably balances exploration and exploitation, thereby improving the learning efficiency of the scheduling agent. The DRL method is adaptive to the dynamic changes of the production environment in a flexible job shop, which contributes to the establishment of a flexible scheduling system with self-learning, real-time optimization and intelligent decision-making.
Keywords: smart factory; flexible job shop scheduling problem; deep reinforcement learning; random job arrival; penalties for earliness and tardiness; double deep Q-networks
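As a rough illustration of two ingredients named in the abstract, the following Python sketch shows a weighted earliness/tardiness penalty and the standard Double DQN bootstrap target. It is only a minimal sketch under stated assumptions: the function names, the unit penalty weights and the discount factor are hypothetical choices for illustration, and the paper's own state features, reward design and soft ε-greedy policy are not reproduced here.

```python
from typing import Sequence


def earliness_tardiness_penalty(completion: float, due: float,
                                w_early: float = 1.0,
                                w_tardy: float = 1.0) -> float:
    """Weighted penalty for finishing a job before or after its due date.

    The unit weights are illustrative assumptions, not values from the paper.
    """
    earliness = max(0.0, due - completion)   # time finished ahead of the due date
    tardiness = max(0.0, completion - due)   # time finished past the due date
    return w_early * earliness + w_tardy * tardiness


def double_dqn_target(reward: float,
                      next_q_online: Sequence[float],
                      next_q_target: Sequence[float],
                      gamma: float = 0.9,
                      done: bool = False) -> float:
    """Standard Double DQN target for one transition.

    The online network's Q-values select the next action, and the target
    network's Q-values evaluate it. gamma is an assumed discount factor.
    """
    if done:
        return reward
    # Action selection uses the online network's estimates.
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's estimate for that action.
    return reward + gamma * next_q_target[best_action]
```

Decoupling action selection from action evaluation is what distinguishes Double DQN from plain DQN, which takes the maximum over the target network's own estimates and tends to overestimate action values.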
1. Introduction
Industry 4.0, also called the “smart factory” [1], focuses on the integration of advanced technologies such as the Internet of Things, big data and artificial intelligence with enterprise resource planning, manufacturing execution management and process control management. Thus, a smart factory has the capabilities of autonomous perception, analysis, reasoning, decision-making and control. The flexible job shop scheduling problem (FJSP) is an extension of the traditional job shop scheduling problem (JSP). The FJSP provides possibilities and guarantees low variation in diversified and differentiated manufacturing, which is widely used in the semiconductor manufacturing process, the automobile assembly process, mechanical manufacturing systems, etc. [2]. As the core of manufacturing execution management and process control management, the real-time optimization and control of FJSP provides increased flexibility in the management of a smart factory, aiming to improve factory productivity and the efficient utilization of resources in real time [3].
The FJSP removes the restriction that each operation is tied to a single production resource. Each operation can be assigned to one or more available machines, and its processing time differs from machine to machine [4]. The FJSP relaxes the machine constraints and expands the size of the feasible solution search space, so it is a strongly NP-hard problem that is more complex than the JSP [5,6]. So far, a large number of studies on the FJSP have assumed that the scheduling takes place in a static production environment, where the shop floor