Citation: Chang, J.; Yu, D.; Hu, Y.; He, W.; Yu, H. Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival. Processes 2022, 10, 760. https://doi.org/10.3390/pr10040760
Academic Editors: Kelvin K.L. Wong, Dhanjoo N. Ghista, Andrew W.H. Ip and Wenjun (Chris) Zhang
Received: 16 March 2022
Accepted: 11 April 2022
Published: 13 April 2022
Article
Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival

Jingru Chang 1,2,3, Dong Yu 2,*, Yi Hu 2,4, Wuwei He 1,2 and Haoyu Yu 1,2

1 University of Chinese Academy of Sciences, Beijing 100049, China; changjingru@neusoft.edu.cn (J.C.); wuhewei2021@163.com (W.H.); yuhaoyu2021@sina.com (H.Y.)
2 Shenyang Institute of Computing Technology, Chinese Academy of Sciences, Shenyang 110168, China; huyi@sict.ac.cn
3 Department of Software Engineering, Dalian Neusoft University of Information, Dalian 116023, China
4 Shenyang Zhongke CNC Technology Co., Ltd., Shenyang 110168, China
* Correspondence: yudong@sict.ac.cn
Abstract: The production process of a smart factory is complex and dynamic. As the core of manufacturing management, research on the flexible job shop scheduling problem (FJSP) focuses on optimizing scheduling decisions in real time as the production environment changes. In this paper, deep reinforcement learning (DRL) is proposed to solve the dynamic FJSP (DFJSP) with random job arrival, with the goal of minimizing penalties for earliness and tardiness. A double deep Q-networks (DDQN) architecture is proposed, and state features, actions and rewards are designed. A soft ε-greedy behavior policy is designed according to the scale of the problem. The experimental results show that the proposed DRL method outperforms other reinforcement learning (RL) algorithms, heuristics and metaheuristics in terms of solution quality and generalization. In addition, the soft ε-greedy strategy reasonably balances exploration and exploitation, thereby improving the learning efficiency of the scheduling agent. The DRL method adapts to dynamic changes in the production environment of a flexible job shop, which contributes to the establishment of a flexible scheduling system with self-learning, real-time optimization and intelligent decision-making.
Keywords: smart factory; flexible job shop scheduling problem; deep reinforcement learning; random job arrival; penalties for earliness and tardiness; double deep Q-networks
1. Introduction
Industry 4.0, also called the “smart factory” [1], focuses on integrating advanced technologies such as the Internet of Things, big data and artificial intelligence with enterprise resource planning, manufacturing execution management and process control management. A smart factory thus has the capabilities of autonomous perception, analysis, reasoning, decision-making and control. The flexible job shop scheduling problem (FJSP) is an extension of the traditional job shop scheduling problem (JSP). The FJSP supports diversified and differentiated manufacturing with low variation, and it is widely used in semiconductor manufacturing, automobile assembly, mechanical manufacturing systems, etc. [2]. As the core of manufacturing execution management and process control management, the real-time optimization and control of the FJSP provides increased flexibility in the management of a smart factory, aiming to improve factory productivity and the efficient utilization of resources in real time [3].
The FJSP breaks through the uniqueness restriction of production resources: each operation can be assigned to one or more available machines, and its processing time differs from machine to machine [4]. Because the FJSP relaxes the machine-assignment constraints, it expands the feasible solution search space, so it is a strongly NP-hard problem that is more complex than the JSP [5,6].
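To make this flexibility concrete, the following minimal Python sketch (illustrative only, not from this paper; the machine names, processing times and the lower-bound helper are invented for the example) represents an FJSP instance in which every operation maps each eligible machine to its processing time:

from typing import Dict, List

# jobs[j][o] maps each eligible machine to the processing time of
# operation o of job j; an operation may run on one or more machines.
jobs: List[List[Dict[str, int]]] = [
    [  # Job 0
        {"M1": 3, "M2": 5},  # operation 0: two eligible machines
        {"M2": 4},           # operation 1: only M2 can process it
    ],
    [  # Job 1
        {"M1": 6, "M3": 2},
        {"M1": 2, "M2": 3, "M3": 4},
    ],
]

def lower_bound_makespan(jobs: List[List[Dict[str, int]]]) -> int:
    """Crude lower bound: each job needs at least the sum of its
    fastest per-operation processing times."""
    return max(sum(min(op.values()) for op in job) for job in jobs)

print(lower_bound_makespan(jobs))  # 7 for this toy instance

Even in this two-job instance, choosing a machine for each flexible operation multiplies the number of feasible schedules, which illustrates why the enlarged search space makes the FJSP harder than the classical JSP.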
So far, a large number of studies on the FJSP have assumed
that the scheduling takes place in a static production environment, where the shop floor