Citation: Qi, Q.; Lin, W.; Guo, B.; Chen, J.; Deng, C.; Lin, G.; Sun, X.; Chen, Y. Augmented Lagrangian-Based Reinforcement Learning for Network Slicing in IIoT. Electronics 2022, 11, 3385. https://doi.org/10.3390/electronics11203385
Academic Editors: Alexandros-Apostolos Boulogeorgos, Panagiotis Sarigiannidis, Thomas Lagkas, Vasileios Argyriou and Pantelis Angelidis
Received: 8 September 2022
Accepted: 17 October 2022
Published: 19 October 2022
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Article
Augmented Lagrangian-Based Reinforcement Learning for Network Slicing in IIoT
Qi Qi 1, Wenbin Lin 1, Boyang Guo 2,*, Jinshan Chen 1, Chaoping Deng 1, Guodong Lin 1, Xin Sun 1 and Youjia Chen 2
1 State Grid Fujian Electric Power Research Institute, Fuzhou 350007, China
2 Fujian Key Lab for Intelligent Processing and Wireless Transmission of Media Information, College of Physics and Information Engineering, Fuzhou University, Fuzhou 350108, China
* Correspondence: boyangguo_fzu@163.com
Abstract: Network slicing enables the multiplexing of independent logical networks on the same physical network infrastructure to provide different network services for different applications. Resource allocation in network slicing is typically a decision-making problem, which falls within the scope of reinforcement learning. Its ability to adapt to dynamic wireless environments makes reinforcement learning a good candidate for solving this problem. In this paper, to tackle the constrained mixed-integer nonlinear programming problem in network slicing, we propose an augmented Lagrangian-based soft actor–critic (AL-SAC) algorithm. In this algorithm, a hierarchical action selection network is designed to handle the hybrid action space. More importantly, inspired by the augmented Lagrangian method, both neural networks for the Lagrange multipliers and a penalty term are introduced to deal with the constraints. Experimental results show that the proposed AL-SAC algorithm can strictly satisfy the constraints and achieve better performance than other benchmark algorithms.
Keywords: network slicing; augmented Lagrangian; reinforcement learning; hybrid action space; soft actor–critic (SAC)
1. Introduction
With the rapid development of the industrial internet of things (IIoT), more and more devices are connected and controlled via wireless networks. Providing precise services for these devices to fulfill their diverse requirements has become a fundamental issue in IIoT. To address this challenge, three application scenarios have been defined by the International Telecommunication Union (ITU) and the Fifth Generation Public Private Partnership (5G-PPP) [1,2]: enhanced mobile broadband (eMBB), ultra-reliable low latency communications (URLLC), and massive machine type communication (mMTC). In more detail, the eMBB scenario serves devices that require a high transmission rate, such as high-definition surveillance video in factories, where the peak rate for each camera can be greater than 10 Gbps [3]. mMTC refers to scenarios in which a large number of devices connect simultaneously while the requirements on transmission rate and delay are not critical [4]. In contrast, URLLC serves applications with strict requirements on transmission reliability and latency, such as automatic operators and controllers [5].
To satisfy these disparate scenarios within one network infrastructure, the network slicing technique was proposed. It divides a physical network into multiple independent logical networks [6,7], where each network slice is isolated from the others and provides one kind of network service via dedicated resource allocation. To allocate resources efficiently and adapt to the dynamics of wireless networks, many intelligent algorithms have been proposed. For instance, in [8], the genetic algorithm, ant colony optimization with a genetic algorithm, and the quantum genetic algorithm were used to jointly allocate radio