Article
A New Cache Update Scheme Using Reinforcement Learning for Coded Video Streaming Systems
Yu-Sin Kim 1, Jeong-Min Lee 2, Jong-Yeol Ryu 2 and Tae-Won Ban 2,*
Citation: Kim, Y.-S.; Lee, J.-M.; Ryu, J.-Y.; Ban, T.-W. A New Cache Update Scheme Using Reinforcement Learning for Coded Video Streaming Systems. Sensors 2021, 21, 2867. https://doi.org/10.3390/s21082867
Academic Editor: Nikolaos Thomos
Received: 10 March 2021
Accepted: 15 April 2021
Published: 19 April 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Algorithm Team, Carvi, Seoul 08513, Korea; usin1216@carvi.co.kr
2 Department of Information and Communication Engineering, Gyeongsang National University, Gyeongnam 53064, Korea; ljmin200002@gmail.com (J.-M.L.); jongyeol_ryu@gnu.ac.kr (J.-Y.R.)
* Correspondence: twban35@gnu.ac.kr
Abstract: As the demand for video streaming grows rapidly, technologies that improve the efficiency of video streaming have attracted much attention. In this paper, we investigate how to improve the efficiency of video streaming by using clients’ cache storage. We consider exclusive OR (XOR) coding-based video streaming, in which multiple different videos can be sent simultaneously in one transmission as long as prerequisite conditions are satisfied, so that the efficiency of video streaming can be significantly enhanced. We also propose a new cache update scheme using reinforcement learning. The proposed scheme uses a K-actor-critic (K-AC) network that mitigates the disadvantage of actor-critic networks by yielding K candidate outputs and selecting the candidate with the highest value as the final output. A K-AC network resides in each client, and each client can train it using only locally available information, without any feedback or signaling, so the proposed cache update scheme is completely decentralized. The performance of the proposed cache update scheme was analyzed in terms of the average number of transmissions for XOR coding-based video streaming and compared with that of conventional cache update schemes. Our numerical results show that the proposed cache update scheme can reduce the number of transmissions by up to 24% when the number of videos is 100, the number of clients is 50, and the cache size is 5.
Keywords: streaming; multimedia; reinforcement learning; cache; exclusive OR
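The XOR coding idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard index-coding condition, namely that each client in a coded group already holds in its cache the video requested by the other client; the precise prerequisite conditions used in this paper are not restated here. The payloads, variable names, and the helper `xor_bytes` are hypothetical and serve only as an illustration.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Return the bytewise XOR of two equal-length payloads."""
    if len(a) != len(b):
        raise ValueError("payloads must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical toy payloads standing in for two video segments.
video_a = b"AAAA-segment"
video_b = b"BBBB-segment"

# Assumed setting: client 1 requests A and has B cached,
# client 2 requests B and has A cached.
coded = xor_bytes(video_a, video_b)  # a single transmission for both clients

# Each client removes its cached video from the coded payload.
decoded_by_client_1 = xor_bytes(coded, video_b)  # recovers A
decoded_by_client_2 = xor_bytes(coded, video_a)  # recovers B
```

In this toy setting, one coded transmission replaces two separate unicast transmissions, which is why the contents of the clients’ caches affect the number of transmissions.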
1. Introduction
In recent years, Internet traffic has been increasing rapidly and is expected to grow even faster in the future [1,2]. In particular, video streaming traffic is expected to account for 82% of global Internet traffic by 2022 owing to the wide popularity of video streaming platforms such as YouTube [1]. This trend is more pronounced in mobile networks, and many advanced techniques have thus been investigated to increase the capacity of next-generation mobile communication networks [3–5]. Alongside technologies that increase network capacity by using a wider bandwidth or by improving spectral efficiency, technologies that reduce network traffic are also attracting much attention as an alternative [6,7]. Multicast (MC) transmission can reduce network traffic by transmitting a video to multiple clients in one transmission if the clients request the same video at the same time [6]. Proxy servers with caches can significantly reduce network traffic, and bandwidth optimization for real-time video traffic transmission through a proxy server was investigated in [7]. In particular, MC-aware caching can better exploit the available cache space and can yield a gain of 19% over existing caching schemes [6]. Many studies have examined how to reduce network traffic by using the transmitters’ cache storage, while the low cost and large capacity of storage have motivated some studies to focus on the clients’ cache storage [8–13]. In this paper, we thus investigate a new video streaming system using clients’ cache and XOR-based index coding. In the new video streaming
Sensors 2021, 21, 2867. https://doi.org/10.3390/s21082867 https://www.mdpi.com/journal/sensors