IEEE TRANSACTIONS ON CONTROL SYSTEMS TECHNOLOGY 1
Cooperative guidance of multiple missiles: a hybrid
co-evolutionary approach
Xuejing Lan, Junda Chen, Zhijia Zhao, Member, IEEE, and Tao Zou
Abstract—Cooperative guidance of multiple missiles is a
challenging task with rigorous constraints of time and space
consensus, especially when attacking dynamic targets. In this
paper, the cooperative guidance task is described as a distributed
multi-objective cooperative optimization problem. To address
the issues of non-stationarity and continuous control faced by
cooperative guidance, the natural evolutionary strategy (NES) is
improved along with an elitist adaptive learning technique to
develop a novel natural co-evolutionary strategy (NCES). The
gradients of the original evolutionary strategy are rescaled to
reduce the estimation bias caused by the interaction between the
multiple missiles. A hybrid co-evolutionary cooperative guidance
law (HCCGL) is then developed by integrating the highly scalable
co-evolutionary strategy and the proportional guidance law,
with a detailed convergence proof provided. Finally, simulations
demonstrate the effectiveness and superiority of this guidance
law in solving cooperative guidance tasks with high accuracy,
with potential applications in other multi-objective optimization,
dynamic optimization, and distributed control scenarios.
Index Terms—Optimal control; cooperative guidance;
evolutionary strategy; multi-objective optimization
I. INTRODUCTION
MODERN penetration of a target's air defense systems
requires coordinated attacks with multiple missiles.
The rapid development of detection technologies and
close-in weapon systems (CIWS) has decreased the chances
of a successful impact with a single conventional missile [1].
In addition to increasing the difficulty of interception, the
cooperative guidance strategy of multiple missiles is also
crucial to the lethal effect of the final impact. Usually, the
cooperative guidance of multiple missiles belongs to the phase
of terminal guidance, where accurate target information can be
obtained with active radar systems or other detection devices.
The existing cooperative guidance laws can be roughly divided
into two categories. One is the analytical method to find closed-
form solutions, which is mainly based on sliding mode control,
optimal control, and multi-agent consensus theory. The other
is the intelligent method, which generally adopts heuristic
optimization algorithms and reinforcement learning
(RL) theory.
The author(s) received no financial support for the research, authorship,
and/or publication of this article. (Xuejing Lan and Junda Chen contributed
equally to this work.) (Corresponding author: Zhijia Zhao.)
X. J. Lan, J. D. Chen, Z. J. Zhao, and T. Zou are with the School of
Mechanical and Electrical Engineering, Guangzhou University, Guangzhou
510006, China (e-mail: lanxj@gzhu.edu.cn; CJD@e.gzhu.edu.cn;
zhjzhaoscut@163.com; tzou@gzhu.edu.cn).
The analytical cooperative guidance method has been proven
to be robust and efficient for practical application [2]–[6].
Based on fundamental proportional navigation (PN), Jeon et
al. developed cooperative proportional navigation (CPN), where
the on-board time-to-go of the missile is used as the navigation
gain [1]. It is a simple but effective approach for achieving time
consensus. Ma developed a composite guidance law that can
be decomposed into a component along the line of sight (LOS)
and a component perpendicular to the LOS [2], corresponding to
time and space cooperation, respectively. Furthermore, time
cooperative control is achieved by combining PN guidance (PNG)
with impact time error feedback [7], where an undirected
topology is adopted to establish the communication relationships.
Based on the optimal control approach, a variant of the
hyperbolic tangent function is proposed in [3] to force early
control of velocity and impact angle.
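The baseline PN law that the analytical methods above build on
can be illustrated with a minimal planar sketch. It is not the
implementation of CPN or of any cited law. It assumes a
constant-speed missile, a stationary target, unsaturated lateral
acceleration, and the common first-order time-to-go estimate of
range over closing speed. The names pn_step, simulate, and
nav_gain are hypothetical and chosen for illustration only.

```python
import math


def pn_step(state, target, nav_gain, dt):
    """Advance a constant-speed planar missile by one Euler step under PN."""
    x, y, heading, speed = state
    rx, ry = target[0] - x, target[1] - y        # LOS vector to the target
    r = math.hypot(rx, ry)                       # current range
    if r == 0.0:                                 # already on the target
        return state, 0.0, 0.0
    vx, vy = speed * math.cos(heading), speed * math.sin(heading)
    closing_speed = (rx * vx + ry * vy) / r      # -dr/dt for a fixed target
    los_rate = (ry * vx - rx * vy) / (r * r)     # rate of LOS angle atan2(ry, rx)
    accel = nav_gain * closing_speed * los_rate  # PN lateral acceleration command
    # Assumption: crude time-to-go estimate, range over closing speed.
    t_go = r / closing_speed if closing_speed > 0.0 else math.inf
    new_state = (x + vx * dt, y + vy * dt,
                 heading + accel / speed * dt, speed)
    return new_state, r, t_go


def simulate(start, heading, speed, target, nav_gain=3.0, dt=1e-3, t_max=60.0):
    """Fly until the range stops closing; return (smallest sampled range, time)."""
    state = (start[0], start[1], heading, speed)
    t, miss = 0.0, math.inf
    while t < t_max:
        state, r, t_go = pn_step(state, target, nav_gain, dt)
        miss = min(miss, r)
        if r == 0.0 or math.isinf(t_go):         # hit, or passed closest approach
            break
        t += dt
    return miss, t


# Two missiles with the same speed and heading but different launch points.
target = (1000.0, 500.0)
miss_a, t_a = simulate((0.0, 0.0), 0.0, 300.0, target)
miss_b, t_b = simulate((-500.0, 0.0), 0.0, 300.0, target)
print(f"missile A: miss {miss_a:.2f} m, flight time {t_a:.2f} s")
print(f"missile B: miss {miss_b:.2f} m, flight time {t_b:.2f} s")
```

In this sketch both missiles reach the target, but each arrives at
its own time because plain PN has no mechanism for coordinating
arrivals. Closing that gap is the purpose of the time-to-go and
impact time error terms in the cooperative laws surveyed above.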
However, with the increasing demand for high-precision
weapon systems, intelligent cooperative guidance
methods are increasingly regarded as a necessary auxiliary
option. In recent years, reinforcement learning theory has
attracted much attention because of its ability to learn online
from environmental feedback [8]–[13]. According to their
training structures, existing reinforcement learning algorithms
for multi-agent systems can be roughly divided into four types:
fully decentralized training with decentralized execution;
fully centralized training with decentralized execution;
centralized training with centralized execution; and value
decomposition methods. Some of these algorithms have achieved
satisfactory results on problems with low complexity and
low accuracy requirements. In [14], [5], and [15], state-of-
the-art reinforcement learning frameworks have demonstrated
their effectiveness in the guidance task. Zhang et al. proposed
a gradient-descent-based reinforcement learning method in
the actor-critic framework and achieved consensus control for
multi-agent systems by following a tracking leader [16]. However,
the two challenges of non-stationarity and partial observability
[17] lead to saturated output or coordination loss in
multi-agent systems, which greatly reduces the accuracy of the
value function. In addition, the use of a value function in
reinforcement learning is not well suited to continuous control
tasks with large search spaces. These limitations impede the
development of reinforcement learning in cooperative guidance.
A promising way to address the above problems is to
remove the value function of reinforcement learning and
optimize directly in the solution space with an evolutionary
strategy (ES), which is more robust and invariant to real-time
rewards because it optimizes the objective function directly [18].
Moreover, as described in [19], ES is tolerant of long horizons and
implicit solutions, which matches the needs of
cooperative guidance. The natural evolutionary strategy (NES)
is the latest branch of ES, and shows good performance in
arXiv:2208.07156v2 [cs.NE] 15 Apr 2023