Military Decision Support with Actor and Critic Reinforcement Learning Agents

Keywords: Reinforcement learning, Military decision support, Actor and critic, Weapon selection, Battle damage assessment

Abstract

While recent advanced military operational concepts require intelligent support of command and control, Reinforcement Learning (RL) has not been actively studied in the military domain. This study identifies the limitations of RL for military applications through a literature review and aims to improve the understanding of RL for military decision support under those limitations. Above all, the black-box nature of deep RL, compounded by complex simulation tools, makes the internal process difficult to understand. A scalable weapon-selection RL framework is built that can be solved in either tabular or neural-network form. Converting the Deep Q-Network (DQN) solution to tabular form makes it easier to compare with the Q-learning solution. Furthermore, rather than selectively using one or two RL models as in previous work, RL models are categorized as actors and critics and compared systematically. A random agent; Q-learning and DQN agents as critics; a Policy Gradient (PG) agent as an actor; and Trust Region Policy Optimization (TRPO) and Proximal Policy Optimization (PPO) agents as actor-critic approaches are designed, trained, and tested. The performance results show that the trained DQN and PPO agents are the best candidate decision supporters for the weapon-selection RL framework.
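To illustrate the tabular comparison the abstract describes, the minimal sketch below shows how a trained DQN critic could be evaluated over every discrete state to recover a Q-table that can be compared entry-by-entry with a tabular Q-learning solution. The state and action sizes, network architecture, and names such as `N_STATES` and `dqn_to_table` are illustrative assumptions for a small weapon-selection problem, not the paper's actual implementation.

```python
import numpy as np
import torch
import torch.nn as nn

# Assumed sizes for a small discrete weapon-selection problem:
# each state might encode a (target type, remaining inventory) pair,
# and each action is one weapon choice.
N_STATES, N_ACTIONS = 64, 8

# --- Critic 1: tabular Q-learning --------------------------------------
def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One tabular Q-learning step: Q(s, a) += alpha * TD error."""
    td_target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q

# --- Critic 2: a small DQN over one-hot state encodings ----------------
dqn = nn.Sequential(nn.Linear(N_STATES, 64), nn.ReLU(),
                    nn.Linear(64, N_ACTIONS))

def dqn_to_table(model, n_states):
    """Evaluate the network on every one-hot state to recover a Q-table,
    so the DQN solution can be compared directly with Q-learning."""
    with torch.no_grad():
        states = torch.eye(n_states)      # one row per discrete state
        return model(states).numpy()      # shape (n_states, n_actions)

Q_tab = np.zeros((N_STATES, N_ACTIONS))                # Q-learning table
Q_tab = q_learning_update(Q_tab, s=0, a=3, r=1.0, s_next=5)
Q_dqn = dqn_to_table(dqn, N_STATES)                    # DQN as a table
print(np.abs(Q_tab - Q_dqn).max())                     # largest disagreement
```

Rendering both critics in the same tabular form reduces the comparison to an element-wise inspection of two arrays, which is one way to mitigate the black-box character of the deep RL solution that the abstract points out.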

Author Biography

Jungmok Ma, Korea National Defense University

Professor, Department of National Defense Science, Korea

Published
2024-02-26
How to Cite
Ma, J. (2024). Military Decision Support with Actor and Critic Reinforcement Learning Agents. Defence Science Journal, 74(3), 389-398. https://doi.org/10.14429/dsj.74.18864