An air combat maneuver decision-making approach using coupled reward in deep reinforcement learning
Main Authors: , , , , , ,
Format: Article
Language: English
Published: Springer, 2025-06-01
Series: Complex & Intelligent Systems
Online Access: https://doi.org/10.1007/s40747-025-01992-9
Summary: In the domain of unmanned air combat, achieving efficient autonomous maneuvering decisions presents challenges. Deep reinforcement learning (DRL) is one approach to tackling this problem. The final performance of a DRL algorithm is directly affected by the design of its reward function; in particular, poorly chosen reward weights degrade both model performance and convergence speed. Therefore, a method named Coupled Reward-Deep Reinforcement Learning (CR-DRL) is introduced to address this problem. Specifically, we propose a novel coupled-weight reward function for DRL within the air combat framework. The reward function integrates angle and distance so that our DRL maneuver decision model trains faster and performs better than models that use conventional reward functions. Additionally, we establish a new competitive training framework designed to enhance the model's performance against personalized opponents. The experimental results show that our CR-DRL model outperforms a traditional model using fixed-weight reward functions in this training framework, with a 6.3% increase in average reward in fixed scenarios and a 22.8% increase in changeable scenarios. Moreover, the performance of our model continues to improve as training iterations increase, ultimately yielding a degree of generalization against similar opponents. Finally, we develop a Unity3D-based simulation environment supporting real-time air combat, called Airfightsim, to demonstrate the performance of the proposed algorithm.
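The abstract does not give the exact form of the coupled-weight reward, so the following Python sketch is purely illustrative: it contrasts a conventional fixed-weight sum of angle and distance rewards with a coupled form in which the two terms modulate each other. The function names and the parameters `d_opt` and `d_scale` are hypothetical, not taken from the paper.

```python
import math

def angle_advantage(angle_deg: float) -> float:
    """1.0 when pointing straight at the target (0 deg), 0.0 at 180 deg."""
    return 1.0 - min(abs(angle_deg), 180.0) / 180.0

def distance_advantage(distance_m: float, d_opt: float = 1000.0,
                       d_scale: float = 2000.0) -> float:
    """Peaks at a hypothetical optimal engagement range d_opt."""
    return math.exp(-((distance_m - d_opt) ** 2) / (2.0 * d_scale ** 2))

def fixed_weight_reward(angle_deg: float, distance_m: float,
                        w_angle: float = 0.5, w_dist: float = 0.5) -> float:
    """Conventional baseline: a weighted sum with fixed weights."""
    return w_angle * angle_advantage(angle_deg) + w_dist * distance_advantage(distance_m)

def coupled_reward(angle_deg: float, distance_m: float) -> float:
    """Illustrative coupled reward: the angle and distance terms multiply,
    so an angular advantage only pays off when the range is also favorable.
    A sketch of the coupling idea, not the paper's exact formula."""
    return angle_advantage(angle_deg) * distance_advantage(distance_m)

if __name__ == "__main__":
    # Head-on at near-optimal range scores high; tail-chase at long range scores low.
    print(coupled_reward(10.0, 1200.0))   # ~0.94
    print(coupled_reward(150.0, 6000.0))  # ~0.007
```

Under a fixed-weight sum, a strong angle term can mask a hopeless range (or vice versa); the multiplicative coupling above avoids that failure mode, which matches the abstract's motivation for replacing fixed reward weights.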
ISSN: 2199-4536, 2198-6053