The Application of Reinforcement Learning for Interceptor Guidance
Abstract
Progress in hypersonic vehicle research and development has presented a challenge to modern missile defenses. These attack vehicles travel at speeds of Mach 5 and above, fly low trajectories that result in late radar detection, and can be highly maneuverable. To counter them, new interceptors must be developed. This work explores using machine learning to guide these interceptors through applied steering commands, with the intent of improving upon traditional guidance methods. Specifically, proximal policy optimization (PPO) was selected as the reinforcement learning algorithm for its efficiency and its successful use in related work. A framework was developed for the interceptor guidance problem, combining the PPO algorithm with a specialized reward shaping method and parameters tuned for the engagements of interest. Low-fidelity vehicle models were used to reduce training time and to focus the scope of the work on improving the guidance algorithms. Models were trained and tested on several case studies to understand the benefits and limitations of an intelligently guided interceptor. Performance comparisons between the trained guidance models and traditional guidance methods were made for cases with supersonic, hypersonic, weaving, and dynamically evasive attack vehicles. The models performed well with initial conditions outside their training sets, but engagements that differed more significantly had to be included in training. The models were therefore found to be more rigid than desired, which limits their effectiveness in new engagements. Compared to the traditional methods, the PPO-guided interceptor intercepted the attacker faster in most cases and had a smaller miss distance against several evasive attackers. However, the PPO-guided interceptor had a lower percent kill against nonmaneuvering attackers and typically required larger lateral acceleration commands than traditional methods. This work provides a foundation for using machine learning to guide missile interceptors and presents both the benefits and the limitations of a current implementation. Proposals for future efforts involve increasing the fidelity and complexity of the vehicles, engagements, and guidance methods.
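For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective that gives the algorithm its name. It is not taken from this work: the function name `ppo_clip_objective`, its argument names, and the clip range of 0.2 (a commonly used default) are assumptions made for illustration, and the abstract does not state the reward shaping or hyperparameters actually used.

```python
import math


def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over a batch (to be maximized).

    logp_new / logp_old: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    clip_eps: clip range (0.2 is an assumed, commonly used value).
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the new and old policies.
        ratio = math.exp(lp_new - lp_old)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The clipping removes the incentive to move the policy far from the one that collected the data in a single update, which is the property usually credited for PPO's stable training.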