Considerations of Reinforcement Learning within Real-Time Wireless Communication Systems


Virginia Tech


Heavily afflicted by spectrum congestion, the unpredictable, dynamic conditions of the radio frequency (RF) spectrum have increasingly become a major obstacle for devices today. More specifically, a significant threat within this kind of environment is interference caused by collisions, which is increasingly unavoidable in an overcrowded spectrum. Thus, these devices require a way to avoid such events. Cognitive radios (CRs) were proposed as a solution because of their transmission adaptability and decision-making capabilities. Through spectrum sensing, a CR captures the current condition of the RF spectrum and, based on its decision-making strategy, interprets these results to make an informed decision on what to do next to optimize its own communication. With the emergence of artificial intelligence, one such decision-making strategy CRs can utilize is Reinforcement Learning (RL). Unlike standard adaptive radios, CRs equipped with RL can predict the conditions of the RF spectrum and, using these predictions, determine what they must do in the future to operate optimally.

Because RL is useful in hard-to-predict environments such as the RF spectrum, research on RL within CRs has become more popular over the past decade, especially for interference mitigation. However, the existing literature neglects the limitations that threaten the proper implementation of RL in RF systems. This thesis therefore investigates which limitations in real-time communication systems can hinder the performance of RL and, in light of these limitations, emphasizes the considerations that should be a focus in the design and implementation of radio frequency reinforcement learning (RFRL) systems. The effects of latency, power, wireless channel impairments, different transmission protocols, and different spectrum sensing detectors are among the possible limitations simulated and analyzed in this work; these are not typically considered in simulation-based prior art. To perform this investigation, a representative real-time OFDM transmit/receive chain is implemented within the GNU Radio framework. The system, operating over the air through USRPs, leverages reinforcement learning, such as Q-learning, to avoid interference with other spectrum users. Performance analysis of this representative system provides a systematic approach for predicting limiting factors within an implemented real-time system and thus aims to provide guidance on how to design these systems with these practical limitations in mind.
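
To illustrate the kind of Q-learning-based interference avoidance described above, the following is a minimal Python sketch. It is not the thesis's GNU Radio implementation. It makes several assumptions that do not come from the abstract: a simulated interferer that sweeps channels in a fixed order stands in for over-the-air spectrum sensing, the state is the channel on which the interferer was last sensed, the reward is +1 for a collision-free transmission and -1 for a collision, and the names `train_q_agent` and `greedy_policy` and all hyperparameter values are hypothetical.

```python
import random


def train_q_agent(num_channels=4, steps=20000, alpha=0.1, gamma=0.9,
                  epsilon=0.1, seed=0):
    """Tabular Q-learning against a simulated sweeping interferer.

    Assumptions (illustrative only): the state is the channel where the
    interferer was last sensed, the action is the channel to transmit on
    next, and the interferer advances one channel per time step.
    """
    rng = random.Random(seed)
    # q[state][action]: estimated return for choosing `action` in `state`.
    q = [[0.0] * num_channels for _ in range(num_channels)]
    interferer = 0
    for _ in range(steps):
        state = interferer  # result of the most recent sensing step
        # Epsilon-greedy action selection.
        if rng.random() < epsilon:
            action = rng.randrange(num_channels)
        else:
            action = max(range(num_channels), key=lambda a: q[state][a])
        # The interferer hops to the next channel before we transmit.
        interferer = (interferer + 1) % num_channels
        reward = -1.0 if action == interferer else 1.0
        next_state = interferer
        # Standard Q-learning update.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
    return q


def greedy_policy(q):
    """Return the highest-valued action (channel) for each state."""
    return [max(range(len(row)), key=lambda a: row[a]) for row in q]


if __name__ == "__main__":
    policy = greedy_policy(train_q_agent())
    for sensed, chosen in enumerate(policy):
        print(f"interferer last seen on {sensed} -> transmit on {chosen}")
```

In this toy setting the learned policy avoids the channel the interferer is about to occupy. A real-time system would additionally face the sensing latency, channel impairments, and detector errors that the thesis examines, none of which are modeled here.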



Wireless Communications, Reinforcement Learning, Intelligent Radio, Spectrum Avoidance