Reinforcement Learning with Gaussian Processes for Unmanned Aerial Vehicle Navigation
We study the problem of Reinforcement Learning (RL) for Unmanned Aerial Vehicle (UAV) navigation with the smallest number of real world samples possible. This work is motivated by applications of learning autonomous navigation for aerial robots in structural inspec- tion. A naive RL implementation suffers from curse of dimensionality in large continuous state spaces. Gaussian Processes (GPs) exploit the spatial correlation to approximate state- action transition dynamics or value function in large state spaces. By incorporating GPs in naive Q-learning we achieve better performance in smaller number of samples. The evalua- tion is performed using simulations with an aerial robot. We also present a Multi-Fidelity Reinforcement Learning (MFRL) algorithm that leverages Gaussian Processes to learn the optimal policy in a real world environment leveraging samples gathered from a lower fidelity simulator. In MFRL, an agent uses multiple simulators of the real environment to perform actions. With multiple levels of fidelity in a simulator chain, the number of samples used in successively higher simulators can be reduced.