Reinforcement Learning with Gaussian Processes for Unmanned Aerial Vehicle Navigation

dc.contributor.author: Gondhalekar, Nahush Ramesh
dc.contributor.committeechair: Tokekar, Pratap
dc.contributor.committeemember: Zeng, Haibo
dc.contributor.committeemember: Abbott, A. Lynn
dc.contributor.department: Electrical and Computer Engineering
dc.date.accessioned: 2017-08-04T08:00:43Z
dc.date.available: 2017-08-04T08:00:43Z
dc.date.issued: 2017-08-03
dc.description.abstract: We study the problem of Reinforcement Learning (RL) for Unmanned Aerial Vehicle (UAV) navigation using as few real-world samples as possible. This work is motivated by applications of learning autonomous navigation for aerial robots in structural inspection. A naive RL implementation suffers from the curse of dimensionality in large continuous state spaces. Gaussian Processes (GPs) exploit spatial correlation to approximate the state-action transition dynamics or the value function in large state spaces. By incorporating GPs into naive Q-learning, we achieve better performance with fewer samples. The evaluation is performed using simulations with an aerial robot. We also present a Multi-Fidelity Reinforcement Learning (MFRL) algorithm that leverages GPs to learn the optimal policy in a real-world environment using samples gathered from a lower-fidelity simulator. In MFRL, an agent uses multiple simulators of the real environment to perform actions. With multiple levels of fidelity in a simulator chain, the number of samples used in successively higher-fidelity simulators can be reduced.
dc.description.abstractgeneral: Infrastructure inspection using Unmanned Aerial Vehicles (UAVs) has seen increasing development in recent years. This thesis presents work on UAV navigation using Reinforcement Learning (RL) with as few real-world samples as possible. A naive RL implementation suffers from the curse of dimensionality in large continuous state spaces. Gaussian Processes (GPs) exploit spatial correlation to approximate the state-action transition dynamics or the value function in large state spaces. By incorporating GPs into naive Q-learning, we achieve better performance with fewer samples. The evaluation is performed using simulations with an aerial robot. We also present a Multi-Fidelity Reinforcement Learning (MFRL) algorithm that leverages GPs to learn the optimal policy in a real-world environment using samples gathered from a lower-fidelity simulator. In MFRL, an agent uses multiple simulators of the real environment to perform actions. With multiple levels of fidelity in a simulator chain, the number of samples used in successively higher-fidelity simulators can be reduced. By developing a bidirectional simulator chain, we aim to provide a learning platform on which robots can safely learn the required skills with the smallest possible number of real-world samples.
dc.description.degree: Master of Science
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:12330
dc.identifier.uri: http://hdl.handle.net/10919/78667
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Reinforcement Learning
dc.subject: Gaussian Processes
dc.subject: Unmanned Aerial Vehicle Navigation
dc.title: Reinforcement Learning with Gaussian Processes for Unmanned Aerial Vehicle Navigation
dc.type: Thesis
thesis.degree.discipline: Computer Engineering
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: masters
thesis.degree.name: Master of Science

Files

Original bundle
Name: Gondhalekar_NR_T_2017.pdf
Size: 42.29 MB
Format: Adobe Portable Document Format

Collections