Sample Complexity of Incremental Policy Gradient Methods for Solving Multi-Task Reinforcement Learning

Bai, Yitao

dc.contributor.author	Bai, Yitao	en
dc.contributor.committeechair	Doan, Thinh T.	en
dc.contributor.committeemember	Stilwell, Daniel J.	en
dc.contributor.committeemember	Jin, Ming	en
dc.contributor.department	Electrical and Computer Engineering	en
dc.date.accessioned	2024-04-30T12:20:32Z	en
dc.date.available	2024-04-30T12:20:32Z	en
dc.date.issued	2024-04-05	en
dc.description.abstract	We consider a multi-task learning problem in which an agent is presented with N reinforcement learning tasks. To solve this problem, we study the gradient approach, which iteratively updates an estimate of the optimal policy using the gradients of the value functions. The classic policy gradient method, however, may be expensive to implement in multi-task settings because it requires access to the gradients of all the tasks at every iteration. To circumvent this issue, in this paper we study an incremental policy gradient method, where the agent uses the gradient of only one task at each iteration. Our main contribution is to provide theoretical results that characterize the performance of the proposed method. In particular, we show that incremental policy gradient methods converge to the optimal value of the multi-task reinforcement learning objectives at a sublinear rate O(1/√k), where k is the number of iterations. To illustrate its performance, we apply the proposed method to a simple multi-task variant of GridWorld problems, where an agent seeks a policy to navigate effectively in different environments.	en
dc.description.abstractgeneral	First, we introduce a popular machine learning technique called Reinforcement Learning (RL), where an agent, such as a robot, uses a policy to choose an action, like moving forward, based on observations from sensors like cameras. The agent receives a reward that helps judge whether the policy is good or bad. The objective of the agent is to find a policy that maximizes the cumulative reward it receives by repeating the above process. RL has many applications, including Cruise autonomous cars, Google industry automation, training ChatGPT language models, and Walmart inventory management. However, RL suffers from task sensitivity and requires a lot of training data. For example, if the task changes slightly, the agent needs to train the policy from the beginning. This motivates the technique called Multi-Task Reinforcement Learning (MTRL), where different tasks give different rewards and the agent maximizes the sum of the cumulative rewards of all the tasks. We focus on the incremental setting, where the agent can only access the tasks one at a time, in random order. In this case, only one agent is needed, and it is not required to know which task it is performing. We show that the incremental policy gradient methods we propose converge to the optimal value of the MTRL objectives at a sublinear rate O(1/√k), where k is the number of iterations. To illustrate its performance, we apply the proposed method to a simple multi-task variant of GridWorld problems, where an agent seeks a policy to navigate effectively in different environments.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.format.mimetype	application/pdf	en
dc.identifier.uri	https://hdl.handle.net/10919/118699	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	CC0 1.0 Universal	en
dc.rights.uri	http://creativecommons.org/publicdomain/zero/1.0/	en
dc.subject	Markov decision processes	en
dc.subject	Multi-task reinforcement learning	en
dc.title	Sample Complexity of Incremental Policy Gradient Methods for Solving Multi-Task Reinforcement Learning	en
dc.type	Thesis	en
dc.type.dcmitype	Text	en
thesis.degree.discipline	Electrical Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.name	Master of Science	en
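
The incremental scheme described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes each task is a single-state problem (a bandit) with a known reward vector, a softmax policy, exact per-task gradients, uniform random task sampling, and a step size of 1/√(k+1). The names `task_gradient`, `incremental_policy_gradient`, and `TASK_REWARDS` are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of action preferences."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def task_gradient(theta, rewards):
    """Exact gradient of J_i(theta) = sum_a pi_theta(a) * r_i(a) for a softmax policy."""
    pi = softmax(theta)
    value = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - value) for p, r in zip(pi, rewards)]


def incremental_policy_gradient(task_rewards, num_iters=5000, step=1.0, seed=0):
    """Gradient ascent on the sum of task values using one task's gradient per iteration."""
    rng = random.Random(seed)
    theta = [0.0] * len(task_rewards[0])
    for k in range(num_iters):
        rewards = rng.choice(task_rewards)   # assumption: tasks sampled uniformly at random
        grad = task_gradient(theta, rewards)
        alpha = step / math.sqrt(k + 1)      # assumption: diminishing step size
        theta = [t + alpha * g for t, g in zip(theta, grad)]
    return softmax(theta)


# Hypothetical rewards: action 0 is best for task 0 and action 1 for task 1,
# but action 2 maximizes the sum of rewards over all three tasks.
TASK_REWARDS = [
    [1.0, 0.0, 0.7],
    [0.0, 1.0, 0.7],
    [0.1, 0.1, 0.7],
]

policy = incremental_policy_gradient(TASK_REWARDS)
best_action = max(range(len(policy)), key=policy.__getitem__)
print(best_action, [round(p, 3) for p in policy])
```

In this toy setting, a single shared policy has to trade off the tasks, so the action it favors is the one that maximizes the summed reward rather than the best action for any one of the first two tasks.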

Files

Original bundle

Name: Master_Thesis_Yitao_Bai.pdf
Size: 1.69 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 1.5 KB
Format: Item-specific license agreed upon to submission

Collections

Masters Theses