Sample Complexity of Incremental Policy Gradient Methods for Solving Multi-Task Reinforcement Learning

Date

2024-04-05

Publisher

Virginia Tech

Abstract

We consider a multi-task learning problem where an agent is presented with N reinforcement learning tasks. To solve this problem, we are interested in studying the gradient approach, which iteratively updates an estimate of the optimal policy using the gradients of the value functions. The classic policy gradient method, however, may be expensive to implement in the multi-task setting, as it requires access to the gradients of all the tasks at every iteration. To circumvent this issue, in this paper we propose to study an incremental policy gradient method, where the agent uses the gradient of only one task at each iteration. Our main contribution is to provide theoretical results characterizing the performance of the proposed method. In particular, we show that incremental policy gradient methods converge to the optimal value of the multi-task reinforcement learning objective at a sublinear rate O(1/√k), where k is the number of iterations. To illustrate its performance, we apply the proposed method to a simple multi-task variant of the GridWorld problem, where an agent seeks a policy that navigates effectively in several different environments.
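
To make the update concrete, below is a minimal Python sketch of an incremental policy gradient loop on a multi-task GridWorld: at each iteration the shared softmax policy parameters are updated using a REINFORCE-style gradient estimate from only one task, with a 1/√k step size consistent with the stated sublinear rate. The GridWorld construction, the Monte Carlo gradient estimator, the cyclic task order, and the step-size schedule are assumptions made for illustration, not the paper's exact construction.

import numpy as np

class GridWorld:
    """Small grid; actions: up, down, left, right; episode ends at the goal."""
    def __init__(self, size=4, goal=(3, 3), max_steps=50):
        self.size, self.goal, self.max_steps = size, goal, max_steps

    def reset(self):
        self.pos, self.t = (0, 0), 0
        return self.pos[0] * self.size + self.pos[1]

    def step(self, a):
        moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
        r, c = self.pos
        self.pos = (min(max(r + moves[a][0], 0), self.size - 1),
                    min(max(c + moves[a][1], 0), self.size - 1))
        self.t += 1
        done = self.pos == self.goal or self.t >= self.max_steps
        reward = 1.0 if self.pos == self.goal else -0.01
        return self.pos[0] * self.size + self.pos[1], reward, done

def softmax_policy(theta, s):
    p = np.exp(theta[s] - theta[s].max())
    return p / p.sum()

def policy_gradient_one_task(theta, env, episodes=10, gamma=0.95):
    """Monte Carlo (REINFORCE-style) estimate of one task's policy gradient."""
    grad = np.zeros_like(theta)
    for _ in range(episodes):
        s, done, traj = env.reset(), False, []
        while not done:
            p = softmax_policy(theta, s)
            a = np.random.choice(len(p), p=p)
            s_next, r, done = env.step(a)
            traj.append((s, a, r))
            s = s_next
        G = 0.0
        for s, a, r in reversed(traj):  # accumulate returns backward
            G = r + gamma * G
            g = -softmax_policy(theta, s)
            g[a] += 1.0                 # gradient of log softmax at (s, a)
            grad[s] += G * g
    return grad / episodes

def incremental_policy_gradient(tasks, n_states, n_actions, iters=300):
    theta = np.zeros((n_states, n_actions))   # shared policy parameters
    for k in range(1, iters + 1):
        env = tasks[k % len(tasks)]           # gradient of only one task per iteration
        alpha = 1.0 / np.sqrt(k)              # step size matching the O(1/sqrt(k)) rate
        theta += alpha * policy_gradient_one_task(theta, env)
    return theta

# Example: three GridWorld tasks that differ only in goal location.
tasks = [GridWorld(goal=g) for g in [(3, 3), (0, 3), (3, 0)]]
theta = incremental_policy_gradient(tasks, n_states=16, n_actions=4)

Cycling through the tasks in a fixed order is one natural choice for the incremental scheme; sampling one task uniformly at random per iteration is a common alternative.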

Keywords

Markov decision processes, Multi-task reinforcement learning
