A deep reinforcement learning framework for solving two-stage stochastic programs
dc.contributor.author | Yilmaz, Dogacan | en |
dc.contributor.author | Büyüktahtakın, İ. Esra | en |
dc.date.accessioned | 2025-03-20T19:14:00Z | en |
dc.date.available | 2025-03-20T19:14:00Z | en |
dc.date.issued | 2023-05-31 | en |
dc.description.abstract | In this study, we present a deep reinforcement learning framework for solving scenario-based two-stage stochastic programming problems. Stochastic programs have numerous real-time applications, such as scheduling, disaster management, and route planning, yet they are computationally challenging to solve and require specially designed solution strategies such as hand-crafted heuristics. To the best of our knowledge, this is the first study to decompose two-stage stochastic programs with a multi-agent structure within a deep reinforcement learning algorithmic framework in order to solve them faster. Specifically, we propose a general two-stage deep reinforcement learning framework, in which two different learning agents sequentially learn to solve each stage of the problem, that can generate high-quality solutions within a fraction of a second. Because the decisions are interconnected across the stages, the first-stage agent is trained with feedback from the second-stage agent using a new policy gradient formulation. We demonstrate our framework on a general multi-dimensional stochastic knapsack problem. The results show that solution times can be reduced by up to five orders of magnitude while maintaining optimality gaps of around 7%. Moreover, a decision-making agent trained on only a few scenarios can solve problems with many scenarios and achieve a significant reduction in solution times. Given the vast state and action spaces of the problem of interest, the results indicate a promising direction for generating fast solutions to stochastic online optimization problems without expert knowledge. | en |
dc.description.version | Accepted version | en |
dc.format.extent | Pages 1993-2020 | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.doi | https://doi.org/10.1007/s11590-023-02009-5 | en |
dc.identifier.eissn | 1862-4480 | en |
dc.identifier.issn | 1862-4472 | en |
dc.identifier.issue | 9 | en |
dc.identifier.orcid | Buyuktahtakin Toy, Esra [0000-0001-8928-2638] | en |
dc.identifier.uri | https://hdl.handle.net/10919/124892 | en |
dc.identifier.volume | 18 | en |
dc.language.iso | en | en |
dc.publisher | Springer | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.title | A deep reinforcement learning framework for solving two-stage stochastic programs | en |
dc.title.serial | Optimization Letters | en |
dc.type | Article - Refereed | en |
dc.type.dcmitype | Text | en |
pubs.organisational-group | Virginia Tech | en |
pubs.organisational-group | Virginia Tech/Engineering | en |
pubs.organisational-group | Virginia Tech/Engineering/Industrial and Systems Engineering | en |
pubs.organisational-group | Virginia Tech/All T&R Faculty | en |
pubs.organisational-group | Virginia Tech/Engineering/COE T&R Faculty | en |
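
The abstract describes two policy-gradient agents in which the first-stage agent is trained with feedback from the second-stage agent. The following is a minimal, hypothetical Python sketch of that general idea on a toy single-constraint stochastic knapsack. It is not the authors' implementation: the network sizes, penalty weight, residual-capacity coupling, and the plain REINFORCE-style update below are all illustrative assumptions, standing in for the paper's own policy gradient formulation.

    # Sketch (assumed, not the paper's code) of two policy-gradient agents for a
    # toy two-stage stochastic knapsack: the first-stage agent's update uses the
    # full two-stage return, i.e., feedback from the second-stage agent.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    N_ITEMS, N_SCEN = 10, 16                     # items and sampled scenarios
    values1 = torch.rand(N_ITEMS)                # deterministic stage-1 item values
    weights = torch.rand(N_ITEMS)                # item weights shared across stages
    CAP = 3.0                                    # knapsack capacity

    class Agent(nn.Module):
        """Bernoulli policy over binary item-selection decisions."""
        def __init__(self, in_dim, n_items):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                     nn.Linear(64, n_items))
        def act(self, state):
            dist = torch.distributions.Bernoulli(logits=self.net(state))
            x = dist.sample()                    # sampled 0/1 item picks
            return x, dist.log_prob(x).sum(-1)   # picks and their log-probability

    def profit(x, vals, wts, cap):
        """Knapsack profit with a soft penalty for exceeding the capacity."""
        return (x * vals).sum() - 10.0 * torch.relu((x * wts).sum() - cap)

    stage1 = Agent(N_ITEMS, N_ITEMS)             # observes stage-1 item values
    stage2 = Agent(2 * N_ITEMS, N_ITEMS)         # observes scenario data + stage-1 picks
    opt = torch.optim.Adam(list(stage1.parameters()) +
                           list(stage2.parameters()), lr=1e-3)

    for step in range(2000):
        x1, logp1 = stage1.act(values1)
        r1 = profit(x1, values1, weights, CAP)
        cap2 = CAP - (x1 * weights).sum()        # stage 2 gets the residual capacity
        # Sample scenario item values; the stage-2 agent reacts to each realization.
        scen_vals = torch.rand(N_SCEN, N_ITEMS)
        logps2, rs2 = [], []
        for s in range(N_SCEN):
            x2, lp = stage2.act(torch.cat([scen_vals[s], x1]))
            x2 = x2 * (1 - x1)                   # recourse cannot re-pick stage-1 items
            logps2.append(lp)
            rs2.append(profit(x2, scen_vals[s], weights, cap2))
        logp2, r2 = torch.stack(logps2), torch.stack(rs2)
        # REINFORCE: stage 1 is reinforced by the full two-stage return (its own
        # profit plus the mean stage-2 profit); stage 2 by its own scenario rewards.
        total = r1 + r2.mean()
        loss = -(logp1 * total.detach() + (logp2 * r2.detach()).mean())
        opt.zero_grad(); loss.backward(); opt.step()

In this sketch, the gradient signal for the first-stage policy is the sampled two-stage return, mirroring the abstract's description of training the first-stage agent with the second-stage agent's feedback; the paper's actual state encodings, reward structure, and training schedule may differ.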