A deep reinforcement learning framework for solving two-stage stochastic programs

dc.contributor.author: Yilmaz, Dogacan
dc.contributor.author: Büyüktahtakın, İ. Esra
dc.date.accessioned: 2025-03-20T19:14:00Z
dc.date.available: 2025-03-20T19:14:00Z
dc.date.issued: 2023-05-31
dc.description.abstract: In this study, we present a deep reinforcement learning framework for solving scenario-based two-stage stochastic programming problems. Stochastic programs have numerous real-time applications, such as scheduling, disaster management, and route planning, yet they are computationally challenging to solve and require specially designed solution strategies such as hand-crafted heuristics. To the best of our knowledge, this is the first study that decomposes two-stage stochastic programs with a multi-agent structure in a deep reinforcement learning algorithmic framework to solve them faster. Specifically, we propose a general two-stage deep reinforcement learning framework that can generate high-quality solutions within a fraction of a second, in which two different learning agents sequentially learn to solve each stage of the problem. Because the decisions are interconnected across the stages, the first-stage agent is trained with the feedback of the second-stage agent using a new policy gradient formulation. We demonstrate our framework on a general multi-dimensional stochastic knapsack problem. The results show that solution time can be reduced by up to five orders of magnitude with sufficiently good optimality gaps of around 7%. Moreover, a decision-making agent trained on only a few scenarios can solve problems with many scenarios and still achieve a significant reduction in solution time. Considering the vast state and action space of the problem of interest, the results show a promising direction for generating fast solutions to stochastic online optimization problems without expert knowledge.
dc.description.version: Accepted version
dc.format.extent: Pages 1993-2020
dc.format.mimetype: application/pdf
dc.identifier.doi: https://doi.org/10.1007/s11590-023-02009-5
dc.identifier.eissn: 1862-4480
dc.identifier.issn: 1862-4472
dc.identifier.issue: 9
dc.identifier.orcid: Buyuktahtakin Toy, Esra [0000-0001-8928-2638]
dc.identifier.uri: https://hdl.handle.net/10919/124892
dc.identifier.volume: 18
dc.language.iso: en
dc.publisher: Springer
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.title: A deep reinforcement learning framework for solving two-stage stochastic programs
dc.title.serial: Optimization Letters
dc.type: Article - Refereed
dc.type.dcmitype: Text
pubs.organisational-group: Virginia Tech
pubs.organisational-group: Virginia Tech/Engineering
pubs.organisational-group: Virginia Tech/Engineering/Industrial and Systems Engineering
pubs.organisational-group: Virginia Tech/All T&R Faculty
pubs.organisational-group: Virginia Tech/Engineering/COE T&R Faculty
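
Note: The abstract above describes the framework only at a high level. As a rough illustration of the kind of two-stage decomposition it refers to, and not the paper's actual formulation or code, the sketch below trains a second-stage agent on a toy stochastic knapsack and then trains a first-stage agent whose policy-gradient reward adds the frozen second-stage agent's return on sampled scenarios. All problem sizes, network shapes, and names (PolicyNet, rollout, sample_scenario) are hypothetical.

```python
# Illustrative sketch only: a toy two-stage stochastic knapsack solved with two
# REINFORCE-style agents. The first-stage agent's reward includes the (frozen)
# second-stage agent's return, mirroring the cross-stage feedback idea.
# None of these names or hyperparameters come from the paper.
import torch
import torch.nn as nn
from torch.distributions import Bernoulli

torch.manual_seed(0)

N1, N2, CAP = 5, 5, 10.0                          # item counts per stage, capacity
W1, V1 = torch.rand(N1) * 4, torch.rand(N1) * 5   # first-stage weights / values

def sample_scenario():
    """Random second-stage weights and values (a scenario realization)."""
    return torch.rand(N2) * 4, torch.rand(N2) * 5

class PolicyNet(nn.Module):
    """Maps an item's (weight, value, remaining capacity) to a take probability."""
    def __init__(self, in_dim=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 32), nn.ReLU(), nn.Linear(32, 1))
    def forward(self, x):
        return torch.sigmoid(self.net(x)).squeeze(-1)

def rollout(policy, weights, values, cap):
    """Sequential take/skip decisions; returns reward, log-prob sum, leftover capacity."""
    logp, reward = torch.zeros(()), torch.zeros(())
    for w, v in zip(weights, values):
        dist = Bernoulli(policy(torch.stack([w, v, torch.tensor(cap)])))
        a = dist.sample()
        if a.item() == 1 and w.item() <= cap:     # take the item only if it fits
            cap -= w.item()
            reward = reward + v
        logp = logp + dist.log_prob(a)
    return reward, logp, cap

stage2, stage1 = PolicyNet(), PolicyNet()
opt2 = torch.optim.Adam(stage2.parameters(), lr=1e-2)
opt1 = torch.optim.Adam(stage1.parameters(), lr=1e-2)

# 1) Train the second-stage agent on random scenarios and residual capacities.
for _ in range(300):
    w2, v2 = sample_scenario()
    r, logp, _ = rollout(stage2, w2, v2, float(torch.rand(()) * CAP))
    loss = -r.detach() * logp                     # REINFORCE objective
    opt2.zero_grad()
    loss.backward()
    opt2.step()

# 2) Train the first-stage agent; its reward adds the frozen second-stage
#    agent's return on a sampled scenario (feedback across the stages).
for _ in range(300):
    r1, logp1, cap_left = rollout(stage1, W1, V1, CAP)
    with torch.no_grad():
        w2, v2 = sample_scenario()
        r2, _, _ = rollout(stage2, w2, v2, cap_left)
    loss = -(r1 + r2).detach() * logp1
    opt1.zero_grad()
    loss.backward()
    opt1.step()

r1, _, cap_left = rollout(stage1, W1, V1, CAP)
print(f"first-stage value ~{r1.item():.2f}, remaining capacity ~{cap_left:.2f}")
```

The paper's agents and its policy gradient formulation are considerably more elaborate; this sketch only mirrors the structural point that the first-stage objective couples first-stage profit with the second-stage agent's feedback.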

Files

Original bundle
Name: template_v4.pdf
Size: 1.35 MB
Format: Adobe Portable Document Format
Description: Accepted version
License bundle
Name: license.txt
Size: 1.5 KB
Format: Plain Text
Description: