Data-Driven Sample Average Approximation with Covariate Information

Date

2025-01-06

Publisher

INFORMS

Abstract

We study optimization for data-driven decision-making when we have observations of the uncertain parameters within an optimization model together with concurrent observations of covariates. The goal is to choose a decision that minimizes the expected cost conditioned on a new covariate observation. We investigate two data-driven frameworks that integrate a machine learning prediction model within a stochastic programming sample average approximation (SAA) for approximating the solution to this problem. One SAA framework is new and uses leave-one-out residuals for scenario generation. The frameworks we investigate are flexible and accommodate parametric, nonparametric, and semiparametric regression techniques. We derive conditions on the data generation process, the prediction model, and the stochastic program under which solutions of these data-driven SAAs are consistent and asymptotically optimal, and also derive finite sample guarantees. Computational experiments validate our theoretical results, demonstrate examples where our data-driven formulations have advantages over existing approaches (even if the prediction model is misspecified), and illustrate the benefits of our data-driven formulations in the limited data regime.
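
As a rough illustration of the idea described in the abstract, the sketch below builds scenarios for a new covariate observation by adding regression residuals to a point prediction, and then solves a sample average approximation over those scenarios. The abstract does not fix these details, so everything specific here is an assumption made for illustration: the ordinary least squares prediction model, the newsvendor cost with hypothetical parameters `c_under` and `c_over`, the synthetic data, and all function names. The `leave_one_out=True` branch shows one plausible reading of "leave-one-out residuals", where each residual comes from a model refit without that observation.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Evaluate the fitted linear model on the rows of X."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def residual_scenarios(X, y, x_new, leave_one_out=False):
    """Scenarios = prediction at x_new + empirical residuals (assumed form)."""
    n = len(y)
    coef = fit_ols(X, y)
    if leave_one_out:
        # Residual i is computed from a model refit without observation i.
        res = np.empty(n)
        for i in range(n):
            mask = np.arange(n) != i
            coef_i = fit_ols(X[mask], y[mask])
            res[i] = y[i] - predict(coef_i, X[i:i + 1])[0]
    else:
        # In-sample residuals from the model fit on all data.
        res = y - predict(coef, X)
    return predict(coef, x_new[None, :])[0] + res


def newsvendor_saa(scenarios, c_under=4.0, c_over=1.0):
    """Minimize the average newsvendor cost over the scenarios.

    The objective is piecewise linear and convex with breakpoints at the
    scenarios, so it suffices to search over the scenario values.
    """
    costs = [
        np.mean(c_under * np.maximum(scenarios - z, 0.0)
                + c_over * np.maximum(z - scenarios, 0.0))
        for z in scenarios
    ]
    return scenarios[int(np.argmin(costs))]


# Synthetic data (purely illustrative): demand depends linearly on 2 covariates.
rng = np.random.default_rng(0)
n = 40
X = rng.normal(size=(n, 2))
y = 10.0 + X @ np.array([2.0, -1.0]) + rng.normal(scale=0.5, size=n)
x_new = np.array([0.3, -0.2])

scen_er = residual_scenarios(X, y, x_new)
scen_loo = residual_scenarios(X, y, x_new, leave_one_out=True)
z_er = newsvendor_saa(scen_er)
z_loo = newsvendor_saa(scen_loo)
print("decision with in-sample residuals:    ", z_er)
print("decision with leave-one-out residuals:", z_loo)
```

For least squares, each leave-one-out residual is at least as large in magnitude as the corresponding in-sample residual, so the second scenario set is more spread out. This is one way to see why such residuals might help when data are limited, though the sketch makes no claim about matching the paper's formulations or results.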

Keywords

data-driven stochastic programming, covariates, regression, sample average approximation, jackknife, large deviations

Citation