An Optimal Constrained Pruning Strategy for Decision Trees
This paper addresses the optimal constrained pruning of decision trees. We present a novel 0-1 programming model that prunes a tree to minimize a general penalty function defined on the resulting leaf nodes, and show that this model possesses a totally unimodular structure that enables it to be solved as a shortest-path problem on an acyclic graph. We further prove that the problem can be solved in strongly polynomial time when an additional constraint on the number of residual leaf nodes is incorporated. The proposed modeling framework also accommodates different (multiple) objective functions and side-constraints, and we identify several such modeling options that can be applied in practice. The methodology is illustrated with a numerical example to provide insights, and computational results are presented to demonstrate the efficacy of solving generically constrained problems of this type. We also apply the technique to a large-scale Transportation Analysis and Simulation System (TRANSIMS), and present computational results on real data that exhibit the flexibility and effectiveness of the proposed approach.
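To make the leaf-constrained pruning problem concrete, the following is a minimal sketch under stated assumptions. It is not the 0-1 programming or shortest-path formulation developed in this paper. It is a plain bottom-up dynamic program that assumes the penalty is a sum of per-node penalties over the resulting leaves. The names `prune`, `children`, `penalty`, and `max_leaves`, as well as the toy tree and its penalty values, are hypothetical and chosen only for illustration.

```python
def prune(children, penalty, root, max_leaves):
    """Cheapest pruning of the tree with at most `max_leaves` leaves.

    Assumes max_leaves >= 1. Returns (total_penalty, leaves), where
    `leaves` lists the nodes that become leaves of the pruned tree.
    Assumption: the objective is the sum of per-node leaf penalties.
    """
    def solve(v):
        # table[k] = (cost, leaves) of the cheapest way to prune the
        # subtree rooted at v so that it ends in exactly k leaves.
        table = {1: (penalty[v], [v])}  # option: collapse v into a leaf
        kids = children.get(v, [])
        if not kids:
            return table
        # Option: keep v internal, so every child subtree is pruned
        # independently; combine the child tables one child at a time.
        combined = {0: (0, [])}
        for child in kids:
            child_table = solve(child)
            merged = {}
            for k1, (c1, l1) in combined.items():
                for k2, (c2, l2) in child_table.items():
                    k = k1 + k2
                    if k > max_leaves:
                        continue  # violates the leaf-count constraint
                    if k not in merged or c1 + c2 < merged[k][0]:
                        merged[k] = (c1 + c2, l1 + l2)
            combined = merged
        for k, entry in combined.items():
            if k not in table or entry[0] < table[k][0]:
                table[k] = entry
        return table

    return min(solve(root).values(), key=lambda entry: entry[0])


# Hypothetical toy tree: node 0 is the root, node 1 is an internal node.
children = {0: [1, 2], 1: [3, 4]}
penalty = {0: 10, 1: 6, 2: 1, 3: 1, 4: 2}

print(prune(children, penalty, root=0, max_leaves=3))  # (4, [3, 4, 2])
print(prune(children, penalty, root=0, max_leaves=2))  # (7, [1, 2])
```

In this toy instance, tightening the leaf limit from three to two forces node 1 to be collapsed into a leaf, which raises the total penalty from 4 to 7. This is the kind of trade-off that a constraint on the number of residual leaf nodes introduces.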