Total Organic Carbon and Clay Estimation in Shale Reservoirs Using Automatic Machine Learning



Virginia Tech


High total organic carbon (TOC) and low clay content are two criteria for identifying "sweet spots" in shale gas plays. Recently, machine learning has proven effective for estimating TOC and clay content from well logs. The remaining questions are which algorithm to choose in the first place and whether already-built models can be improved. Automatic machine learning (AutoML) is a promising tool for answering these practical questions because it trains multiple models and compares them automatically. Two wells with conventional well logs and elemental capture spectroscopy are selected from a shale gas play to test AutoML's ability to estimate TOC and clay. TOC and clay content are extracted from Schlumberger's ELAN interpretation and calibrated to cores. Generalizability is demonstrated on the blind test well, where the mean absolute test errors for TOC and clay estimation are 0.23% and 3.77%, respectively. The final models are generated from 829 data points with a train-test ratio of 75:25. Their mean absolute test errors are 0.26% and 2.68% for TOC and clay, respectively, which are very low given that TOC ranges from 0-6% and clay from 35-65%. The results show AutoML's success and efficiency in the estimation. The trained models are interpreted to understand the variables' effects on predictions. 235 wells are selected through data quality checking and fed into the models to create TOC and clay distribution maps. The maps provide guidance on where to drill a new well for higher shale gas production.
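
The core idea described above, training several candidate models on a 75:25 train-test split and comparing them by mean absolute error, can be illustrated with a minimal sketch. This is not the AutoML framework used in the work, which is not named here; scikit-learn regressors are swapped in as stand-ins. The function name `compare_models`, the choice of candidates, and the synthetic four-column input (a placeholder for well-log features, not the study's data) are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split


def compare_models(X, y, seed=0):
    """Fit several candidate regressors and rank them by test-set MAE.

    Hypothetical helper: a hand-rolled stand-in for what an AutoML tool
    automates (it would also tune hyperparameters and build ensembles).
    """
    # 75:25 train-test split, matching the ratio stated in the abstract.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # Assumed candidate set; a real AutoML search space is much larger.
    candidates = {
        "ridge": Ridge(alpha=1.0),
        "random_forest": RandomForestRegressor(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingRegressor(random_state=seed),
    }
    scores = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        scores[name] = mean_absolute_error(y_test, model.predict(X_test))
    # The model with the lowest mean absolute test error wins.
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic placeholder data: 829 rows of 4 made-up "log" features and a
# made-up target. These numbers are NOT measurements from the study.
rng = np.random.default_rng(0)
X = rng.normal(size=(829, 4))
y = 3.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.1, size=829)

best_name, mae_by_model = compare_models(X, y)
print(best_name, mae_by_model)
```

In practice the same comparison would be run once with TOC as the target and once with clay content, and the selected model would then be checked against a blind well that took no part in training.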



Shale gas, Total organic carbon, Clay content, Sweet spots, Automatic machine learning, Ensemble learning