The Role of EDA in Developing Robust Machine Learning Models for Lithology and Penetration Rate Prediction from MWD Data
Abstract
Measure-While-Drilling (MWD) data provide real-time insight into subsurface conditions and drilling performance, yet their complexity and operational noise often hinder reliable modeling. This study demonstrates the role of Exploratory Data Analysis (EDA) in developing robust machine learning (ML) models for lithology classification and penetration rate (PR) prediction in mining operations. A structured EDA workflow (data integrity assessment, feature distribution analysis, correlation mapping, and depth-wise parameter profiling) was implemented to identify redundant attributes and isolate non-productive intervals. EDA-informed normalization and feature selection then improved both dataset consistency and model performance. Three machine learning algorithms (Decision Tree, Random Forest, and Multi-Layer Perceptron) were trained on the refined dataset. The Random Forest Classifier achieved 98.45% accuracy in lithology prediction, while the Random Forest Regressor produced the most accurate PR estimates (R² = 0.83, RMSE = 0.52). These results highlight EDA as a critical foundation for constructing physics-informed, data-driven models that enhance predictive reliability and operational efficiency in mining environments.
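The EDA steps named in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' pipeline: the field names (`feed_pressure`, `rotation_pressure`, `torque`, `penetration_rate`), the sample values, the rule that a non-positive penetration rate marks a non-productive interval, and the 0.95 correlation threshold are all assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def clean_records(rows, pr_key="penetration_rate"):
    """Integrity check: drop rows with missing values or a non-positive
    penetration rate (assumed here to mark non-productive intervals)."""
    return [r for r in rows
            if all(v is not None for v in r.values()) and r[pr_key] > 0]

def split_redundant(rows, features, threshold=0.95):
    """Correlation mapping: keep a feature unless it is strongly correlated
    with one that has already been kept."""
    kept, dropped = [], []
    for f in features:
        col = [r[f] for r in rows]
        if any(abs(pearson(col, [r[k] for r in rows])) >= threshold for k in kept):
            dropped.append(f)
        else:
            kept.append(f)
    return kept, dropped

def min_max_normalize(rows, features):
    """Rescale each listed feature to [0, 1]; constant features map to 0.0."""
    out = [dict(r) for r in rows]
    for f in features:
        lo = min(r[f] for r in rows)
        span = max(r[f] for r in rows) - lo
        for r in out:
            r[f] = (r[f] - lo) / span if span else 0.0
    return out

# Hypothetical MWD records; values are invented for illustration only.
records = [
    {"depth": 0.5, "feed_pressure": 50.0, "rotation_pressure": 30.0, "torque": 60.0, "penetration_rate": 1.2},
    {"depth": 1.0, "feed_pressure": 55.0, "rotation_pressure": 34.0, "torque": 68.0, "penetration_rate": 1.0},
    {"depth": 1.5, "feed_pressure": 52.0, "rotation_pressure": None, "torque": 61.0, "penetration_rate": 1.1},
    {"depth": 2.0, "feed_pressure": 10.0, "rotation_pressure": 5.0, "torque": 10.0, "penetration_rate": 0.0},
    {"depth": 2.5, "feed_pressure": 60.0, "rotation_pressure": 31.0, "torque": 62.0, "penetration_rate": 0.8},
    {"depth": 3.0, "feed_pressure": 48.0, "rotation_pressure": 36.0, "torque": 72.0, "penetration_rate": 1.4},
]

clean = clean_records(records)
kept, dropped = split_redundant(
    clean, ["feed_pressure", "rotation_pressure", "torque"])
scaled = min_max_normalize(clean, kept)
```

In this toy data, `torque` is an exact multiple of `rotation_pressure`, so it is flagged as redundant, and the rows with a missing reading or zero penetration rate are excluded before normalization. The cleaned, reduced table is the kind of input a tree-based or neural model would then be trained on.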