Machine Learning and Multivariate Statistics for Optimizing Bioprocessing and Polyolefin Manufacturing

dc.contributor.authorAgarwal, Amanen
dc.contributor.committeechairLiu, Yih-Anen
dc.contributor.committeememberXin, Hongliangen
dc.contributor.committeememberBaird, Donald G.en
dc.contributor.committeememberDeshmukh, Sanket A.en
dc.contributor.departmentChemical Engineeringen
dc.description.abstractChemical engineers have routinely used computational tools for modeling, optimizing, and debottlenecking chemical processes. Because of the advances in computational science over the past decade, multivariate statistics and machine learning have become an integral part of the computerization of chemical processes. In this research, we look into using multivariate statistics, machine learning tools, and their combinations through a series of case studies including a case with a successful industrial deployment of machine learning models for fermentation. We use both commercially-available software tools, Aspen ProMV and Python, to demonstrate the feasibility of the computational tools. This work demonstrates a novel application of ensemble-based machine learning methods in bioprocessing, particularly for the prediction of different fermenter types in a fermentation process (to allow for successful data integration) and the prediction of the onset of foaming. We apply two ensemble frameworks, Extreme Gradient Boosting (XGBoost) and Random Forest (RF), to build classification and regression models. Excessive foaming can interfere with the mixing of reactants and lead to problems, such as decreasing effective reactor volume, microbial contamination, product loss, and increased reaction time. Physical modeling of foaming is an arduous process as it requires estimation of foam height, which is dynamic in nature and varies for different processes. In addition to foaming prediction, we extend our work to control and prevent foaming by allowing data-driven ad hoc addition of antifoam using exhaust differential pressure as an indicator of foaming. We use large-scale real fermentation data for six different types of sporulating microorganisms to predict foaming over multiple strains of microorganisms and build exploratory time-series driven antifoam profiles for four different fermenter types. In order to successfully predict the antifoam addition from the large-scale multivariate dataset (about half a million instances for 163 batches), we use TPOT (Tree-based Pipeline Optimization Tool), an automated genetic programming algorithm, to find the best pipeline from 600 other pipelines. Our antifoam profiles are able to decrease hourly volume retention by over 53% for a specific fermenter. A decrease in hourly volume retention leads to an increase in fermentation product yield. We also study two different cases associated with the manufacturing of polyolefins, particularly LDPE (low-density polyethylene) and HDPE (high-density polyethylene). Through these cases, we showcase the usage of machine learning and multivariate statistical tools to improve process understanding and enhance the predictive capability for process optimization. By using indirect measurements such as temperature profiles, we demonstrate the viability of such measures in the prediction of polyolefin quality parameters, anomaly detection, and statistical monitoring and control of the chemical processes associated with a LDPE plant. We use dimensionality reduction, visualization tools, and regression analysis to achieve our goals. Using advanced analytical tools and a combination of algorithms such as PCA (Principal Component Analysis), PLS (Partial Least Squares), Random Forest, etc., we identify predictive models that can be used to create inferential schemes. Soft-sensors are widely used for on-line monitoring and real-time prediction of process variables. In one of our cases, we use advanced machine learning algorithms to predict the polymer melt index, which is crucial in determining the product quality of polymers. We use real industrial data from one of the leading chemical engineering companies in the Asia-Pacific region to build a predictive model for a HDPE plant. Lastly, we show an end-to-end workflow for deep learning on both industrial and simulated polyolefin datasets. Thus, using these five cases, we explore the usage of advanced machine learning and multivariate statistical techniques in the optimization of chemical and biochemical processes. The recent advances in computational hardware allow engineers to design such data-driven models, which enhances their capacity to effectively and efficiently monitor and control a process. We showcase that even non-expert chemical engineers can implement such machine learning algorithms with ease using open-source or commercially available software tools.en
dc.description.abstractgeneralMost chemical and biochemical processes are equipped with advanced probes and connectivity sensors that collect large amounts of data on a daily basis. It is critical to manage and utilize the significant amount of data collected from the start and throughout the development and manufacturing cycle. Chemical engineers have routinely used computational tools for modeling, designing, optimizing, debottlenecking, and troubleshooting chemical processes. Herein, we present different applications of machine learning and multivariate statistics using industrial datasets. This dissertation also includes a deployed industrial solution to mitigate foaming in commercial fermentation reactors as a proof-of-concept (PoC). Our antifoam profiles are able to decrease volume loss by over 53% for a specific fermenter. Throughout this dissertation, we demonstrate applications of several techniques like ensemble methods, automated machine learning, exploratory time series, and deep learning for solving industrial problems. Our aim is to bridge the gap from industrial data acquisition to finding meaningful insights for process optimization.en
dc.description.degreeDoctor of Philosophyen
dc.publisherVirginia Techen
dc.rightsIn Copyrighten
dc.subjectantifoam profilesen
dc.subjectMachine learningen
dc.subjectmultivariate statisticsen
dc.subjectensemble methodsen
dc.subjectautomated Machine learningen
dc.subjectdeep learningen
dc.titleMachine Learning and Multivariate Statistics for Optimizing Bioprocessing and Polyolefin Manufacturingen
dc.typeDissertationen Engineeringen Polytechnic Institute and State Universityen of Philosophyen
Original bundle
Now showing 1 - 1 of 1
Thumbnail Image
3.88 MB
Adobe Portable Document Format