Machine Learning and Multivariate Statistics for Optimizing Bioprocessing and Polyolefin Manufacturing

Agarwal, Aman

dc.contributor.author	Agarwal, Aman	en
dc.contributor.committeechair	Liu, Yih-An	en
dc.contributor.committeemember	Xin, Hongliang	en
dc.contributor.committeemember	Baird, Donald G.	en
dc.contributor.committeemember	Deshmukh, Sanket A.	en
dc.contributor.department	Chemical Engineering	en
dc.date.accessioned	2022-01-08T09:00:10Z	en
dc.date.available	2022-01-08T09:00:10Z	en
dc.date.issued	2022-01-07	en
dc.description.abstract	Chemical engineers have routinely used computational tools for modeling, optimizing, and debottlenecking chemical processes. Because of the advances in computational science over the past decade, multivariate statistics and machine learning have become an integral part of the computerization of chemical processes. In this research, we investigate multivariate statistics, machine learning tools, and their combinations through a series of case studies, including one with a successful industrial deployment of machine learning models for fermentation. We use both a commercially available software tool, Aspen ProMV, and Python to demonstrate the feasibility of these computational tools. This work demonstrates a novel application of ensemble-based machine learning methods in bioprocessing, particularly for the prediction of different fermenter types in a fermentation process (to allow for successful data integration) and the prediction of the onset of foaming. We apply two ensemble frameworks, Extreme Gradient Boosting (XGBoost) and Random Forest (RF), to build classification and regression models. Excessive foaming can interfere with the mixing of reactants and lead to problems such as decreased effective reactor volume, microbial contamination, product loss, and increased reaction time. Physical modeling of foaming is arduous because it requires estimation of foam height, which is dynamic in nature and varies between processes. In addition to foaming prediction, we extend our work to control and prevent foaming by allowing data-driven ad hoc addition of antifoam using exhaust differential pressure as an indicator of foaming. We use large-scale real fermentation data for six different types of sporulating microorganisms to predict foaming over multiple strains of microorganisms and build exploratory time-series-driven antifoam profiles for four different fermenter types. 
To predict the antifoam addition from the large-scale multivariate dataset (about half a million instances for 163 batches), we use TPOT (Tree-based Pipeline Optimization Tool), an automated machine learning tool based on genetic programming, to select the best pipeline from 600 candidate pipelines. Our antifoam profiles decrease hourly volume retention by over 53% for a specific fermenter. A decrease in hourly volume retention leads to an increase in fermentation product yield. We also study two different cases associated with the manufacturing of polyolefins, particularly LDPE (low-density polyethylene) and HDPE (high-density polyethylene). Through these cases, we showcase the use of machine learning and multivariate statistical tools to improve process understanding and enhance the predictive capability for process optimization. By using indirect measurements such as temperature profiles, we demonstrate the viability of such measures in the prediction of polyolefin quality parameters, anomaly detection, and statistical monitoring and control of the chemical processes associated with an LDPE plant. We use dimensionality reduction, visualization tools, and regression analysis to achieve our goals. Using advanced analytical tools and a combination of algorithms such as PCA (Principal Component Analysis), PLS (Partial Least Squares), and Random Forest, we identify predictive models that can be used to create inferential schemes. Soft sensors are widely used for on-line monitoring and real-time prediction of process variables. In one of our cases, we use advanced machine learning algorithms to predict the polymer melt index, which is crucial in determining the product quality of polymers. We use real industrial data from one of the leading chemical engineering companies in the Asia-Pacific region to build a predictive model for an HDPE plant. Lastly, we show an end-to-end workflow for deep learning on both industrial and simulated polyolefin datasets. 
Thus, using these five cases, we explore the use of advanced machine learning and multivariate statistical techniques in the optimization of chemical and biochemical processes. Recent advances in computational hardware allow engineers to design data-driven models that enhance their capacity to monitor and control a process effectively and efficiently. We show that even non-expert chemical engineers can implement such machine learning algorithms with ease using open-source or commercially available software tools.	en
dc.description.abstractgeneral	Most chemical and biochemical processes are equipped with advanced probes and connectivity sensors that collect large amounts of data every day. It is critical to manage and use this data from the start and throughout the development and manufacturing cycle. Chemical engineers have routinely used computational tools for modeling, designing, optimizing, debottlenecking, and troubleshooting chemical processes. Herein, we present different applications of machine learning and multivariate statistics using industrial datasets. This dissertation also includes a deployed industrial solution to mitigate foaming in commercial fermentation reactors as a proof-of-concept (PoC). Our antifoam profiles decrease volume loss by over 53% for a specific fermenter. Throughout this dissertation, we demonstrate applications of several techniques, such as ensemble methods, automated machine learning, exploratory time series, and deep learning, for solving industrial problems. Our aim is to bridge the gap from industrial data acquisition to finding meaningful insights for process optimization.	en
dc.description.degree	Doctor of Philosophy	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:33732	en
dc.identifier.uri	http://hdl.handle.net/10919/107480	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	foaming	en
dc.subject	antifoam profiles	en
dc.subject	fermentation	en
dc.subject	Machine learning	en
dc.subject	multivariate statistics	en
dc.subject	ensemble methods	en
dc.subject	automated Machine learning	en
dc.subject	deep learning	en
dc.title	Machine Learning and Multivariate Statistics for Optimizing Bioprocessing and Polyolefin Manufacturing	en
dc.type	Dissertation	en
thesis.degree.discipline	Chemical Engineering	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	doctoral	en
thesis.degree.name	Doctor of Philosophy	en

Files

Original bundle

Name: Agarwal_A_D_2022.pdf
Size: 3.88 MB
Format: Adobe Portable Document Format

Collections

Doctoral Dissertations