Browsing by Author "Chen, Xiaoyu"
- Distributed Data Filtering and Modeling for Fog and Networked Manufacturing
  Li, Yifu; Wang, Lening; Chen, Xiaoyu; Jin, Ran (2023-04-05)
- Ensemble Active Learning by Contextual Bandits for AI Incubation in Manufacturing
  Zeng, Yingyan; Chen, Xiaoyu; Jin, Ran (2023-02)
  The online sensing techniques and computational resources in an Industrial Cyber-physical System (ICPS) provide a digital foundation for data-driven decision making by artificial intelligence (AI) models. However, the poor data quality (e.g., inconsistent distribution, imbalanced classes) of high-speed, large-volume data streams poses significant challenges to the online deployment of the offline trained AI models. As an alternative, updating AI models online based on streaming data enables continuous improvement and resilient modeling performance. However, for a supervised learning model (i.e., a base learner), it is labor-intensive to continuously annotate all streaming samples and it is also challenging to select a subset with good quality to update the model. Hence, a data acquisition method is needed to select the data for annotation from streaming data to ensure data quality while saving annotation efforts. In the literature, active learning methods have been proposed to acquire informative samples. Different acquisition criteria were developed for exploration of under-represented regions in the input variable space or exploitation of the well-represented regions for optimal estimation of base learners. However, it remains a challenge to balance the exploration-exploitation trade-off under different online annotation scenarios. On the other hand, an acquisition criterion learned by AI (e.g., by reinforcement learning) adapts itself to a scenario dynamically, but the ambiguous consideration of the trade-off limits its performance in frequently changing manufacturing contexts. To overcome these limitations, we propose an ensemble active learning method by contextual bandits (CbeAL). CbeAL incorporates a set of active learning agents (i.e., acquisition criteria) explicitly designed for exploration or exploitation by a weighted combination of their acquisition decisions. The weight of each agent will be dynamically adjusted based on the usefulness of its decisions to improve the performance of the base learner. With adaptive and explicit consideration of both objectives, CbeAL efficiently guides the data acquisition process through selecting informative samples to reduce the human annotation efforts. Furthermore, we characterize the exploration and exploitation capability of the proposed agents theoretically. The evaluation results in a numerical simulation study and a real case study demonstrate the effectiveness and efficiency of CbeAL in manufacturing process modeling of the ICPS.
- Ensemble Active Learning by Contextual Bandits for AI Incubation in Manufacturing
  Zeng, Yingyan; Chen, Xiaoyu; Jin, Ran (ACM, 2023-10)
  An Industrial Cyber-physical System (ICPS) provides a digital foundation for data-driven decision-making by artificial intelligence (AI) models. However, the poor data quality (e.g., inconsistent distribution, imbalanced classes) of high-speed, large-volume data streams poses significant challenges to the online deployment of offline-trained AI models. As an alternative, updating AI models online based on streaming data enables continuous improvement and resilient modeling performance. However, for a supervised learning model (i.e., a base learner), it is labor-intensive to annotate all streaming samples to update the model. Hence, a data acquisition method is needed to select the data for annotation to ensure data quality while saving annotation efforts. In the literature, active learning methods have been proposed to acquire informative samples. Different acquisition criteria were developed for exploration of under-represented regions in the input variable space or exploitation of the well-represented regions for optimal estimation of base learners. However, it remains a challenge to balance the exploration-exploitation trade-off under different online annotation scenarios. On the other hand, an acquisition criterion learned by AI adapts itself to a scenario dynamically, but the ambiguous consideration of the trade-off limits its performance in frequently changing manufacturing contexts. To overcome these limitations, we propose an ensemble active learning method by contextual bandits (CbeAL). CbeAL incorporates a set of active learning agents (i.e., acquisition criteria) explicitly designed for exploration or exploitation by a weighted combination of their acquisition decisions. The weight of each agent will be dynamically adjusted based on the usefulness of its decisions to improve the performance of the base learner. With adaptive and explicit consideration of both objectives, CbeAL efficiently guides the data acquisition process by selecting informative samples to reduce the human annotation efforts. Furthermore, we characterize the exploration and exploitation capability of the proposed agents theoretically. The evaluation results in a numerical simulation study and a real case study demonstrate the effectiveness and efficiency of CbeAL in manufacturing process modeling of the ICPS.
- Hybrid Summarization of Dakota Access Pipeline Protests (NoDAPL)
  Chen, Xiaoyu; Wang, Haitao; Mehrotra, Maanav; Chhikara, Naman; Sun, Di (Virginia Tech, 2018-12-14)
  Dakota Access Pipeline Protests (known by the hashtag #NoDAPL) are grassroots movements that began in April 2016 in reaction to the approved construction of Energy Transfer Partners’ Dakota Access Pipeline in the northern United States. The NoDAPL movements produce many Facebook messages, tweets, blogs, and news, which reflect different aspects of the NoDAPL events. The related information keeps increasing rapidly, which makes it difficult to understand the events in an efficient manner. Therefore, it is invaluable to automatically or at least semi-automatically generate short summaries based on the online available big data. Motivated by this automatic summarization need, the objective of this project is to propose a novel automatic summarization approach to efficiently and effectively summarize the topics hidden in the online big text data. Although automatic summarization has been investigated for more than 60 years since the publication of Luhn’s 1958 seminal paper, several challenges exist in summarizing online big text sets, such as a large proportion of noisy texts, highly redundant information, multiple latent topics, etc. Therefore, we propose an automatic framework with minimal human effort to summarize big online text sets (~11,000 documents on NoDAPL) according to latent topics with irrelevant information removed. This framework provides a hybrid model to combine the advantages of latent Dirichlet allocation (LDA) based extractive and deep-learning based abstractive methods. Unlike semi-automatic summarization approaches such as template-based summarization, the proposed method does not require practitioners to have a deep understanding of the events in order to create the template or to fill it in by using regular expressions. During the procedure, the only human effort needed is to manually label a few (say, 100) documents as relevant or irrelevant. We evaluate the quality of the generated automatic summary with both extrinsic and intrinsic measurements. In the extrinsic subjective evaluation, we design a set of guideline questions and conduct a task-based measurement. Results show that 91.3% of sentences are within the scope of the guideline, and 69.6% of the outlined questions can be answered by reading the generated summary. The intrinsic ROUGE measurements show our entity coverage is a total of 2.6% and the ROUGE-L and ROUGE-SU4 scores are 0.148 and 0.065. Overall, the proposed hybrid model achieves decent performance on summarizing NoDAPL events. Future work includes testing the approach with more textual datasets for interesting topics, and investigating a topic modeling-supervised classification approach to minimize human efforts in automatic summarization. We would also like to investigate a deep learning-based recommender system for better sentence re-ranking.
- Improving Assessment in Kidney Transplantation by Multitask General Path Model
  Lan, Qing; Chen, Xiaoyu; Li, Murong; Robertson, John; Lei, Yong; Jin, Ran (2023)
  Kidney transplantation helps end-stage patients regain health and quality of life. The decisions for matching donor kidneys and recipients affect the success of transplantation. However, current kidney matching decision procedures do not consider viability loss during preservation. The objective here is to forecast heterogeneous kidney viability based on historical datasets to support kidney matching decision-making. Six recently procured porcine kidneys were used to conduct viability assessment experiments to validate the proposed multitask general path model. The model forecasts kidney viability by transferring knowledge from learning the commonality of all kidneys and the heterogeneity of each kidney. The proposed model provides more accurate kidney viability forecasting results compared to the state-of-the-art models including a multitask learning model, a general path model, and a general linear model. The proposed model provides satisfactory kidney viability forecasting accuracy because it quantifies the degradation information from the trajectory of a viability loss path. It transfers knowledge of common effects from all kidneys and identifies individual effects of each kidney. This method can be readily extended to other decision-making scenarios in kidney transplantation to improve overall assessment performance. For example, analytical generalizations gained by modeling have been validated based on needle biopsy data targeting the improvement of tissue extraction accuracy. The proposed model applied in multiple kidney assessment processes in transplantation can potentially reduce the kidney discard rate by providing effective kidney matching decisions. Thus, the increased kidney utilization rate will benefit more patients and prolong their lives.
- INN: An Interpretable Neural Network for AI Incubation in Manufacturing
  Chen, Xiaoyu; Zeng, Yingyan; Kang, Sungku; Jin, Ran (ACM, 2022-06-21)
  Both artificial intelligence (AI) and domain knowledge from human experts play an important role in manufacturing decision-making. While smart manufacturing emphasizes fully automated data-driven decision-making, the AI incubation process involves human experts to enhance AI systems by integrating domain knowledge for modeling, data collection and annotation, and feature extraction. Such an AI incubation process will not only enhance the domain knowledge discovery, but also improve the interpretability and trustworthiness of AI methods. In this paper, we focus on the knowledge transfer from human experts to a supervised learning problem by learning domain knowledge as interpretable features and rules, which can be used to construct rule-based systems to support manufacturing decision-making, such as process modeling and quality inspection. Although many advanced statistical and machine learning methods have shown promising modeling accuracy and efficiency, rule-based systems are still highly preferred and widely adopted because human experts can readily comprehend them. However, most of the existing rule-based systems are constructed based on deterministic human-crafted rules, whose parameters, e.g., thresholds of decision rules, are suboptimal. On the other hand, the machine learning methods, such as tree models or neural networks, can learn a decision-rule based structure without much interpretability or agreement with domain knowledge. Therefore, the traditional machine learning models and human experts' domain knowledge cannot be directly improved by learning from data. In this research, we propose an interpretable neural network (INN) model with a center-adjustable Sigmoid activation function to efficiently optimize the rule-based systems. Using the rule-based system from domain knowledge to regulate the INN architecture will not only improve the prediction accuracy with optimized parameters, but also ensure the interpretability by adopting the interpretable rule-based systems from domain knowledge. The proposed INN will be effective for supervised learning problems when rule-based systems are available. The merits of the INN model are demonstrated via a simulation study and a real case study in the quality modeling of a semiconductor manufacturing process. The source code of this paper is hosted at https://github.com/XiaoyuChenUofL/Interpretable-Neural-Network.
- Multiscale Quantitative Analytics of Human Visual Searching Tasks
  Chen, Xiaoyu (Virginia Tech, 2021-07-16)
  Benefiting from the recent advancements of artificial intelligence (AI) methods, industrial automation has replaced human labor in many tasks. However, humans are still placed in the central role when visual searching tasks are highly involved for manufacturing decision-making. For example, highly customized products fabricated by additive manufacturing processes have posed significant challenges to AI methods in terms of their performance and generalizability. As a result, in practice, human visual searching tasks are still widely involved in manufacturing contexts (e.g., human resource management, quality inspection, etc.) based on various visualization techniques. Quantitatively modeling the visual searching behaviors and performance will not only contribute to the understanding of the decision-making process in a visualization system, but also advance AI methods by incubating them with human expertise. In general, visual searching can be quantitatively understood from multiple scales, namely, 1) the population scale to treat individuals equally and model the general relationship between individuals' physiological signals and visual searching decisions; 2) the individual scale to model the relationship between individual differences and visual searching decisions; and 3) the attention scale to model the relationship between individuals' attention in visual searching and visual searching decisions. The advancements of wearable sensing techniques enable such multiscale quantitative analytics of human visual searching performance. For example, by equipping human users with an electroencephalogram (EEG) device, an eye tracker, and a logging system, the multiscale quantitative relationships among human physiological signals, behaviors, and performance can be readily established. This dissertation attempts to quantify the visual searching process from multiple scales by proposing (1) a data-fusion method to model the quantitative relationship between physiological signals and humans' perceived task complexities (population scale, Chapter 2); (2) a recommender system to quantify and decompose the individual differences into explicit and implicit differences via personalized recommender system-based sensor analytics (individual scale, Chapter 3); and (3) a visual language processing modeling framework to identify and correlate visual cues (i.e., identified from fixations) with humans' quality inspection decisions in human visual searching tasks (attention scale, Chapter 4). Finally, Chapter 5 summarizes the contributions and proposes future research directions. The proposed methodologies can be readily extended to other applications and research studies to support multiscale quantitative analytics. In addition, the quantitative understanding of human visual searching behaviors and performance can generate insights to further incubate AI methods with human expertise. Merits of the proposed methodologies are demonstrated in a visualization evaluation user study and a cognitive hacking user study. Detailed notes to guide the implementation and deployment are provided for practitioners and researchers in each chapter.
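
The INN entry in this list mentions a center-adjustable Sigmoid activation used to optimize rule thresholds. The sketch below is only an illustration of that general idea, not the authors' implementation (their code is at the repository linked in that entry). It assumes that "center-adjustable" means a sigmoid of the form 1 / (1 + exp(-k (x - c))), where the center c plays the role of a decision-rule threshold. The function names `soft_rule` and `fit_center`, the steepness `k`, and the toy data are all hypothetical.

```python
import math


def soft_rule(x: float, center: float, steepness: float = 10.0) -> float:
    """Smooth stand-in for the hard rule `x > center` (assumed form)."""
    return 1.0 / (1.0 + math.exp(-steepness * (x - center)))


def fit_center(xs, labels, center=0.0, steepness=10.0, lr=0.05, epochs=500):
    """Tune the rule threshold by gradient descent on the mean log loss.

    The starting `center` stands for a hand-crafted threshold from
    domain knowledge; training moves it to fit the labeled data.
    """
    n = len(xs)
    for _ in range(epochs):
        grad = 0.0
        for x, y in zip(xs, labels):
            p = soft_rule(x, center, steepness)
            # With z = steepness * (x - center): dL/dz = p - y and
            # dz/dcenter = -steepness, so dL/dcenter = steepness * (y - p).
            grad += steepness * (y - p)
        center -= lr * grad / n
    return center


if __name__ == "__main__":
    # Toy data: the true boundary lies between 0.4 and 0.6.
    xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
    labels = [0, 0, 0, 0, 1, 1, 1, 1]
    learned = fit_center(xs, labels, center=0.0)
    print(f"learned threshold: {learned:.3f}")
```

Because the only trained parameter here is the threshold itself, the fitted rule can still be read as "x greater than c", which is the kind of interpretability the abstract emphasizes. A full model would combine many such rules.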