Browsing by Author "Chen, Hongjie"
Now showing 1 - 4 of 4
- Graph Deep Factors for Probabilistic Time-series Forecasting. Chen, Hongjie; Rossi, Ryan; Kim, Sungchul; Mahadik, Kanak; Eldardiry, Hoda (ACM, 2022). Deep probabilistic forecasting techniques can model large collections of time-series. However, recent techniques explicitly assume either complete independence (local model) or complete dependence (global model) between time-series in the collection. These assumptions correspond to the two extreme cases in which every time-series is disconnected from every other time-series or every time-series is related to every other time-series, resulting in a completely connected graph. In this work, we propose a deep hybrid probabilistic graph-based forecasting framework called Graph Deep Factors (GraphDF) that goes beyond these two extremes by allowing nodes and their time-series to be connected to others in an arbitrary fashion. GraphDF is a hybrid forecasting framework that consists of a relational global and relational local model. In particular, a relational global model learns complex non-linear time-series patterns globally using the structure of the graph to improve both forecasting accuracy and computational efficiency. Similarly, instead of modeling every time-series independently, a relational local model considers not only its individual time-series but also the time-series of nodes that are connected in the graph. The experiments demonstrate the effectiveness of the proposed deep hybrid graph-based forecasting model compared to the state-of-the-art methods in terms of its forecasting accuracy, runtime, and scalability. Our case study reveals that GraphDF can successfully generate cloud usage forecasts and opportunistically schedule workloads to increase cloud cluster utilization by 47.5% on average.
Furthermore, we address a common property of many time-series forecasting applications: time-series arrive as a stream, yet most methods fail to leverage the newly incoming values and therefore perform worse over time. In this paper, we propose an online incremental learning framework for probabilistic forecasting. The framework is theoretically proven to have lower time and space complexity. The framework can be universally applied to many other machine learning-based methods.
- Graph Time-series Modeling in Deep Learning: A Survey. Chen, Hongjie; Eldardiry, Hoda (ACM, 2024). Time-series and graphs have been extensively studied for their ubiquitous existence in numerous domains. Both topics have been separately explored in the field of deep learning. For time-series modeling, recurrent neural networks or convolutional neural networks model the relations between values across time steps, while for graph modeling, graph neural networks model the inter-relations between nodes. Recent research in deep learning requires simultaneous modeling for time-series and graphs when both representations are present. For example, both types of modeling are necessary for time-series classification, regression, and anomaly detection in graphs. This paper aims to provide a comprehensive summary of these models, which we call graph time-series models. To the best of our knowledge, this is the first survey paper that provides a picture of related models from the perspective of deep graph time-series modeling to address a range of time-series tasks, including regression, classification, and anomaly detection. Graph time-series models are split into two categories, a) graph recurrent/convolutional neural networks and b) graph attention neural networks. Under each category, we further categorize models based on their properties. Additionally, we compare representative models and discuss how distinctive model characteristics are utilized with respect to various model components and data challenges. Pointers to commonly used datasets and code are included to facilitate access for further research. In the end, we discuss potential directions for future research.
- Graph-based Time-series Forecasting in Deep Learning. Chen, Hongjie (Virginia Tech, 2024-04-02). Time-series forecasting has long been studied and remains an important research task. In scenarios where multiple time series need to be forecast, approaches that exploit the mutual impact between time series result in more accurate forecasts. This has been demonstrated in various applications, including demand forecasting and traffic forecasting, among others. Hence, this dissertation focuses on graph-based models, which leverage the internode relations to forecast more efficiently and effectively by associating time series with nodes. This dissertation begins by introducing the notion of graph time-series models in a comprehensive survey of related models. The main contributions of this survey are: (1) A novel categorization is proposed to thoroughly analyze over 20 representative graph time-series models from various perspectives, including temporal components, propagation procedures, and graph construction methods, among others. (2) Similarities and differences among models are discussed to provide a fundamental understanding of decisive factors in graph time-series models. Model challenges and future directions are also discussed. Following the survey, this dissertation develops graph time-series models that utilize complex time-series interactions to yield context-aware, real-time, and probabilistic forecasting. The first method, Context Integrated Graph Neural Network (CIGNN), targets resource forecasting with contextual data. Previous solutions either neglect contextual data or only leverage static features, which fail to exploit contextual information. CIGNN's main contributions include: (1) Integrating multiple contextual graphs; and (2) Introducing and incorporating temporal, spatial, relational, and contextual dependencies. The second method, Evolving Super Graph Neural Network (ESGNN), targets large-scale time-series datasets through training on super graphs.
Most graph time-series models let each node associate with a time series, potentially resulting in a high time cost. ESGNN's main contributions include: (1) Generating multiple super graphs to reflect node dynamics at different periods; and (2) Proposing an efficient super graph construction method based on K-Means and LSH. The third method, Probabilistic Hypergraph Recurrent Neural Network (PHRNN), targets datasets under the assumption that nodes interact in a simultaneous broadcasting manner. Previous hypergraph approaches leverage a static weight hypergraph, which fails to capture the interaction dynamics among nodes. PHRNN's main contributions include: (1) Learning a probabilistic hypergraph structure from the time series; and (2) Proposing the use of a KNN hypergraph for hypergraph initialization and regularization. The last method, Graph Deep Factors (GraphDF), aims at efficient and effective probabilistic forecasting. Previous probabilistic approaches neglect the interrelations between time series. GraphDF's main contributions include: (1) Proposing a framework that consists of a relational global component and a relational local component; (2) Conducting analysis in terms of accuracy, efficiency, scalability, and simulation with opportunistic scheduling; and (3) Designing an algorithm for incremental online learning.
- Multiple Myeloma DREAM Challenge reveals epigenetic regulator PHF19 as marker of aggressive disease. Mason, Mike J.; Schinke, Carolina; Eng, Christine L. P.; Towfic, Fadi; Gruber, Fred; Dervan, Andrew; White, Brian S.; Pratapa, Aditya; Guan, Yuanfang; Chen, Hongjie; Cui, Yi; Li, Bailiang; Yu, Thomas; Neto, Elias Chaibub; Mavrommatis, Konstantinos; Ortiz, Maria; Lyzogubov, Valeriy; Bisht, Kamlesh; Dai, Hongyue Y.; Schmitz, Frank; Flynt, Erin; Rozelle, Dan; Danziger, Samuel A.; Ratushny, Alexander; Dalton, William S.; Goldschmidt, Hartmut; Avet-Loiseau, Herve; Samur, Mehmet; Hayete, Boris; Sonneveld, Pieter; Shain, Kenneth H.; Munshi, Nikhil; Auclair, Daniel; Hose, Dirk; Morgan, Gareth; Trotter, Matthew; Bassett, Douglas; Goke, Jonathan; Walker, Brian A.; Thakurta, Anjan; Guinney, Justin (2020-02-14). While the past decade has seen meaningful improvements in clinical outcomes for multiple myeloma patients, a subset of patients does not benefit from current therapeutics for unclear reasons. Many gene expression-based models of risk have been developed, but each model uses a different combination of genes and often involves assaying many genes, making them difficult to implement. We organized the Multiple Myeloma DREAM Challenge, a crowdsourced effort to develop models of rapid progression in newly diagnosed myeloma patients and to benchmark these against previously published models. This effort led to more robust predictors and found that incorporating specific demographic and clinical features improved gene expression-based models of high risk. Furthermore, post-challenge analysis identified a novel expression-based risk marker, PHF19, which has recently been found to have an important biological role in multiple myeloma. Lastly, we show that a simple four-feature predictor composed of age, ISS, and expression of PHF19 and MMSET performs similarly to more complex models with many more gene expression features included.