Toward Transformer-based Large Energy Models for Smart Energy Management
Abstract
Buildings contribute significantly to global energy demand and emissions, underscoring the need for accurate energy forecasting to support effective energy management. Existing research tends to focus on specific target problems, such as individual buildings or small groups of buildings, which gives rise to persistent challenges in data-driven energy forecasting: dependence on data quality and quantity, limited generalizability, and computational inefficiency. To address these challenges, Generalized Energy Models (GEMs) for energy forecasting can potentially be developed using large-scale datasets. Transformer architectures, known for their scalability, ability to capture long-term dependencies, and efficiency in parallel processing of large datasets, are strong candidates for GEMs. In this study, we tested the hypothesis that GEMs can be efficiently developed to outperform in-situ models trained on individual buildings. To this end, we investigated and compared three candidate multi-variate Transformer architectures, using both zero-shot and fine-tuning strategies, with data from 1,014 buildings. The results, evaluated across three prediction horizons (24, 72, and 168 hours), confirm that GEMs significantly outperform Transformer-based in-situ (i.e., building-specific) models. Fine-tuned GEMs achieved performance improvements of up to 28% and reduced training time by 55%. Beyond Transformer-based in-situ models, GEMs also outperformed several state-of-the-art non-Transformer deep learning baselines in both accuracy and efficiency. We further examined the data size required for effective fine-tuning, as well as the impact of input sub-sequence length and pre-training dataset size on GEM performance. The findings show a significant performance boost from larger pre-training datasets, highlighting the potential for larger GEMs trained on web-scale global data to move toward Large Energy Models (LEMs).
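
To make the pre-train / zero-shot / fine-tune workflow described above concrete, the following is a minimal illustrative sketch in PyTorch. It is not the paper's implementation: the model size, feature count, horizon, and training hyperparameters are assumptions chosen only to show the general pattern of pre-training a Transformer forecaster on pooled multi-building data and then adapting it to a single target building.

```python
# Illustrative sketch only: a small Transformer-encoder forecaster and the
# pre-train / fine-tune workflow sketched in the abstract. All sizes and
# hyperparameters are assumptions, not the paper's configuration.
import torch
import torch.nn as nn


class EnergyTransformer(nn.Module):
    """Maps a multivariate input sub-sequence to a forecast horizon."""

    def __init__(self, n_features: int = 8, d_model: int = 64,
                 n_heads: int = 4, n_layers: int = 2, horizon: int = 24):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, horizon)  # e.g., 24/72/168-hour horizons

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, sub_sequence_length, n_features)
        h = self.encoder(self.embed(x))
        return self.head(h[:, -1])  # forecast from the final time step's state


def train(model: nn.Module, loader, epochs: int = 1, lr: float = 1e-3) -> nn.Module:
    """Generic loop used both for pre-training a GEM on pooled multi-building
    data and for fine-tuning it on a single target building's data."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model


# Hypothetical usage (loaders and tensors are placeholders, not real data):
# gem = train(EnergyTransformer(), pooled_multi_building_loader)    # pre-training
# zero_shot_pred = gem(target_building_window)                      # zero-shot use
# finetuned = train(gem, small_target_building_loader, epochs=3)    # fine-tuning
# in_situ = train(EnergyTransformer(), small_target_building_loader)  # baseline
```

The sketch mirrors the comparison in the study only in structure: the same architecture is either pre-trained on pooled data and adapted (GEM) or trained from scratch on one building (in-situ baseline).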