Transformer Networks for Smart Cities: Framework and Application to Makassar Smart Garden Alleys
Abstract
Many countries around the world are undergoing massive urbanization campaigns at an unprecedented rate, heralded by promises of economic prosperity and improved population health and well-being. Projections indicate that by 2050, nearly 68% of the world's population will reside in urban environments. However, rapid growth at such an exceptional scale poses unique challenges pertaining to environmental quality and food production, which can undermine these promised benefits. As such, there is an emphasis on mitigating these negative effects through the construction of smart and connected communities (S&CC), which integrate both artificial intelligence (AI) and the Internet of Things (IoT). This coupling of intelligent technologies also poses interesting system design challenges pertaining to the fusion of the diverse, heterogeneous datasets available in IoT environments and the ability to learn multiple S&CC problem sets concurrently. Attention-based Transformer networks are of particular interest given their recent success across natural language processing (NLP), computer vision, time-series regression, and multi-modal data fusion. This raises the question of whether Transformers can be further diversified to leverage fusions of IoT data sources for heterogeneous multi-task learning in S&CC trade spaces. This is the fundamental question that this thesis seeks to answer. Indeed, the key contribution of this thesis is the design and application of Transformer networks for developing AI systems in emerging smart cities. This work is executed within a collaborative U.S.-Indonesia effort between Virginia Tech, the University of Colorado Boulder, Universitas Gadjah Mada, and Institut Teknologi Bandung, with the goal of growing smart and sustainable garden alleys in Makassar City, Indonesia. Specifically, a proof-of-concept AI nerve-center is proposed that uses a backbone of pure-encoder Transformer architectures to learn a diverse set of tasks, such as multivariate time-series regression, visual plant disease classification, and image-time-series fusion. To facilitate the data fusion tasks, an effective algorithm is also proposed to synthesize heterogeneous feature sets, such as multivariate time-series and time-correlated images. Moreover, a hyperparameter tuning framework is proposed to standardize and automate model training regimes. Extensive experimentation shows that the proposed Transformer-based systems can handle various input data types via custom sequence embedding techniques and are naturally suited to learning a diverse set of tasks. Further, the results show that multi-task learners improve both memory and computational efficiency while maintaining performance comparable to both single-task variants and non-Transformer baselines. This demonstrates the flexibility of Transformer networks in learning from a fusion of IoT data sources, their applicability in S&CC trade spaces, and their potential for deployment on edge computing devices.
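To make the multi-modal, multi-task idea concrete, the sketch below shows a minimal pure-encoder Transformer in PyTorch with separate sequence embeddings for a multivariate time-series and flattened image patches, and two task heads sharing one encoder backbone. This is an illustrative sketch only, not the thesis's actual architecture; the class name, dimensions, and pooling strategy are all hypothetical.

```python
# Minimal sketch (not the thesis implementation): a pure-encoder Transformer
# with modality-specific embeddings and shared multi-task heads.
# All module names and dimensions below are hypothetical.

import torch
import torch.nn as nn


class MultiModalEncoder(nn.Module):
    def __init__(self, ts_features=8, img_patch_dim=768, d_model=128,
                 n_heads=4, n_layers=2, n_classes=10):
        super().__init__()
        # Modality-specific embeddings map each input type into a shared
        # d_model-dimensional token space.
        self.ts_embed = nn.Linear(ts_features, d_model)     # time-series steps -> tokens
        self.img_embed = nn.Linear(img_patch_dim, d_model)  # image patches -> tokens
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)
        # Task-specific heads share the encoder backbone (multi-task learning).
        self.regress_head = nn.Linear(d_model, 1)           # time-series regression
        self.classify_head = nn.Linear(d_model, n_classes)  # e.g., disease classification

    def forward(self, ts, patches):
        # ts:      (batch, T, ts_features)   multivariate time-series
        # patches: (batch, P, img_patch_dim) flattened image patches
        tokens = torch.cat([self.ts_embed(ts), self.img_embed(patches)], dim=1)
        encoded = self.encoder(tokens)  # fused sequence representation
        pooled = encoded.mean(dim=1)    # simple mean pooling over all tokens
        return self.regress_head(pooled), self.classify_head(pooled)


model = MultiModalEncoder()
ts = torch.randn(4, 24, 8)         # e.g., 24 hourly sensor readings, 8 variables
patches = torch.randn(4, 16, 768)  # e.g., 16 flattened patches per image
reg_out, cls_out = model(ts, patches)
print(reg_out.shape, cls_out.shape)  # torch.Size([4, 1]) torch.Size([4, 10])
```

Concatenating the embedded tokens before encoding lets self-attention fuse the modalities directly, while the shared backbone is what yields the memory and compute savings that the abstract attributes to multi-task learners relative to single-task variants.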