Scholarly Works, Virginia Tech National Security Institute

Permanent URI for this collection

https://hdl.handle.net/10919/110084

Browse

Now showing 1 - 20 of 24

Adversarial Machine Learning for NextG Covert Communications Using Multiple Antennas
Kim, Brian; Sagduyu, Yalin; Davaslioglu, Kemal; Erpek, Tugba; Ulukus, Sennur (MDPI, 2022-07-29)
This paper studies the privacy of wireless communications from an eavesdropper that employs a deep learning (DL) classifier to detect transmissions of interest. There exists one transmitter that transmits to its receiver in the presence of an eavesdropper. In the meantime, a cooperative jammer (CJ) with multiple antennas transmits carefully crafted adversarial perturbations over the air to fool the eavesdropper into classifying the received superposition of signals as noise. While generating the adversarial perturbation at the CJ, multiple antennas are utilized to improve the attack performance in terms of fooling the eavesdropper. Two main points are considered while exploiting the multiple antennas at the adversary, namely the power allocation among antennas and the utilization of channel diversity. To limit the impact on the bit error rate (BER) at the receiver, the CJ puts an upper bound on the strength of the perturbation signal. Performance results show that this adversarial perturbation causes the eavesdropper to misclassify the received signals as noise with a high probability while increasing the BER at the legitimate receiver only slightly. Furthermore, the adversarial perturbation is shown to become more effective when multiple antennas are utilized.
An Analysis of Radio Frequency Transfer Learning Behavior
Wong, Lauren J.; Muller, Braeden; McPherson, Sean; Michaels, Alan J. (MDPI, 2024-06-03)
Transfer learning (TL) techniques, which leverage prior knowledge gained from data with different distributions to achieve higher performance and reduced training time, are often used in computer vision (CV) and natural language processing (NLP), but have yet to be fully utilized in the field of radio frequency machine learning (RFML). This work systematically evaluates how the training domain and task, characterized by the transmitter (Tx)/receiver (Rx) hardware and channel environment, impact radio frequency (RF) TL performance for example automatic modulation classification (AMC) and specific emitter identification (SEI) use-cases. Through exhaustive experimentation using carefully curated synthetic and captured datasets with varying signal types, channel types, signal to noise ratios (SNRs), carrier/center frequencys (CFs), frequency offsets (FOs), and Tx and Rx devices, actionable and generalized conclusions are drawn regarding how best to use RF TL techniques for domain adaptation and sequential learning. Consistent with trends identified in other modalities, our results show that RF TL performance is highly dependent on the similarity between the source and target domains/tasks, but also on the relative difficulty of the source and target domains/tasks. Results also discuss the impacts of channel environment and hardware variations on RF TL performance and compare RF TL performance using head re-training and model fine-tuning methods.
Assessing the Value of Transfer Learning Metrics for Radio Frequency Domain Adaptation
Wong, Lauren J.; Muller, Braeden P.; McPherson, Sean; Michaels, Alan J. (MDPI, 2024-07-25)
The use of transfer learning (TL) techniques has become common practice in fields such as computer vision (CV) and natural language processing (NLP). Leveraging prior knowledge gained from data with different distributions, TL offers higher performance and reduced training time, but has yet to be fully utilized in applications of machine learning (ML) and deep learning (DL) techniques and applications related to wireless communications, a field loosely termed radio frequency machine learning (RFML). This work examines whether existing transferability metrics, used in other modalities, might be useful in the context of RFML. Results show that the two existing metrics tested, Log Expected Empirical Prediction (LEEP) and Logarithm of Maximum Evidence (LogME), correlate well with post-transfer accuracy and can therefore be used to select source models for radio frequency (RF) domain adaptation and to predict post-transfer accuracy.
CLOUD-D RF: Cloud-based Distributed Radio Frequency Heterogeneous Spectrum Sensing
Green, Dylan; McIrvin, Caleb; Thaboun, River; Wemlinger, Cora; Risi, Joseph; Jones, Alyse; Toubeh, Maymoonah; Headley, William (ACM, 2024-12-04)
In wireless communications, collaborative spectrum sensing is a process that leverages radio frequency (RF) data from multiple RF sensors to make more informed decisions and lower the overall risk of failure in distributed settings. However, most research in collaborative sensing focuses on homogeneous systems using identical sensors, which would not be the case in a real world wireless setting. Instead, due to differences in physical location, each RF sensor would see different versions of signals propagating in the environment, establishing the need for heterogeneous collaborative spectrum sensing. Hence, this paper explores the implementation of collaborative spectrum sensing across heterogeneous sensors, with sensor fusion occurring in the cloud for optimal decision making. We investigate three different machine learning-based fusion methods and test the fused model’s ability to perform modulation classification, with a primary goal of optimizing for network bandwidth in regard to next-generation network applications. Our analysis demonstrates that our fusion process is able to optimize the number of features extracted from the heterogeneous sensors according to their varying performance limitations, simulating adverse conditions in a real-world wireless setting.
Collaborative Multi-Robot Multi-Human Teams in Search and Rescue
Williams, Ryan K.; Abaid, Nicole; McClure, James; Lau, Nathan; Heintzman, Larkin; Hashimoto, Amanda; Wang, Tianzi; Patnayak, Chinmaya; Kumar, Akshay (2022-04-30)
Robots such as unmanned aerial vehicles (UAVs) deployed for search and rescue (SAR) can explore areas where human searchers cannot easily go and gather information on scales that can transform SAR strategy. Multi-UAV teams therefore have the potential to transform SAR by augmenting the capabilities of human teams and providing information that would otherwise be inaccessible. Our research aims to develop new theory and technologies for field deploying autonomous UAVs and managing multi-UAV teams working in concert with multi-human teams for SAR. Specifically, in this paper we summarize our work in progress towards these goals, including: (1) a multi-UAV search path planner that adapts to human behavior; (2) an in-field distributed computing prototype that supports multi-UAV computation and communication; (3) behavioral modeling that yields spatially localized predictions of lost person location; and (4) an interface between human searchers and UAVs that facilitates human-UAV interaction over a wide range of autonomy.
A Combinatorial Approach to Hyperparameter Optimization
Khadka, Krishna; Chandrasekaran, Jaganmohan; Lei, Yu; Kacker, Raghu N.; Kuhn, D. Richard (ACM, 2024-04-14)
In machine learning, hyperparameter optimization (HPO) is essential for effective model training and significantly impacts model performance. Hyperparameters are predefined model settings which fine-tune the model’s behavior and are critical to modeling complex data patterns. Traditional HPO approaches such as Grid Search, Random Search, and Bayesian Optimization have been widely used in this field. However, as datasets grow and models increase in complexity, these approaches often require a significant amount of time and resources for HPO. This research introduces a novel approach using 𝑡-way testing—a combinatorial approach to software testing used for identifying faults with a test set that covers all 𝑡-way interactions—for HPO. 𝑇 -way testing substantially narrows the search space and effectively covers parameter interactions. Our experimental results show that our approach reduces the number of necessary model evaluations and significantly cuts computational expenses while still outperforming traditional HPO approaches for the models studied in our experiments.
Deep-Learning-Based Digitization of Protein-Self-Assembly to Print Biodegradable Physically Unclonable Labels for Device Security
Pradhan, Sayantan; Rajagopala, Abhi D.; Meno, Emma; Adams, Stephen; Elks, Carl R.; Beling, Peter A.; Yadavalli, Vamsi K. (MDPI, 2023-08-28)
The increasingly pervasive problem of counterfeiting affects both individuals and industry. In particular, public health and medical fields face threats to device authenticity and patient privacy, especially in the post-pandemic era. Physical unclonable functions (PUFs) present a modern solution using counterfeit-proof security labels to securely authenticate and identify physical objects. PUFs harness innately entropic information generators to create a unique fingerprint for an authentication protocol. This paper proposes a facile protein self-assembly process as an entropy generator for a unique biological PUF. The posited image digitization process applies a deep learning model to extract a feature vector from the self-assembly image. This is then binarized and debiased to produce a cryptographic key. The NIST SP 800-22 Statistical Test Suite was used to evaluate the randomness of the generated keys, which proved sufficiently stochastic. To facilitate deployment on physical objects, the PUF images were printed on flexible silk-fibroin-based biodegradable labels using functional protein bioinks. Images from the labels were captured using a cellphone camera and referenced against the source image for error rate comparison. The deep-learning-based biological PUF has potential as a low-cost, scalable, highly randomized strategy for anti-counterfeiting technology.
Disappearing cities on US coasts
Ohenhen, Leonard O.; Shirzaei, Manoochehr; Ojha, Chandrakanta; Sherpa, Sonam F.; Nicholls, Robert J. (Nature Research, 2024-03-06)
The sea level along the US coastlines is projected to rise by 0.25–0.3 m by 2050, increasing the probability of more destructive flooding and inundation in major cities. However, these impacts may be exacerbated by coastal subsidence— the sinking of coastal land areas—a factor that is often underrepresented in coastal-management policies and long-term urban planning. In this study, we combine high-resolution vertical land motion (that is, raising or lowering of land) and elevation datasets with projections of sea-level rise to quantify the potential inundated areas in 32 major US coastal cities. Here we show that, even when considering the current coastal-defence structures, further land area of between 1,006 and 1,389 km² is threatened by relative sea-level rise by 2050, posing a threat to a population of 55,000–273,000 people and 31,000–171,000 properties. Our analysis shows that not accounting for spatially variable land subsidence within the cities may lead to inaccurate projections of expected exposure. These potential consequences show the scale of the adaptation challenge, which is not appreciated in most US coastal cities.
Disruptive Role of Vertical Land Motion in Future Assessments of Climate Change-Driven Sea-Level Rise and Coastal Flooding Hazards in the Chesapeake Bay
Sherpa, Sonam Futi; Shirzaei, Manoochehr; Ojha, Chandrakanta (American Geophysical Union, 2023-04)
Future projections of sea-level rise (SLR) used to assess coastal flooding hazards and exposure throughout the 21st century and devise risk mitigation efforts often lack an accurate estimate of coastal vertical land motion (VLM) rate, driven by anthropogenic or non-climate factors in addition to climatic factors. The Chesapeake Bay (CB) region of the United States is experiencing one of the fastest rates of relative sea-level rise on the Atlantic coast of the United States. This study uses a combination of space-borne Interferometric Synthetic Aperture Radar (InSAR), Global Navigation Satellite System (GNSS), Light Detecting and Ranging (LiDAR) data sets, available National Oceanic and Atmospheric Administration (NOAA) long-term tide gauge data, and SLR projections from the Intergovernmental Panel on Climate Change (IPCC), AR6 WG1 to quantify the regional rate of relative SLR and future flooding hazards for the years 2030, 2050, and 2100. By the year 2100, the total inundated areas from SLR and subsidence are projected to be 454(316–549)–600(535𝐴𝐴–690) km² for Shared Socioeconomic Pathways (SSPs) 1–1.9 to 5–8.5, respectively, and 342(132–552)–627(526–735) 𝐴𝐴 km2 only from SLR. The effect of storm surges based on Hurricane Isabel can increase the inundated area to 849(832–867)–1,117(1,054–1,205) km² under different VLM and SLR scenarios. We suggest that accurate estimates of VLM rate, such as those obtained here, are essential to revise IPCC projections and obtain accurate maps of coastal flooding and inundation hazards. The results provided here inform policymakers when assessing hazards associated with global climate changes and local factors in CB, required for developing risk management and disaster resilience plans.
Generative AI tools can enhance climate literacy but must be checked for biases and inaccuracies
Atkins, Carmen; Girgente, Gina; Shirzaei, Manoochehr; Kim, Junghwan (Springer Nature, 2024-04)
In the face of climate change, climate literacy is becoming increasingly important. With wide access to generative AI tools, such as OpenAI’s ChatGPT, we explore the potential of AI platforms for ordinary citizens asking climate literacy questions. Here, we focus on a global scale and collect responses from ChatGPT (GPT-3.5 and GPT-4) on climate change-related hazard prompts over multiple iterations by utilizing the OpenAI’s API and comparing the results with credible hazard risk indices.Wefind a general sense of agreement in comparisons and consistency in ChatGPT over the iterations. GPT-4 displayed fewer errors than GPT-3.5. Generative AI tools may be used in climate literacy, a timely topic of importance, but must be scrutinized for potential biases and inaccuracies moving forward and considered in a social context. Future work should identify and disseminate best practices for optimal use across various generative AI tools.
How Can the Adversary Effectively Identify Cellular IoT Devices Using LSTM Networks?
Luo, Zhengping Jay; Pitera, Will; Zhao, Shangqing; Lu, Zhuo; Sagduyu, Yalin (ACM, 2023-06-01)
The Internet of Things (IoT) has become a key enabler for connecting edge devices with each other and the internet. Massive IoT services provided by cellular networks offer various applications such as smart metering and smart cities. Security of the massive IoT devices working alongside traditional devices such as smartphones and laptops has become a major concern. Protecting these IoT devices from being identified by malicious attackers is often the first line of defense for cellular IoT devices. In this paper, we provide an effective attacking method for identifying cellular IoT devices from cellular networks. Inspired by the characteristics of Long Short-Term Memory (LSTM) networks, we have developed a method that can not only capture context information but also adapt to the dynamic changes of the environment over time. Experimental validation shows a high detection rate with less than 10 epochs of training on public datasets.
How to Attack and Defend NextG Radio Access Network Slicing With Reinforcement Learning
Shi, Yi; Sagduyu, Yalin E.; Erpek, Tugba; Gursoy, M. Cenk (IEEE, 2023)
In this paper, reinforcement learning (RL) for network slicing is considered in next generation (NextG) radio access networks, where the base station (gNodeB) allocates resource blocks (RBs) to the requests of user equipments and aims to maximize the total reward of accepted requests over time. Based on adversarial machine learning, a novel over-the-air attack is introduced to manipulate the RL algorithm and disrupt NextG network slicing. The adversary observes the spectrum and builds its own RL based surrogate model that selects which RBs to jam subject to an energy budget with the objective of maximizing the number of failed requests due to jammed RBs. By jamming the RBs, the adversary reduces the RL algorithm's reward. As this reward is used as the input to update the RL algorithm, the performance does not recover even after the adversary stops jamming. This attack is evaluated in terms of both the recovery time and the (maximum and total) reward loss, and it is shown to be much more effective than benchmark (random and myopic) jamming attacks. Different reactive and proactive defense schemes such as suspending the RL algorithm's update once an attack is detected, introducing randomness to the decision process in RL to mislead the learning process of the adversary, or manipulating the feedback (NACK) mechanism such that the adversary may not obtain reliable information are introduced to show that it is viable to defend NextG network slicing against this attack, in terms of improving the RL algorithm's reward.
Improving Deep Learning for Maritime Remote Sensing through Data Augmentation and Latent Space
Sobien, Daniel; Higgins, Erik; Krometis, Justin; Kauffman, Justin; Freeman, Laura J. (MDPI, 2022-07-07)
Training deep learning models requires having the right data for the problem and understanding both your data and the models’ performance on that data. Training deep learning models is difficult when data are limited, so in this paper, we seek to answer the following question: how can we train a deep learning model to increase its performance on a targeted area with limited data? We do this by applying rotation data augmentations to a simulated synthetic aperture radar (SAR) image dataset. We use the Uniform Manifold Approximation and Projection (UMAP) dimensionality reduction technique to understand the effects of augmentations on the data in latent space. Using this latent space representation, we can understand the data and choose specific training samples aimed at boosting model performance in targeted under-performing regions without the need to increase training set sizes. Results show that using latent space to choose training data significantly improves model performance in some cases; however, there are other cases where no improvements are made. We show that linking patterns in latent space is a possible predictor of model performance, but results require some experimentation and domain knowledge to determine the best options.
Low-Latency Wireless Network Extension for Industrial Internet of Things
Fletcher, Michael; Paulz, Eric; Ridge, Devin; Michaels, Alan J. (MDPI, 2024-03-26)
The timely delivery of critical messages in real-time environments is an increasing requirement for industrial Internet of Things (IIoT) networks. Similar to wired time-sensitive networking (TSN) techniques, which bifurcate traffic flows based on priority, the proposed wireless method aims to ensure that critical traffic arrives rapidly across multiple hops to enable numerous IIoT use cases. IIoT architectures are migrating toward wirelessly connected edges, creating a desire to extend TSN-like functionality to a wireless format. Existing protocols possess inherent challenges to achieving this prioritized low-latency communication, ranging from rigidly scheduled time division transmissions, scalability/jitter of carrier-sense multiple access (CSMA) protocols, and encryption-induced latency. This paper presents a hardware-validated low-latency technique built upon receiver-assigned code division multiple access (RA-CDMA) techniques to implement a secure wireless TSN-like extension suitable for the IIoT. Results from our hardware prototype, constructed on the IntelFPGA Arria 10 platform, show that (sub-)millisecond single-hop latencies can be achieved for each of the available message types, ranging from 12 bits up to 224 bits of payload. By achieving one-way transmission of under 1 ms, a reliable wireless TSN extension with comparable timelines to 802.1Q and/or 5G is achievable and proven in concept through our hardware prototype.
Multi-Hop User Equipment (UE) to UE Relays for MANET/Mesh Leveraging 5G NR Sidelink
Shyy, DJ; Luu, Cuong; Xu, John D.; Liu, Lingjia; Erpek, Tugba; Gabay, David; Bate, David (ACM, 2023-12-06)
This paper provides use cases to adapt 5G sidelink technology to enable multi-hop User Equipment (UE)-to-UE (U2U) and UE-to- Network relaying in 3GPP standards. Such a capability could enable groups of users to communicate with each other when operating at the periphery or outside a network’s coverage area, with commercial and public safety benefits. This paper compares routing protocols to enable sidelink with U2U relay to support a Mobile Ad hoc Network (MANET). A gap analysis of current 3rd Generation Partnership Project (3GPP) Release 18 (R-18) specifications is performed to determine the missing procedures to enable multi-hop U2U relaying, along with a proposed candidate protocol to fill the gap. The candidate protocol can be submitted as a contribution to 3GPP TSG Service and System Aspects (SA) Working Group 2 (WG2) as proposed changes to the 5G architecture in 3GPP Release 19 (R-19).
Review: Is design data collection still relevant in the big data era? With extensions to machine learning
Freeman, Laura J. (Wiley, 2023-06)
Slowly but surely: Exposure of communities and infrastructure to subsidence on the US east coast
Ohenhen, Leonard; Shirzaei, Manoochehr; Barnard, Patrick L. (Oxford University Press, 2024-01-02)
Coastal communities are vulnerable to multihazards, which are exacerbated by land subsidence. On the US east coast, the high density of population and assets amplifies the region's exposure to coastal hazards. We utilized measurements of vertical land motion rates obtained from analysis of radar datasets to evaluate the subsidence-hazard exposure to population, assets, and infrastructure systems/facilities along the US east coast. Here, we show that 2,000 to 74,000 km² land area, 1.2 to 14 million people, 476,000 to 6.3 million properties, and >50% of infrastructures in major cities such as New York, Baltimore, and Norfolk are exposed to subsidence rates between 1 and 2 mm per year. Additionally, our analysis indicates a notable trend: as subsidence rates increase, the extent of area exposed to these hazards correspondingly decreases. Our analysis has far-reaching implications for community and infrastructure resilience planning, emphasizing the need for a targeted approach in transitioning from reactive to proactive hazard mitigation strategies in the era of climate change.
SpaceDrones 2.0 — Hardware-in-the-Loop Simulation and Validation for Orbital and Deep Space Computer Vision and Machine Learning Tasking Using Free-Flying Drone Platforms
Peterson, Marco; Du, Minzhen; Springle, Bryant; Black, Jonathan (MDPI, 2022-05-06)
The proliferation of reusable space vehicles has fundamentally changed how assets are injected into the low earth orbit and beyond, increasing both the reliability and frequency of launches. Consequently, it has led to the rapid development and adoption of new technologies in the aerospace sector, including computer vision (CV), machine learning (ML)/artificial intelligence (AI), and distributed networking. All these technologies are necessary to enable truly autonomous “Human-out-of-the-loop” mission tasking for spaceborne applications as spacecrafts travel further into the solar system and our missions become more ambitious. This paper proposes a novel approach for space-based computer vision sensing and machine learning simulation and validation using synthetically trained models to generate the large amounts of space-based imagery needed to train computer vision models. We also introduce a method of image data augmentation known as domain randomization to enhance machine learning performance in the dynamic domain of spaceborne computer vision to tackle unique space-based challenges such as orientation and lighting variations. These synthetically trained computer vision models then apply that capability for hardware-in-the-loop testing and evaluation via free-flying robotic platforms, thus enabling sensor-based orbital vehicle control, onboard decision making, and mobile manipulation similar to air-bearing table methods. Given the current energy constraints of space vehicles using solar-based power plants, cameras provide an energy-efficient means of situational awareness when compared to active sensing instruments. When coupled with computationally efficient machine learning algorithms and methods, it can enable space systems proficient in classifying, tracking, capturing, and ultimately manipulating objects for orbital/planetary assembly and maintenance (tasks commonly referred to as In-Space Assembly and On-Orbit Servicing). Given the inherent dangers of manned spaceflight/extravehicular activities (EVAs) currently employed to perform spacecraft maintenance and the current limitation of long-duration human spaceflight outside the low earth orbit, space robotics armed with generalized sensing and control and machine learning architecture have a unique automation potential. However, the tools and methodologies required for hardware-in-the-loop simulation, testing, and validation at a large scale and at an affordable price point are in developmental stages. By leveraging a drone’s free-flight maneuvering capability, theater projection technology, synthetically generated orbital and celestial environments, and machine learning, this work strives to build a robust hardware-in-the-loop testing suite. While the focus of the specific computer vision models in this paper is narrowed down to solving visual sensing problems in orbit, this work can very well be extended to solve any problem set that requires a robust onboard computer vision, robotic manipulation, and free-flight capabilities.
A statistical framework for domain shape estimation in Stokes flows
Borggaard, Jeffrey T.; Glatt-Holtz, Nathan E.; Krometis, Justin (IOP, 2023-08-01)
We develop and implement a Bayesian approach for the estimation of the shape of a two dimensional annular domain enclosing a Stokes flow from sparse and noisy observations of the enclosed fluid. Our setup includes the case of direct observations of the flow field as well as the measurement of concentrations of a solute passively advected by and diffusing within the flow. Adopting a statistical approach provides estimates of uncertainty in the shape due both to the non-invertibility of the forward map and to error in the measurements. When the shape represents a design problem of attempting to match desired target outcomes, this ‘uncertainty’ can be interpreted as identifying remaining degrees of freedom available to the designer. We demonstrate the viability of our framework on three concrete test problems. These problems illustrate the promise of our framework for applications while providing a collection of test cases for recently developed Markov chain Monte Carlo algorithms designed to resolve infinite-dimensional statistical quantities.
Training from Zero: Forecasting of Radio Frequency Machine Learning Data Quantity
Clark, William H.; Michaels, Alan J. (MDPI, 2024-07-18)
The data used during training in any given application space are directly tied to the performance of the system once deployed. While there are many other factors that are attributed to producing high-performance models based on the Neural Scaling Law within Machine Learning, there is no doubt that the data used to train a system provide the foundation from which to build. One of the underlying heuristics used within the Machine Learning space is that having more data leads to better models, but there is no easy answer to the question, “How much data is needed to achieve the desired level of performance?” This work examines a modulation classification problem in the Radio Frequency domain space, attempting to answer the question of how many training data are required to achieve a desired level of performance, but the procedure readily applies to classification problems across modalities. The ultimate goal is to determine an approach that requires the lowest amount of data collection to better inform a more thorough collection effort to achieve the desired performance metric. By focusing on forecasting the performance of the model rather than the loss value, this approach allows for a greater intuitive understanding of data volume requirements. While this approach will require an initial dataset, the goal is to allow for the initial data collection to be orders of magnitude smaller than what is required for delivering a system that achieves the desired performance. An additional benefit of the techniques presented here is that the quality of different datasets can be numerically evaluated and tied together with the quantity of data, and ultimately, the performance of the architecture in the problem domain.

Browse

Browsing Scholarly Works, Virginia Tech National Security Institute by Title

Results Per Page

Sort Options