Uncertainty Quantification in Security Aware Data Pipelines
dc.contributor.author | Dadeboe, Alberta O. | en |
dc.contributor.committeechair | Ampadu, Paul K. | en |
dc.contributor.committeemember | Stavrou, Angelos | en |
dc.contributor.committeemember | Yi, Yang | en |
dc.contributor.department | Electrical and Computer Engineering | en |
dc.date.accessioned | 2025-06-13T12:46:21Z | en |
dc.date.available | 2025-06-13T12:46:21Z | en |
dc.date.issued | 2025-05-09 | en |
dc.description.abstract | With the recent rise in connected devices through the Internet of Things and interconnected cyberphysical systems, the diversity and volume of data have expanded. Proper management of sensitive information collected and processed through data pipelines is crucial. Traditional data pipelines usually perform error analysis only on the final pipeline output, after a detection model. As a result, they miss malicious attacks or data corruption that occur earlier in the pipeline. Providing assurance of security throughout all stages of pipeline processing can improve credibility at a more fine-grained level. This thesis introduces a combination of data pipeline augmentation capabilities aimed at estimating the uncertainty of computations while continuously monitoring shifts in the data at every pipeline stage. The proposed framework integrates uncertainty quantification (UQ), data provenance tracking, sensitivity analysis, and tunable alerts to understand parameter influence on function outputs, methodically detect potential corruptions, maintain a meticulous audit trail, and alert observers to suspicious activity. This contribution advances conventional data pipeline anomaly detection by combining fault-sensitive execution and full fault traceability with continuous estimation of uncertainty for each pipeline stage. | en |
dc.description.abstractgeneral | In practically every industry, data is a commodity that is integrated into almost every operation. This data constantly moves through different manipulations so that observations can be made, conclusions drawn, and decisions supported. The combined movement and manipulation constitutes a data pipeline, with some manipulations mimicking real-life systems for isolated testing and modification. Because most data pipelines handle data that comes from different sources and may be private or sensitive, the security of these pipelines is of utmost importance. Companies and individuals who rely on the outputs of data pipelines also need some kind of guarantee that the results are reliable. Just as surveillance precedes threat response in normal security operations, active monitoring of changes and trends in the data pipeline during operation is necessary for informed reactions and, subsequently, improvements to the pipeline's operation. This work explores the use of uncertainty quantification, a kind of confidence measure, to provide assurance of the reliability of every computation stage in a pipeline. Quantifying uncertainty gives all involved parties some understanding of the quality of the outputs they receive from each data evaluation made within the pipeline. The architecture proposed in this work provides continuous feedback on how uncertainty and data statistics change from stage to stage in the pipeline. Coupled with an alert system, this feedback draws timely attention to potential attacks aimed at corrupting the data or computations of stages in the pipeline. | en |
dc.description.degree | Master of Science | en |
dc.description.sponsorship | STTR Phase I Contract No. 80NSSC24PB234 | en |
dc.format.medium | ETD | en |
dc.format.mimetype | application/pdf | en |
dc.identifier.uri | https://hdl.handle.net/10919/135504 | en |
dc.language.iso | en | en |
dc.publisher | Virginia Tech | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | uncertainty quantification | en |
dc.subject | sensitivity analysis | en |
dc.subject | entropy | en |
dc.subject | provenance | en |
dc.subject | security | en |
dc.title | Uncertainty Quantification in Security Aware Data Pipelines | en |
dc.type | Thesis | en |
dc.type.dcmitype | Text | en |
thesis.degree.discipline | Computer Engineering | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | masters | en |
thesis.degree.name | Master of Science | en |