Privacy Preserving Network Security Data Analytics
DeYoung, Mark E.
The problem of revealing accurate statistics about a population while maintaining the privacy of individuals has been studied extensively in several related disciplines. Statisticians, information security experts, and computational theory researchers, to name a few, have produced extensive bodies of work on privacy preservation. Still, the need to improve our ability to control the dissemination of potentially private information is driven home by an incessant rhythm of data breaches, data leaks, and privacy exposures. History has shown that neither public nor private sector organizations are immune to loss of control over data due to lax handling, incidental leakage, or adversarial breaches. Prudent organizations should consider the sensitive nature of network security data and network operations performance data recorded as logged events. These logged events often contain data elements that are directly correlated with sensitive information about people and their activities, often at the same level of detail as sensor data. Privacy-preserving data publication has the potential to support reproducibility and exploration of new analytic techniques for network security. Providing sanitized data sets decouples privacy protection efforts from analytic research. This decoupling enables specialists to tease out the information and knowledge hidden in high-dimensional data while providing some degree of assurance that people's private information is not exposed unnecessarily. In this research, we propose methods that support a risk-based approach to privacy-preserving data publication for network security data. Our main research objective is the design and implementation of technical methods to support the appropriate release of network security data so that it can be used to develop new analytic methods in an ethical manner.
Our intent is to produce a database which holds network security data representative of a contextualized network and people's interaction with the network mid-points and end-points without the problems of identifiability.