
Embedding Network Information for Machine Learning-based Intrusion Detection

Date

2019-01-18

Publisher

Virginia Tech

Abstract

As computer networks grow and exhibit more complicated and intricate behaviors, traditional intrusion detection systems have fallen behind in their ability to protect network resources. Machine learning has moved to the forefront of intrusion detection research because of its potential to predict future behaviors. However, training these systems requires network data such as NetFlow, which contains information about relationships between hosts that requires human understanding to extract. Additionally, standard methods of encoding this categorical data struggle to capture similarities between points. To address this, we evaluate IP2Vec, a method of embedding IP addresses and transport-layer ports into a continuous space. We demonstrate this embedding on two separate datasets, CTU'13 and UGR'16, and combine the UGR'16 embedding with several machine learning methods. We compare the models with and without the embedding to evaluate the benefits of incorporating network behavior into an intrusion detection system. We show that adding the embeddings improves the F1-scores of all models on the multiclass classification problem posed by the UGR'16 data.
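
The encoding limitation mentioned in the abstract can be illustrated with a small sketch. It is not the IP2Vec method evaluated in the thesis: it swaps in plain co-occurrence counts as a stand-in for a learned embedding, and the flows, addresses, and the `cosine` helper are hypothetical values chosen only for illustration. It shows that one-hot vectors make every pair of distinct IP addresses equally dissimilar, while vectors built from observed network behavior can place hosts with similar traffic close together.

```python
import numpy as np

# Hypothetical toy flows as (source IP, destination port).
# These are illustrative values, not records from CTU'13 or UGR'16.
flows = [
    ("10.0.0.1", 80), ("10.0.0.1", 443), ("10.0.0.1", 80),
    ("10.0.0.2", 80), ("10.0.0.2", 443),
    ("10.0.0.3", 22), ("10.0.0.3", 22),
]

ips = sorted({ip for ip, _ in flows})
ports = sorted({port for _, port in flows})


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# One-hot encoding: one dimension per IP address, so distinct
# addresses are always orthogonal.
one_hot = np.eye(len(ips))

# Stand-in for an embedding (an assumption, not IP2Vec itself):
# describe each IP by how often it contacted each destination port.
context = np.zeros((len(ips), len(ports)))
for ip, port in flows:
    context[ips.index(ip), ports.index(port)] += 1

print("one-hot  10.0.0.1 vs 10.0.0.2:", cosine(one_hot[0], one_hot[1]))
print("context  10.0.0.1 vs 10.0.0.2:", cosine(context[0], context[1]))
print("context  10.0.0.1 vs 10.0.0.3:", cosine(context[0], context[2]))
```

In this toy data the two web clients have no similarity under one-hot encoding but a high similarity under the behavior-based vectors, while the SSH-only host stays dissimilar to both. A learned embedding such as IP2Vec aims for the same kind of structure in a lower-dimensional continuous space.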

Keywords

word embeddings, intrusion detection
