Intrusion Detection Analysis

Intrusion Detection Dashboard: KDD99 Dataset Analysis

This project analyzes network intrusion detection using the KDD99 dataset, which has 42 attributes and 23 class labels. The analysis focuses on identifying key patterns and indicators relevant to network security. The most frequent labels are Smurf, Neptune, and Normal; all other attacks are relatively infrequent. To simplify the problem, the labels were grouped into four categories: Smurf, Neptune, Normal, and Others. Neural network models were trained to distinguish between these categories. The results are visualized in Power BI, giving a clear overview of the dataset distributions and model performance. A confusion matrix shows True Positives (TP), False Positives (FP), and False Negatives (FN) for each category and points to where the models need improvement on categories that occur at different frequencies.
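
The grouping into four categories can be sketched as follows. This is a minimal illustration rather than the project's actual code: the helper name `group_label` and the sample labels are assumptions, and the only dataset detail relied on is that raw KDD99 labels carry a trailing period (for example `smurf.`).

```python
from collections import Counter

# Labels that keep their own category; every other label becomes "Others".
MAJOR_LABELS = {"smurf", "neptune", "normal"}


def group_label(raw_label: str) -> str:
    """Map a raw KDD99 label such as 'smurf.' to one of four categories."""
    name = raw_label.strip().rstrip(".").lower()
    return name.capitalize() if name in MAJOR_LABELS else "Others"


# Hypothetical sample of raw labels, not drawn from the real dataset.
sample_labels = ["smurf.", "neptune.", "normal.", "back.", "smurf.", "satan."]
grouped = [group_label(label) for label in sample_labels]
counts = Counter(grouped)
print(counts)
```

Counting the grouped labels this way also gives the per-category distribution that a dashboard would display.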

This project demonstrates the effectiveness of machine learning in cybersecurity and the clarity of data visualization through Power BI. The full project code is available in the repo.

Classification Techniques

Records in the network data were grouped by similarity to reveal the nature and patterns of different attack combinations. The high-dimensional feature space was reduced to a small number of components that capture trends and patterns across attack types. The techniques used are:

Unsupervised k-Means Clustering
Principal Component Analysis (PCA)
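
The two techniques can be combined as in the sketch below, which reduces the features with PCA and then clusters the reduced points with k-means. It is an assumption-laden illustration, not the project's implementation: it uses plain NumPy, synthetic two-cluster data in place of KDD99 features, and a deterministic farthest-first initialization chosen only to keep the example reproducible. A library such as scikit-learn would normally be used instead.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components using SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def farthest_first_init(X, k):
    """Pick X[0], then repeatedly the point farthest from chosen centroids."""
    centroids = [X[0]]
    for _ in range(k - 1):
        # Distance from every point to its nearest already-chosen centroid.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    return np.array(centroids, dtype=float)


def kmeans(X, k, n_iter=20):
    """Lloyd's algorithm; returns final cluster labels and centroids."""
    centroids = farthest_first_init(X, k)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its points (unchanged if empty).
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids


# Synthetic stand-in for scaled network features: two separated groups.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=0.5, size=(50, 5))
group_b = rng.normal(loc=5.0, scale=0.5, size=(50, 5))
X = np.vstack([group_a, group_b])

X_2d = pca(X, n_components=2)          # 5 features reduced to 2 components
labels, centroids = kmeans(X_2d, k=2)  # cluster in the reduced space
print(X_2d.shape, np.bincount(labels))
```

On real KDD99 data the numeric features would need scaling and the categorical ones encoding before either step, since both PCA and k-means are sensitive to feature scale.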

Machine Learning with Sequential Neural Network (SNN)

A sequential neural network was trained on the dataset, and the results are displayed in a confusion matrix for the different attack types:

The matrix shows high, medium, and low values of false positives, true positives, and false negatives for the top attack types.

Machine Learning with RNN

A recurrent neural network (RNN) was trained, and its confusion matrix is plotted for comparison with the SNN:

The plots display high, medium, and low values of false positives, true positives, and false negatives for the top attack types.

Machine Learning with MLP

Similarly, the Multilayer Perceptron generated a confusion matrix showing false positives, true positives, and false negatives for the top attack types.
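
The per-category counts read from each model's confusion matrix can be derived as in the following sketch. The function names and the sample predictions are hypothetical and serve only to show how TP, FP, and FN follow from the matrix; they are not outputs of the trained models.

```python
CLASSES = ["Smurf", "Neptune", "Normal", "Others"]


def confusion_matrix(y_true, y_pred, classes):
    """Build a matrix whose rows are true classes and columns predictions."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


def per_class_counts(matrix, classes):
    """Derive TP, FP, and FN for each class from the confusion matrix."""
    counts = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        # Column sum minus the diagonal: predicted as this class but wrong.
        fp = sum(row[i] for row in matrix) - tp
        # Row sum minus the diagonal: this class predicted as something else.
        fn = sum(matrix[i]) - tp
        counts[name] = {"TP": tp, "FP": fp, "FN": fn}
    return counts


# Hypothetical labels for illustration only.
y_true = ["Smurf", "Smurf", "Neptune", "Normal", "Others", "Normal"]
y_pred = ["Smurf", "Neptune", "Neptune", "Normal", "Normal", "Normal"]

matrix = confusion_matrix(y_true, y_pred, CLASSES)
counts = per_class_counts(matrix, CLASSES)
for name in CLASSES:
    print(name, counts[name])
```

Because the same function applies to any list of predictions, the SNN, RNN, and MLP outputs can be compared category by category on equal terms.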

Performance and Metrics

A comparative analysis of the three different machine learning models was conducted using network data. The visualization results include:

Accuracy of models for each attack type
False positive and false negative rates
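
These metrics can be computed from one-vs-rest counts for a single attack type, as sketched below. The definitions are the standard ones (accuracy = (TP + TN) / total, false positive rate = FP / (FP + TN), false negative rate = FN / (FN + TP)); the model names and counts are placeholders invented for the example, not results from this project.

```python
def rates(tp, fp, fn, tn):
    """Return accuracy, false positive rate, and false negative rate."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "fpr": fp / (fp + tn) if (fp + tn) else 0.0,
        "fnr": fn / (fn + tp) if (fn + tp) else 0.0,
    }


# Placeholder one-vs-rest counts for one attack type (NOT project results).
model_counts = {
    "model_a": {"tp": 90, "fp": 5, "fn": 10, "tn": 95},
    "model_b": {"tp": 85, "fp": 10, "fn": 15, "tn": 90},
}

summary = {name: rates(**c) for name, c in model_counts.items()}
for name, metrics in summary.items():
    print(name, metrics)
```

Reporting the false positive and false negative rates next to accuracy matters for infrequent attack types, where accuracy alone can look high even when most attacks of that type are missed.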

Key Influencers

The most significant factors influencing the detection of intrusions were analyzed. This visualization highlights:

Factors that trigger and influence increases in attacks
Attack patterns associated with high false positive and false negative rates