This project analyzes intrusion detection on the KDD99 dataset, which has 42 attributes and 23 class labels. The analysis focuses on identifying key patterns and indicators for network security. Most records fall into three classes: Smurf attacks, Neptune attacks, and Normal traffic; the remaining attack types are comparatively infrequent. To simplify the task, the labels were grouped into four categories: Smurf, Neptune, Normal, and Others. Machine learning models, specifically neural networks, were trained to distinguish between these categories. The results are visualized in Power BI, giving a clear overview of dataset distributions and model performance. A confusion matrix reports True Positives (TP), False Positives (FP), and False Negatives (FN) for each category, showing where the models need improvement on categories of differing frequency.
This project demonstrates the effectiveness of machine learning in cybersecurity and the clarity of data visualization through Power BI. The code containing the full project can be found in the repo.
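The grouping of labels into Smurf, Neptune, Normal, and Others can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names `group_label` and `category_counts` are hypothetical, the sample labels are toy values, and the trailing period on raw labels is an assumption about how the label column is stored.

```python
from collections import Counter

# Assumption: the three frequent labels named in the text keep their own
# category; every other label is folded into "Others".
MAJOR_LABELS = {"smurf": "Smurf", "neptune": "Neptune", "normal": "Normal"}


def group_label(raw_label: str) -> str:
    """Map a raw label string to one of the four categories."""
    # Assumption: raw labels may carry a trailing period (e.g. "smurf.").
    key = raw_label.strip().rstrip(".").lower()
    return MAJOR_LABELS.get(key, "Others")


def category_counts(raw_labels):
    """Count how many records fall into each of the four categories."""
    return Counter(group_label(label) for label in raw_labels)


# Toy labels for illustration only, not real dataset counts.
sample = ["smurf.", "neptune.", "normal.", "back.", "smurf.", "teardrop."]
counts = category_counts(sample)
print(counts)
```

Folding the rare labels into a single category keeps the class count small, at the cost of hiding which rare attack a record belongs to.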
Attack records in the network data were grouped by similarity to examine the nature and patterns of different attack combinations. The high-dimensional feature space was reduced to lower-dimensional components to capture trends and patterns across attack types. The classification methods include:
A sequential neural network (SNN) was trained on the data, and the results are displayed in a confusion matrix for the different attack types:
A recurrent neural network was also trained, and its confusion matrix is shown in comparison with that of the SNN:
Similarly, the Multilayer Perceptron generated a confusion matrix showing false positives, true positives, and false negatives for the top attack types.
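The per-category true positives, false positives, and false negatives reported for each model can be read off a confusion matrix as sketched below. This is a hedged illustration: the helper names `confusion_matrix` and `per_class_counts` are hypothetical, and the labels and predictions are toy values rather than results from the project.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true labels and columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def per_class_counts(matrix, labels):
    """Derive TP, FP, and FN for each category from the matrix."""
    result = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]
        # FP: predicted as this category but actually another (column minus TP).
        fp = sum(row[i] for row in matrix) - tp
        # FN: actually this category but predicted as another (row minus TP).
        fn = sum(matrix[i]) - tp
        result[label] = {"TP": tp, "FP": fp, "FN": fn}
    return result


LABELS = ["Smurf", "Neptune", "Normal", "Others"]
# Toy values for illustration only, not model output from the project.
y_true = ["Smurf", "Smurf", "Neptune", "Normal", "Others", "Others"]
y_pred = ["Smurf", "Smurf", "Neptune", "Normal", "Normal", "Others"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
stats = per_class_counts(matrix, LABELS)
for label in LABELS:
    print(label, stats[label])
```

In this toy case one Others record is predicted as Normal, so it counts as a false negative for Others and a false positive for Normal.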
A comparative analysis of the three different machine learning models was conducted using network data. The visualization results include:
The most significant factors influencing the detection of intrusions were analyzed. This visualization highlights: