Inspiration

While brainstorming, I knew I wanted to build something in the cybersecurity space. I noticed that many antivirus apps work at the network level, using a method known as packet sniffing. However, very few of them use machine learning to estimate how likely a packet is to be malicious.

What it does

PacketGuard is a machine learning-based network antivirus. It continuously inspects every packet sent from or received by your device and uses a pre-trained machine learning model to predict the likelihood that each packet is malicious. If it deems a packet malicious, it promptly informs the user. It also includes several additional features: an easy-to-use dashboard, blocking of malicious packets (on top of flagging them), and a custom blacklist of IPs and ports.
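The flag, block, and blacklist behavior described above can be sketched roughly as follows. This is only an illustration, not PacketGuard's actual code: the `PacketPolicy` class, its field names, and the 0.8 threshold are all hypothetical, and the score is assumed to be a probability between 0 and 1 produced by the model.

```python
from dataclasses import dataclass, field


@dataclass
class PacketPolicy:
    """Hypothetical policy: custom blacklist first, then a model-score cutoff."""
    threshold: float = 0.5  # assumed cutoff, not taken from the project
    blocked_ips: set = field(default_factory=set)
    blocked_ports: set = field(default_factory=set)

    def decide(self, src_ip: str, dst_port: int, score: float) -> str:
        """Return 'block', 'flag', or 'allow' for a single packet."""
        # User-defined blacklist entries win regardless of the model score.
        if src_ip in self.blocked_ips or dst_port in self.blocked_ports:
            return "block"
        # Otherwise fall back to the model's predicted probability.
        if score >= self.threshold:
            return "flag"
        return "allow"


policy = PacketPolicy(threshold=0.8, blocked_ips={"203.0.113.7"}, blocked_ports={23})
print(policy.decide("203.0.113.7", 443, 0.01))   # blacklisted source IP
print(policy.decide("198.51.100.2", 443, 0.93))  # high model score
print(policy.decide("198.51.100.2", 443, 0.10))  # low model score
```

Checking the blacklist before the model score means a user's explicit rules always apply, even when the model considers a packet harmless.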

How we built it

I built the project with a combination of Python and JavaScript. Originally, I had planned to use C++ for the packet sniffing, as I expected it to be much faster than Python. However, after realizing that it would be difficult to integrate Python, C++, and JavaScript in a single project, I chose to use only Python and JavaScript. Python was also used to train the machine learning model. The model was trained on 12,000,000 pre-classified packets. The datasets were downloaded from Kaggle and are linked below. The trained model is saved to the repository, and a separate Python program uses it to predict maliciousness for incoming packets. Finally, an Electron app written in JavaScript displays all of the data in a neat fashion.

Kaggle Dataset 1: https://www.kaggle.com/datasets/advaitnmenon/network-traffic-data-malicious-activity-detection
Kaggle Dataset 2: https://www.kaggle.com/datasets/agungpambudi/network-malware-detection-connection-analysis
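Before a model can score a sniffed packet, the raw bytes have to be turned into numeric fields. The sketch below shows one way that step might look using only Python's standard library. It is an assumption for illustration: the `ipv4_features` function and the choice of fields are hypothetical, and the features PacketGuard actually uses are not described here.

```python
import socket
import struct


def ipv4_features(packet: bytes) -> dict:
    """Extract a few fields from a raw IPv4 header (options are not parsed)."""
    if len(packet) < 20:
        raise ValueError("need at least 20 bytes for an IPv4 header")
    # !BBHHHBBH4s4s = version/IHL, TOS, total length, ID, flags/fragment,
    # TTL, protocol, checksum, source address, destination address.
    (ver_ihl, _tos, total_len, _ident, _frag,
     ttl, proto, _checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    if ver_ihl >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    return {
        "header_len": (ver_ihl & 0x0F) * 4,  # IHL is counted in 32-bit words
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,  # 6 = TCP, 17 = UDP
        "src_ip": socket.inet_ntoa(src),
        "dst_ip": socket.inet_ntoa(dst),
    }


# Hand-built sample header: TCP (protocol 6), TTL 64, 40 bytes total.
sample = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                     socket.inet_aton("192.0.2.1"),
                     socket.inet_aton("198.51.100.2"))
features = ipv4_features(sample)
print(features)
```

A dictionary like this could then be converted into whatever feature vector the saved model expects.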

Challenges we ran into

Creating and training the model was technically challenging: although I have trained AI models in the past, I had never worked with a dataset as demanding as this one. Due to the sheer size of the dataset, each training run took up to 30 minutes, during which I was unable to work on that part of the project. Working solo made this even harder.

Accomplishments that we're proud of

Despite the challenges above, I was able to get past them after many hours of work. Training the machine learning model was slow, but I used the time while it was training to write and improve the prediction program. I also learned a lot more about networks and cybersecurity through this project, which I'm proud of.

What we learned

Overall, I learned a lot more about cybersecurity, data transfer protocols in particular, and how to effectively shield a device from malicious packets. I also learned a lot about integrating machine learning with cybersecurity, which is difficult to learn without hands-on experience, as there is very little information about this topic online.
