RT @Pamela_Moore13: People starting to wake up! Black Americans Chant 'Hillary Is Racist' Outside Clinton Fundraiser in #Miami https://t.co

—missourinewsus, identified Russian troll

Inspiration

RussianNN was inspired by the recent controversy surrounding Russian meddling in the 2016 US presidential election. Using a dataset of deleted tweets from Russian troll accounts released by NBC, along with other tweets we found online, we built a deep neural network that tries to identify whether a tweet was written by a Russian troll.

What it does

The neural network is trained on the two datasets and has three hidden layers. It maps a tweet of up to 280 characters to a single boolean value. The network uses ReLU and sigmoid activation functions and is trained with alpha decay (a learning rate that shrinks as training progresses). In the application, the user can enter a tweet and the network will report whether it resembles those produced by Russian trolls. Alternatively, the user can enter a username, and the application will test that account's most recent tweets.
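As a rough illustration of the tweet-to-boolean mapping described above, here is a minimal sketch of a forward pass through three hidden layers. It is not our actual code: the character-code encoding in `encode_tweet`, the layer sizes, the weight initialization, and the 0.5 threshold are all assumptions made for the example, and the weights here are untrained.

```python
import numpy as np

MAX_LEN = 280  # tweet length limit mentioned above


def encode_tweet(text, max_len=MAX_LEN):
    """Map a tweet to a fixed-length column vector of scaled character codes.

    Hypothetical encoding: chosen only to make the example self-contained.
    """
    codes = [min(ord(ch), 255) / 255.0 for ch in text[:max_len]]
    codes += [0.0] * (max_len - len(codes))  # pad short tweets with zeros
    return np.array(codes).reshape(max_len, 1)


def relu(z):
    return np.maximum(0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(layer_sizes, seed=0):
    """He-scaled random weights and zero biases for each layer (an assumption)."""
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.standard_normal((n_out, n_in)) * np.sqrt(2.0 / n_in)
        b = np.zeros((n_out, 1))
        params.append((W, b))
    return params


def forward(x, params):
    """ReLU on the hidden layers, sigmoid on the single output unit."""
    a = x
    for W, b in params[:-1]:
        a = relu(W @ a + b)
    W, b = params[-1]
    return sigmoid(W @ a + b)


def predict(text, params, threshold=0.5):
    """Return True if the network's output exceeds the threshold."""
    prob = forward(encode_tweet(text), params)
    return bool(prob[0, 0] > threshold)


# Placeholder sizes: 280 inputs, three hidden layers, one output unit.
params = init_params([MAX_LEN, 64, 32, 16, 1])
print(predict("People starting to wake up!", params))
```

With untrained weights the printed value is meaningless; the point is only the shape of the computation, from a padded 280-element vector down to one boolean.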

How we built it

The neural network was built from scratch in Python, using NumPy for the linear algebra. To train the network, we wrote our own functions for forward propagation, accuracy testing, gradient descent, and the other necessary steps. The application was built in Java using JFrame and also queries the Twitter API. To test a tweet with the neural network, the Java application runs a Python command.
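The training functions listed above can be sketched roughly as follows. This is a simplified example rather than our real implementation: it assumes a binary cross-entropy loss, full-batch gradient descent with a fixed learning rate, and a small synthetic dataset in place of the tweets. All function names and layer sizes are made up for the illustration.

```python
import numpy as np


def relu(z):
    return np.maximum(0, z)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def init_params(layer_sizes, seed=0):
    rng = np.random.default_rng(seed)
    return [
        (rng.standard_normal((n_out, n_in)) * np.sqrt(2.0 / n_in), np.zeros((n_out, 1)))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def forward(X, params):
    """Return the activations of every layer; X holds one example per column."""
    activations = [X]
    for i, (W, b) in enumerate(params):
        Z = W @ activations[-1] + b
        is_output = i == len(params) - 1
        activations.append(sigmoid(Z) if is_output else relu(Z))
    return activations


def loss(X, Y, params):
    """Mean binary cross-entropy (assumed loss function)."""
    A = np.clip(forward(X, params)[-1], 1e-12, 1 - 1e-12)
    return float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))


def backward(activations, Y, params):
    """Gradients of the mean cross-entropy loss with respect to each (W, b)."""
    m = Y.shape[1]
    grads = [None] * len(params)
    dZ = activations[-1] - Y  # sigmoid output combined with cross-entropy
    for i in reversed(range(len(params))):
        A_prev = activations[i]
        grads[i] = (dZ @ A_prev.T / m, dZ.sum(axis=1, keepdims=True) / m)
        if i > 0:
            W, _ = params[i]
            dZ = (W.T @ dZ) * (A_prev > 0)  # ReLU derivative of the layer below
    return grads


def gradient_descent_step(params, grads, alpha):
    return [(W - alpha * dW, b - alpha * db) for (W, b), (dW, db) in zip(params, grads)]


def accuracy(X, Y, params):
    """Fraction of examples whose thresholded output matches the label."""
    predictions = forward(X, params)[-1] > 0.5
    return float(np.mean(predictions == (Y > 0.5)))


# Synthetic stand-in data: 4 features, label is 1 when the first two sum above 0.
rng = np.random.default_rng(1)
X = rng.standard_normal((4, 200))
Y = (X[0:1] + X[1:2] > 0).astype(float)

params = init_params([4, 8, 8, 8, 1])
initial_loss = loss(X, Y, params)
for _ in range(500):
    grads = backward(forward(X, params), Y, params)
    params = gradient_descent_step(params, grads, alpha=0.1)

print(f"loss {initial_loss:.3f} -> {loss(X, Y, params):.3f}, "
      f"accuracy {accuracy(X, Y, params):.2%}")
```

Every operation works on whole matrices, with one example per column, so the same code handles one tweet or a full training set.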

Challenges we ran into

Because of the complexity of the neural network, we spent a large part of our time making sure our calculations and code were correct, especially the parts that rely on calculus and linear algebra. We had to vectorize the code to make it run faster. Afterward, we tuned the hyperparameters (learning rate, layer sizes, alpha decay, number of iterations, etc.), which was difficult because each run took a long time and there were many possible combinations. Throughout, the overall challenge was improving the accuracy of the neural network. In the Java application, we had to create a custom layout to display the GUI the way we wanted, and we had to figure out how to execute Python from a Java application.
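Two of the ideas above, vectorization and alpha decay, can be shown in a few lines. This is a sketch under assumptions: `layer_loop` and `layer_vectorized` are hypothetical helpers written for the comparison, and the inverse-time formula in `decayed_alpha` is just one common decay schedule, not necessarily the one we used.

```python
import numpy as np


def layer_loop(W, b, X):
    """Pre-activations computed one example and one unit at a time."""
    n_out, n_in, m = W.shape[0], W.shape[1], X.shape[1]
    Z = np.zeros((n_out, m))
    for j in range(m):
        for i in range(n_out):
            Z[i, j] = sum(W[i, k] * X[k, j] for k in range(n_in)) + b[i, 0]
    return Z


def layer_vectorized(W, b, X):
    """The same pre-activations as a single matrix product with broadcasting."""
    return W @ X + b


def decayed_alpha(alpha0, decay_rate, iteration):
    """Inverse-time decay (assumed schedule): alpha shrinks as iterations grow."""
    return alpha0 / (1.0 + decay_rate * iteration)


rng = np.random.default_rng(0)
W = rng.standard_normal((5, 280))
b = rng.standard_normal((5, 1))
X = rng.standard_normal((280, 20))

# Both versions produce the same numbers; the vectorized one avoids Python loops.
print(np.allclose(layer_loop(W, b, X), layer_vectorized(W, b, X)))

for iteration in (0, 100, 1000):
    print(iteration, decayed_alpha(0.1, 0.01, iteration))
```

The initial rate and the decay rate are two more hyperparameters to tune, which is part of why the search space grows so quickly.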

Accomplishments that we're proud of

Our neural network achieved a peak accuracy of 85.15% on the test dataset, which we are proud of. We were able to develop a deep neural network that detects Russian troll tweets fairly accurately.

What we learned

We learned about the low-level design of a deep neural network through this hands-on experience, something we would not have gained had we used a library to run the network for us. Much of the process was trial and error, and we gained more practical knowledge of machine learning while practicing calculus and linear algebra in a real-world application. We also learned how to integrate a Java application with a Python script and how to design a GUI with a custom layout. Neither of us had worked on a project like this before, so it was a memorable and impactful learning experience for both of us.

What's next for RussianNN

In the future, we hope to improve the accuracy of our neural network through further training, larger datasets, more hyperparameter tuning, and a more computationally powerful machine for the matrix operations. We also hope to turn the application into a Chrome extension, to be more useful to users wary of Russian trolls online, and into a more in-depth analysis tool for processing large datasets.
