This project measures the similarity between Twitter tweets using the K-Means method with Jaccard distance, and analyzes random points using a Euclidean distance matrix.
Jaccard Distance
To run the Jaccard distance code, run the following commands in the command prompt:
javac *.java
java KMeansAnalysis InitialSeeds.txt Tweets.json output.txt
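For reference, the Jaccard distance between two tweets is 1 minus the size of the intersection of their word sets divided by the size of their union. The sketch below is only an illustration and is not taken from KMeansAnalysis: the class name JaccardSketch, the helper tokens, the whitespace tokenization, and the choice to treat two empty tweets as distance 0 are all assumptions.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the project's own classes may differ.
class JaccardSketch {
    // Assumed tokenization: lowercase the text and split on whitespace.
    static Set<String> tokens(String text) {
        Set<String> words = new HashSet<>();
        for (String w : text.toLowerCase().split("\\s+")) {
            if (!w.isEmpty()) {
                words.add(w);
            }
        }
        return words;
    }

    // Jaccard distance = 1 - |A intersect B| / |A union B|.
    static double distance(String a, String b) {
        Set<String> setA = tokens(a);
        Set<String> setB = tokens(b);
        Set<String> union = new HashSet<>(setA);
        union.addAll(setB);
        if (union.isEmpty()) {
            return 0.0; // assumption: two empty tweets count as identical
        }
        Set<String> intersection = new HashSet<>(setA);
        intersection.retainAll(setB);
        return 1.0 - (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        // 2 shared words out of 6 distinct words, so the distance is roughly 0.667.
        System.out.println(distance("the quick brown fox", "the lazy brown dog"));
    }
}
```

A distance of 0 means the two tweets use exactly the same words, and a distance of 1 means they share none.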
=============================================================================
Euclidean Distance
To run the Euclidean distance code, run the following commands in the command prompt:
javac *.java
java KMeansAnalysis test_data.txt output.txt
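As a rough illustration of the distance used here, the Euclidean distance between two points is the square root of the sum of squared coordinate differences, and a K-Means assignment step picks the centroid with the smallest such distance. The class EuclideanSketch and its methods are hypothetical names for this sketch, and representing points as double arrays is an assumption, since the layout of test_data.txt is not described here.

```java
// Hypothetical sketch; the project's own classes may differ.
class EuclideanSketch {
    // Euclidean distance between two points of equal dimension.
    static double distance(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("points must have the same dimension");
        }
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Index of the centroid closest to the given point (assignment step of K-Means).
    static int nearestCentroid(double[] point, double[][] centroids) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < centroids.length; i++) {
            double d = distance(point, centroids[i]);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] centroids = { {0.0, 0.0}, {10.0, 10.0} };
        // The point (1, 1) is closer to the first centroid, so this prints 0.
        System.out.println(nearestCentroid(new double[] {1.0, 1.0}, centroids));
    }
}
```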
- I ran this code many times with different values of k and have included the top 5 results.
- They are listed below one by one.
================================================================================
NOTE: This code does not use any external libraries.