# Simple KNN and Decision Tree Implementation on the 20 Newsgroups Dataset
The 20 Newsgroups dataset is available at http://qwone.com/~jason/20Newsgroups/
The data is preprocessed in R to build the term-frequency (tf) matrix; this step is provided as an R Notebook.
Once the tf matrix is generated, the KNN and Decision Tree classifiers are applied to it.
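
As a rough illustration of the pipeline described above, the sketch below builds a small tf matrix and classifies one document with KNN. It is a minimal, standard-library-only example and not the code from this repository: the toy corpus, the helper names (`build_tf_matrix`, `knn_predict`), the whitespace tokenizer, and the Euclidean distance are all assumptions made for illustration. The Decision Tree step is not shown.

```python
from collections import Counter
import math


def build_tf_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term j in document i."""
    # Assumption: naive lowercase + whitespace tokenization.
    vocab = sorted({word for doc in docs for word in doc.lower().split()})
    index = {word: j for j, word in enumerate(vocab)}
    matrix = []
    for doc in docs:
        row = [0] * len(vocab)
        for word in doc.lower().split():
            row[index[word]] += 1
        matrix.append(row)
    return vocab, matrix


def euclidean(a, b):
    """Euclidean distance between two equal-length count vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k training rows closest to the query row."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy corpus, made up for illustration (not taken from 20 Newsgroups).
docs = [
    "the team won the hockey game",
    "hockey players skate on ice",
    "the game went to overtime",
    "the rocket reached orbit",
    "nasa launched a new rocket",
    "the shuttle is in orbit",
    "the hockey game was on ice",  # held out as the query document
]
labels = ["sport", "sport", "sport", "space", "space", "space"]

vocab, tf = build_tf_matrix(docs)
train, query = tf[:6], tf[6]
prediction = knn_predict(train, labels, query, k=3)
print(prediction)
```

The vocabulary is built over all documents so that the query row has the same columns as the training rows; with the real dataset the tf matrix would come from the R Notebook instead.
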
To run the classifiers before feature selection, run:

    python full_feature.py

To run the classifiers after feature selection, and to compute the best K and compare scores across various metrics, run:

    python top_100.py
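
The idea of selecting features and then searching for the best K can be sketched as follows. This is a hedged, standard-library-only illustration, not the contents of `top_100.py`: the selection criterion (keep the columns with the largest total count), the squared Euclidean distance, the hold-out accuracy score, and the toy matrix are all assumptions. The toy example keeps 2 features; the script name suggests 100 for the real data, which is also an assumption.

```python
from collections import Counter


def top_n_features(matrix, n):
    """Indices of the n columns with the largest total count (assumed criterion)."""
    totals = [sum(col) for col in zip(*matrix)]
    ranked = sorted(range(len(totals)), key=lambda j: totals[j], reverse=True)
    return sorted(ranked[:n])


def project(matrix, columns):
    """Keep only the selected columns of every row."""
    return [[row[j] for j in columns] for row in matrix]


def sq_dist(a, b):
    """Squared Euclidean distance (same neighbor ordering as Euclidean)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_predict(train_rows, train_labels, query, k):
    """Majority vote among the k nearest training rows."""
    order = sorted(range(len(train_rows)), key=lambda i: sq_dist(train_rows[i], query))
    return Counter(train_labels[i] for i in order[:k]).most_common(1)[0][0]


def accuracy(train_rows, train_labels, test_rows, test_labels, k):
    """Fraction of test rows whose predicted label matches the true label."""
    hits = sum(
        knn_predict(train_rows, train_labels, q, k) == y
        for q, y in zip(test_rows, test_labels)
    )
    return hits / len(test_rows)


# Toy tf matrix (hypothetical): columns 0 and 1 are informative, 2 and 3 are sparse.
train = [
    [5, 0, 1, 0], [4, 1, 0, 0], [6, 0, 0, 1],                 # class "A"
    [0, 5, 0, 1], [1, 4, 0, 0], [0, 6, 1, 0], [5, 3, 0, 0],   # class "B" (last row is noisy)
]
train_labels = ["A", "A", "A", "B", "B", "B", "B"]
test = [[5, 2, 0, 0], [1, 5, 0, 0]]
test_labels = ["A", "B"]

# Select features on the training rows only, then apply the same columns to the test rows.
selected = top_n_features(train, n=2)
train_sel, test_sel = project(train, selected), project(test, selected)

# Score each candidate K on the held-out rows and keep the first K with the top score.
scores = {k: accuracy(train_sel, train_labels, test_sel, test_labels, k) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)
print(selected, scores, best_k)
```

On this toy data the noisy training row misleads K = 1, while larger K values outvote it, which is the kind of effect a best-K search is meant to expose.
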