This tutorial will (hopefully) present a more verbose take on text classification and discuss a few libraries, techniques, and hacks that can come in handy when working on large-scale text classification problems. The IPython notebook "A Noob's guide to text classification" is the first part of a series of tutorials on text classification using Python and friends. This first part focuses mainly on using scikit-learn and gensim to build models that classify texts from the Reuters-21578 benchmark corpus.
Click here for the Unrendered Version
Click here for the complete Rendered Version