A Java app that classifies .txt and .doc files by automatically parsing their contents, then exports the results to disk
(http://blog.csdn.net/yinchuandong2/article/details/17717449)
Tasks:
Chinese Word Segmenter/SVM classification algorithm/User interface design
Functions:
1. Let the user select multiple files at once from the file system
2. Classify the files the user selected
3. Export the results to disk by creating a directory named after each class label
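The export step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `ResultExporter`, the `export` method, and the choice to copy (rather than move) files are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Map;

// Hypothetical helper: the name and signature are illustrative only.
final class ResultExporter {

    /**
     * Copies every classified file into outputRoot/<class label>/,
     * creating the label directory when it does not exist yet.
     */
    static void export(Map<Path, String> labelByFile, Path outputRoot) throws IOException {
        for (Map.Entry<Path, String> entry : labelByFile.entrySet()) {
            Path source = entry.getKey();
            // One directory per class label, e.g. out/sports, out/finance.
            Path labelDir = outputRoot.resolve(entry.getValue());
            Files.createDirectories(labelDir);
            // Assumption: overwrite a file of the same name from an earlier run.
            Files.copy(source, labelDir.resolve(source.getFileName()),
                    StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

Copying keeps the user's original files in place; moving them with `Files.move` would be an equally plausible design.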
Technologies:
1. Crawl the corpus from different websites and preprocess it with word segmentation
2. Convert the segmentation results into a LibSvm training set and build the model
3. Segment the content of the target files and predict their classes with the SVM model
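As a rough sketch of step 2, the segmented words of one document can be turned into a line of the LibSvm sparse text format, `<label> <index>:<value> ...`, with feature indices in ascending order. The class `LibSvmFormatter`, the integer labels, and the use of raw term counts as feature values are assumptions for illustration; the project may weight features differently (for example with TF-IDF).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: not taken from the project's source.
final class LibSvmFormatter {

    // Maps each word to a 1-based feature index, assigned on first sight.
    private final Map<String, Integer> vocabulary = new LinkedHashMap<>();

    /** Builds one training line from a numeric class label and segmented words. */
    String toLine(int label, List<String> tokens) {
        // TreeMap keeps feature indices sorted in ascending order.
        TreeMap<Integer, Integer> counts = new TreeMap<>();
        for (String token : tokens) {
            Integer index = vocabulary.get(token);
            if (index == null) {
                index = vocabulary.size() + 1;
                vocabulary.put(token, index);
            }
            counts.merge(index, 1, Integer::sum);
        }
        StringBuilder line = new StringBuilder().append(label);
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            line.append(' ').append(e.getKey()).append(':').append(e.getValue());
        }
        return line.toString();
    }
}
```

For example, `new LibSvmFormatter().toLine(1, List.of("svm", "text", "svm"))` returns `1 1:2 2:1`. When building prediction input in step 3, words that were never seen during training would normally be skipped rather than given new indices, since the trained model has no weights for them.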