a-set/textProcessing

textProcessing V1.0

# Features

This is a basic text processing API. It provides the following features:

  1. A Tokenizer class and a word frequency calculator, with helper methods to:
     1a. Tokenize an entire file into word lists and word frequency maps, given the file name
     1b. Tokenize sentences into words

  2. A TwoGramGenerator class, with helper methods to:
     2a. Generate all two-grams in a file, given the file name
     2b. Produce a frequency map of the two-grams

  3. A PalindromeGenerator, which scans a file and generates a frequency distribution of all the palindromes it contains.
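The three features above can be illustrated with a minimal Python sketch. This is not the repository's code or API: the function names (`tokenize`, `word_frequencies`, `two_grams`, `palindrome_frequencies`) and the `min_length` parameter are hypothetical, and the sketch assumes that a word is a case-insensitive run of letters, digits, or apostrophes, and that a palindrome is a single word of at least two characters. The actual classes may define these differently.

```python
import re
from collections import Counter

# Assumption: a "word" is a maximal run of letters, digits, or apostrophes.
# The repository's Tokenizer may use a different definition.
WORD_RE = re.compile(r"[A-Za-z0-9']+")


def tokenize(text):
    """Split text into lowercase word tokens (hypothetical helper)."""
    return [w.lower() for w in WORD_RE.findall(text)]


def word_frequencies(words):
    """Map each word to the number of times it occurs."""
    return Counter(words)


def two_grams(words):
    """Return every pair of adjacent words, in document order."""
    return list(zip(words, words[1:]))


def palindrome_frequencies(words, min_length=2):
    """Count single words that read the same forwards and backwards.

    Assumption: one-letter words are skipped via min_length.
    """
    return Counter(w for w in words if len(w) >= min_length and w == w[::-1])


words = tokenize("Anna saw a level road. Anna saw a kayak.")
print(word_frequencies(words))
print(Counter(two_grams(words)))   # frequency map of the two-grams
print(palindrome_frequencies(words))
```

Keeping tokenization separate from counting means the same word list feeds all three frequency maps, so a file only needs to be read and split once.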

# How to Use

The repository's testing package contains a set of test methods that demonstrate the use of every method on a sample input.txt file.
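As a rough idea of what such a file-based check might look like, here is a small Python sketch. It does not reproduce the repository's tests: `word_frequencies_from_file` is a hypothetical helper, and the sample text is made up and written to a temporary `input.txt` so the example does not depend on the repository's own sample file.

```python
import os
import re
import tempfile
from collections import Counter


def word_frequencies_from_file(path):
    """Read a text file and count its lowercase words (hypothetical helper)."""
    with open(path, encoding="utf-8") as handle:
        return Counter(re.findall(r"[a-z0-9']+", handle.read().lower()))


# Write a throwaway input.txt, run the helper on it, then check the counts.
with tempfile.TemporaryDirectory() as folder:
    sample = os.path.join(folder, "input.txt")
    with open(sample, "w", encoding="utf-8") as handle:
        handle.write("the cat sat on the mat\nthe end\n")
    counts = word_frequencies_from_file(sample)

assert counts["the"] == 3
assert counts["cat"] == 1
```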

About

This is a basic text processing API that tokenizes files into words and sentences and generates frequency distributions of word occurrences and two-gram occurrences. It also includes a palindrome detector that generates frequency distributions of the palindromes occurring in a file.
