AntonioSerrano/Magic-Keys-a-predictive-text-application-with-R

Magic Keys

Magic Keys is a simple predictive text application that works much like a cell phone's keyboard, suggesting the next word to enter when writing an email or replying to a message. It is written in R and uses the Shiny package to build the interactive web application. This project was developed to complete the Capstone Project of the Data Science Specialization at Coursera. It is inspired by other works such as Word Psychic and Next word prediction. For more details about the n-gram model employed, take a look at this short presentation.
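
To make the idea of n-gram next-word prediction concrete, here is a minimal base-R sketch of a bigram predictor. It is an illustration only, not the model used by Magic Keys: the toy corpus and the names `corpus`, `bigram_counts`, and `predict_next` are hypothetical, and the sketch has no smoothing or back-off.

```r
# Toy corpus (hypothetical); a real model would train on far more text.
corpus <- c("i want to go home",
            "i want to eat",
            "i need to go now")

# Split each sentence into lowercase word tokens.
tokens <- strsplit(tolower(corpus), "\\s+")

# Build (previous word, next word) pairs within each sentence.
pairs <- do.call(rbind, lapply(tokens, function(w) {
  if (length(w) < 2) return(NULL)
  data.frame(prev = head(w, -1), nxt = tail(w, -1), stringsAsFactors = FALSE)
}))

# Count how often each bigram occurs.
bigram_counts <- aggregate(n ~ prev + nxt,
                           data = transform(pairs, n = 1L),
                           FUN = sum)

# Return up to k candidate next words, most frequent first.
predict_next <- function(word, k = 3) {
  hits <- bigram_counts[bigram_counts$prev == tolower(word), ]
  hits <- hits[order(-hits$n, hits$nxt), ]
  head(hits$nxt, k)
}

predict_next("to")  # "go" "eat"
```

With this toy corpus, "to go" occurs twice and "to eat" once, so "go" is ranked first. A word that never appears as a bigram prefix simply returns an empty character vector.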

Dependencies

  1. Install the dependencies for R:

    1. utils: to unzip files
      suppressWarnings(install.packages("utils"))
    2. qdapRegex: regular expression removal, extraction, and replacement tools to clean the training text.
      suppressWarnings(install.packages("qdapRegex"))
    3. tm: basic framework for text mining applications within R.
      suppressWarnings(install.packages("tm"))
    4. slam: to compute frequencies from tm Term-Document Matrices.
      suppressWarnings(install.packages("slam"))
    5. textreg: to convert a tm corpus into a character vector.
      suppressWarnings(install.packages("textreg"))
    6. parallel: for parallel computation.
      suppressWarnings(install.packages("parallel"))
    7. RWeka: to tokenize words from text.
      suppressWarnings(install.packages("RWeka"))
    8. stringr: to split matrix columns as part of building the n-grams.
      suppressWarnings(install.packages("stringr"))
    9. digest: to apply cryptographic hash functions to benchmark text.
      suppressWarnings(install.packages("digest"))
    10. data.table: for faster data manipulation.
      suppressWarnings(install.packages("data.table"))
    11. shiny: to build web apps and deploy them on RStudio servers.
      suppressWarnings(install.packages("shiny"))
    12. DT: to display R dataframes as tables on HTML pages.
      suppressWarnings(install.packages("DT"))
  2. About RWeka and macOS. There seems to be a compatibility problem between RWeka and Java on macOS. To work around it, try this:

    1. On your terminal:
      sudo R CMD javareconf
    2. On R:
      install.packages("rJava",type='source')
    3. On terminal:
      sudo ln -f -s $(/usr/libexec/java_home)/jre/lib/server/libjvm.dylib /usr/local/li
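
The package installation in step 1 can also be sketched as a single helper that installs only what is missing. This is a convenience sketch, not part of the original project: `cran_pkgs`, `missing_pkgs`, and `install_missing` are hypothetical names, and it assumes a standard R installation in which utils and parallel already ship with base R and so need no separate install.

```r
# CRAN packages named in the list above. utils and parallel are left out
# because they ship with base R (assumption: a standard R installation).
cran_pkgs <- c("qdapRegex", "tm", "slam", "textreg", "RWeka", "stringr",
               "digest", "data.table", "shiny", "DT")

# Hypothetical helper: which of `pkgs` are not yet in `installed`?
missing_pkgs <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# Hypothetical helper: install only the missing packages and
# return their names invisibly.
install_missing <- function(pkgs = cran_pkgs) {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}
```

Calling `install_missing()` once from an R session covers every CRAN package in the list. On macOS, rJava may still need the extra steps described in step 2 before RWeka loads.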

Future work

There is still much work to be done on the n-gram model. First, the corpus should be augmented with more texts from areas beyond news. Second, state-of-the-art models are nowadays based on Deep Learning, so it is worth exploring such DL models.

About

Magic Keys is a predictive text application developed to complete the Capstone Project in the Data Science Specialization offered by Johns Hopkins University in collaboration with SwiftKey via Coursera.org.
