Magic Keys

Magic Keys is a simple predictive text application that works much like a cell phone's keyboard, suggesting the next word to be entered when writing an email or replying to a message. It is written in R and uses the Shiny package to build the interactive web application. This project was developed to complete the Capstone Project from the Data Science Specialization at Coursera. It is inspired by other works like Word Psychic and Next word prediction. To see more details about the n-gram model employed, you can have a look at this short presentation.
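
To give a feel for how n-gram next-word prediction works in general, here is a minimal sketch in base R. It is not the model used by the app (see the presentation for that): the toy corpus, the bigram-only counts, and the `predict_next` helper are all assumptions made for illustration.

```r
# Hypothetical toy corpus; the real app is trained on a far larger text collection.
corpus <- c("i want to go home", "i want to eat", "you want to go out")

# Split each sentence into lowercase word tokens.
tokens <- strsplit(tolower(corpus), "\\s+")

# Build (previous word, next word) pairs within each sentence.
bigrams <- do.call(rbind, lapply(tokens, function(w) {
  if (length(w) < 2) return(NULL)
  data.frame(prev = head(w, -1), nxt = tail(w, -1), stringsAsFactors = FALSE)
}))

# Count how often each pair occurs.
counts <- aggregate(n ~ prev + nxt, data = cbind(bigrams, n = 1), FUN = sum)

# Return up to k candidate next words, most frequent first (ties broken alphabetically).
predict_next <- function(word, k = 3) {
  hits <- counts[counts$prev == tolower(word), ]
  hits <- hits[order(-hits$n, as.character(hits$nxt)), ]
  head(as.character(hits$nxt), k)
}

predict_next("to")
```

With this toy corpus, "to" is followed by "go" twice and "eat" once, so "go" is ranked first. A word that never appears as a predecessor yields an empty character vector, which a real application would typically handle with some back-off strategy.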

Dependencies

Install the dependencies for R:
1. utils: to unzip files
  suppressWarnings(install.packages("utils"))
2. qdapRegex: regular expression removal, extraction, and replacement tools to clean the training text.
  suppressWarnings(install.packages("qdapRegex"))
3. tm: basic framework for text mining applications within R.
  suppressWarnings(install.packages("tm"))
4. slam: to compute frequencies from tm Term-Document Matrices.
  suppressWarnings(install.packages("slam"))
5. textreg: to convert tm corpus into character vector.
  suppressWarnings(install.packages("textreg"))
6. parallel: for parallel computation.
  suppressWarnings(install.packages("parallel"))
7. RWeka: to tokenize words from text.
  suppressWarnings(install.packages("RWeka"))
8. stringr: to split columns from matrix as part of the process to make ngrams.
  suppressWarnings(install.packages("stringr"))
9. digest: to apply cryptographic hash functions to benchmark text.
  suppressWarnings(install.packages("digest"))
10. data.table: for faster data manipulation.
  suppressWarnings(install.packages("data.table"))
11. shiny: to build web apps hosted on RStudio servers.
  suppressWarnings(install.packages("shiny"))
12. DT: to display R dataframes as tables on HTML pages.
  suppressWarnings(install.packages("DT"))
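
As an alternative to running each command by hand, the list above could be handled in one pass. The following is a minimal sketch, not part of the original project: `missing_pkgs` and `install_missing` are hypothetical helper names, and `utils` and `parallel` are left out on the assumption that they already ship with base R.

```r
# Packages from the list above; utils and parallel come with base R.
pkgs <- c("qdapRegex", "tm", "slam", "textreg", "RWeka",
          "stringr", "digest", "data.table", "shiny", "DT")

# Return the subset of packages that cannot be loaded on this machine.
missing_pkgs <- function(pkgs) {
  pkgs[!vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)]
}

# Install only what is missing; does nothing when everything is present.
install_missing <- function(pkgs) {
  to_install <- missing_pkgs(pkgs)
  if (length(to_install) > 0) install.packages(to_install)
  invisible(to_install)
}

# Usage (uncomment to actually install):
# install_missing(pkgs)
```

Checking first avoids reinstalling packages that are already available, which matters for heavier dependencies such as RWeka.
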
About RWeka and Mac OS. There seems to be a small problem between RWeka and Java on Mac OS. To solve it, try this:
1. On your terminal:
  sudo R CMD javareconf
2. On R:
  install.packages("rJava",type='source')
3. On terminal:
  sudo ln -f -s $(/usr/libexec/java_home)/jre/lib/server/libjvm.dylib /usr/local/li

Future work

There is still much work to be done on the n-gram model. First of all, the corpus should be augmented with more texts from areas beyond news. Second, state-of-the-art models are nowadays based on Deep Learning, so such DL models are worth exploring.
