Inspiration
We are fascinated by the languages of the world and are always interested in learning about new languages! We wanted to combine this interest with machine learning to help settle our curiosity whenever we stumble upon an unknown language.
What it does
It takes a sentence as an input, and it returns the language of the sentence as an ouput.
How we built it
We used a bag of words and Naive Bayes to train and classify sentences into their appropriate language.
Challenges we ran into
Our dataset is very big (contains sentences in 235 languages). Our model was not efficient enough to train on all 235 languages, so instead we reduced the data to the ten most common languages, which allowed us to have an efficient and highly accurate model.
Accomplishments that we're proud of
We have a very high accuracy on our test set!
What we learned
We learned how to apply machine learning to solve a problem that we encounter
What's next for Language Identification
We can increase the number of languages that our model can detect. We can also improve the design of our front end
Log in or sign up for Devpost to join the conversation.