Word-Embeddings (In progress)

This project combines web scraping, the gensim word embedding library, and vector analysis to collect large text corpuses, build a novel word embedding, and quantify certain aspects of the embedding's dimensionality.

Specifically, my goal is to analyze racial, gender, and religious bias in word embedding models constructed from different media corpuses that I am actively collecting. Fox News, CNN, and MSNBC are the three media groups I am currently collecting data on. From there, I have built a program to parse the text corpuses, build word vector spaces, and then programmatically analyze the racial biases present in the language used on air.
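As a rough illustration of the parsing step, the sketch below turns raw transcript text into lists of lowercase tokens, one list per sentence. The function names, the regular expressions, and the sample text are assumptions made for this example; they are not taken from textMiner.py or News_Vectorizer.py.

```python
import re

def split_sentences(text):
    # Naive split on whitespace that follows ., ! or ? (assumption: good enough for a sketch).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercase the sentence and keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", sentence.lower())

def build_corpus(documents):
    # Flatten a list of documents into a list of token lists, one per sentence.
    corpus = []
    for doc in documents:
        for sentence in split_sentences(doc):
            tokens = tokenize(sentence)
            if tokens:
                corpus.append(tokens)
    return corpus

# Invented sample text, for illustration only.
sample_docs = ["The senator spoke today. Critics were not impressed!"]
corpus = build_corpus(sample_docs)
```

Each inner list is one tokenized sentence, which is the shape gensim's Word2Vec accepts as its training sentences.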

With these findings, I hope to illuminate the underlying biases present in specific news media, not through single instances of bias, but by quantifying the structure of the language used on air and the bias mathematically defined within it.
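One common way to quantify bias in a vector space, shown here only as a sketch and not necessarily the method used in this project, is to define a direction from two contrasting word groups and measure how strongly other words align with it. The vectors, word lists, and function names below are invented for illustration; real vectors would come from a trained model.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors, words):
    # Component-wise mean of the vectors for the given words.
    dims = len(vectors[words[0]])
    return [sum(vectors[w][i] for w in words) / len(words) for i in range(dims)]

def bias_direction(vectors, group_a, group_b):
    # Direction pointing from the mean of group_b toward the mean of group_a.
    mean_a = mean_vector(vectors, group_a)
    mean_b = mean_vector(vectors, group_b)
    return [a - b for a, b in zip(mean_a, mean_b)]

def bias_score(vectors, word, direction):
    # Positive: the word leans toward group_a; negative: toward group_b.
    return cosine(vectors[word], direction)

# Toy 3-dimensional vectors, hypothetical values chosen for the example.
toy_vectors = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.5, 1.0, 0.0],
    "nurse": [-0.5, 1.0, 0.0],
}
direction = bias_direction(toy_vectors, ["he"], ["she"])
scores = {w: bias_score(toy_vectors, w, direction) for w in ("engineer", "nurse")}
```

With these toy values, "engineer" scores positive and "nurse" scores negative by the same magnitude. Comparing such scores across models trained on different outlets is one possible way to contrast them.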

(Updates to text corpuses and analysis will continue through August 2017)

Repository contents:

Data
.DS_Store
ANALYZING THE RACIAL DIMENSION IN FOX NEWS.docx
Fox_5_Model_1
Hannity_Model_1
News_Vectorizer.py
README.md
Word2Vec.py
geckodriver.log
textMiner.py
