Documentation

Python Files

cnn_scraper.py: Used to scrape CNN articles. Accesses the publicly available CNN API.

fox_scraper.py: Used to scrape Fox News articles. Accesses the back-end API to collect article information, then follows the returned links to fetch each article.

reuters_scraper.py: Used to scrape Reuters articles. Scrapes a list of articles, then follows those links to fetch each article.

ibm_sentiment.py: Uses the IBM Watson Natural Language Understanding API to analyze the sentiment of each corpus of articles. Targets are set so that sentiment is analyzed for the keywords "Trump" and "Biden".

See the inline comments for detailed descriptions of what each function does. Note that the scrapers take a long time to run (sometimes 1-2 hours or more), while the sentiment analysis takes roughly 20-30 minutes.

Output Files

In the file names below, XXX stands for the source prefix (cnn, fox, or reuters).

XXX_urls.csv: Contains the data for each corpus. Every file contains the publish date, headline, url, and trump/biden sentiment scores. Depending on the source, it may also include authors or category.

XXX_scores.csv: Output of ibm_sentiment.py for each corpus. These scores can be manually copied and pasted into the respective XXX_urls.csv file to populate the sentiment columns.

XXX_body.txt: Output of the XXX_scraper.py scripts. Each file contains one article per line, in the same order as the rows of the respective XXX_urls.csv file.
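Because the three output files for a source are aligned by position, they can also be combined in code rather than by copy and paste. The following is a minimal sketch of that idea, not the project's own tooling. The column names `trump` and `biden`, the presence of header rows, and the function name `load_corpus` are assumptions made for illustration; check them against the actual CSV files before relying on this.

```python
import csv
import io

# Hypothetical names of the sentiment columns; adjust to the real CSV headers.
SCORE_COLUMNS = ("trump", "biden")


def load_corpus(urls_file, scores_file, body_file):
    """Combine metadata, scores, and article text, matching rows by position.

    Each argument is an open text file (or any iterable of lines).
    Assumes both CSV files start with a header row.
    """
    url_rows = list(csv.DictReader(urls_file))
    score_rows = list(csv.DictReader(scores_file))
    bodies = [line.rstrip("\n") for line in body_file]

    # The files are only meaningful together if they have the same length.
    if not (len(url_rows) == len(score_rows) == len(bodies)):
        raise ValueError("files are out of step: row counts differ")

    corpus = []
    for meta, scores, body in zip(url_rows, score_rows, bodies):
        record = dict(meta)
        for column in SCORE_COLUMNS:
            record[column] = scores[column]  # fill the empty score columns
        record["body"] = body
        corpus.append(record)
    return corpus
```

With real files, this could be called as `load_corpus(open("cnn_urls.csv", newline=""), open("cnn_scores.csv", newline=""), open("cnn_body.txt"))`. The length check is there because a positional join silently misaligns every later row if one article is missing from any file.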

Other File

chromedriver: Required by the selenium package to operate the scraper.

API Documentation

The IBM Watson Natural Language Understanding API documentation can be found at https://cloud.ibm.com/apidocs/natural-language-understanding?code=python#sentiment
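To illustrate how targeted sentiment fits together, here is a minimal sketch of the request body and of reading per-target scores from a response. It is not the code in ibm_sentiment.py. The helper names `build_sentiment_request` and `target_scores` are hypothetical, the response layout (`sentiment` → `targets` → `text`/`score`) is assumed from the public API documentation linked above, and the sample response values are invented for the example.

```python
def build_sentiment_request(text, targets=("Trump", "Biden")):
    """Build the JSON body for an analyze request with targeted sentiment."""
    return {
        "text": text,
        "features": {"sentiment": {"targets": list(targets)}},
    }


def target_scores(response, targets=("Trump", "Biden")):
    """Map each target keyword to its score, or None if it is absent.

    Assumes the response has the shape
    {"sentiment": {"targets": [{"text": ..., "score": ...}, ...]}}.
    """
    found = {
        entry["text"]: entry["score"]
        for entry in response.get("sentiment", {}).get("targets", [])
    }
    return {name: found.get(name) for name in targets}


# Invented sample response, for illustration only.
sample_response = {
    "sentiment": {
        "targets": [
            {"text": "Trump", "score": -0.42, "label": "negative"},
            {"text": "Biden", "score": 0.18, "label": "positive"},
        ],
        "document": {"score": -0.10, "label": "negative"},
    }
}

scores = target_scores(sample_response)
```

When using the official Python SDK instead of raw JSON, the same options are normally expressed as `Features(sentiment=SentimentOptions(targets=[...]))`; see the linked documentation for authentication and the exact call.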

Other attributions

Select portions of the scraper code have been adapted from MP2.1.

Video Demonstration

https://mediaspace.illinois.edu/media/t/1_ksj5ytyq

