moportilho/Wrangle-and-Analyze-Data

Wrangle-and-Analyze-Data

Udacity project for Wrangle and Analyze Data

In the real world, data rarely comes clean. I will be using Python and its libraries to gather data from various sources and in different formats. My task will be to assess the quality and tidiness of the data and then proceed to clean it. This process is referred to as data wrangling. I will document all my efforts in a Jupyter Notebook and showcase them through analyses and visualizations using Python (and its libraries) and/or SQL.

The dataset that I will be wrangling, analyzing, and visualizing is the tweet archive of my favorite Twitter user, @dog_rates, also known as WeRateDogs. I adore this account, which rates people's dogs and pairs each rating with a humorous comment about the dog. The ratings almost always have a denominator of 10, yet the numerators are almost always greater than 10: 11/10, 12/10, 13/10, and so on. The reason? Because "they're good dogs Brent." WeRateDogs has over 4 million followers and has received media coverage worldwide.
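The numerator/denominator format described above lends itself to a small extraction helper. The following is only a minimal sketch: the function name `extract_rating`, the regular expression, and the sample tweets are all illustrative assumptions, not code or data from the actual project or archive.

```python
import re

# Assumed pattern: "numerator/denominator", e.g. "13/10".
# A decimal numerator such as "9.75/10" is also accepted.
RATING_PATTERN = re.compile(r"(\d+(?:\.\d+)?)/(\d+)")


def extract_rating(text):
    """Return (numerator, denominator) for the first rating-like match, or None."""
    match = RATING_PATTERN.search(text)
    if match is None:
        return None
    return float(match.group(1)), int(match.group(2))


# Made-up example tweets, not taken from the real archive.
sample_tweets = [
    "This is Bella. She loves naps. 13/10 would pet",
    "Meet Rex. A bit shy but a very good boy. 12/10",
    "No rating in this one.",
]

ratings = [extract_rating(tweet) for tweet in sample_tweets]
```

Taking the first match is a simplification: text such as "24/7" or a date would also match, so results from a pattern like this would still need to be checked during assessment.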

WeRateDogs kindly shared their Twitter archive with Udacity, who passed it along to me via email for exclusive use in this project. This archive provides basic tweet data (tweet ID, timestamp, text, etc.) for all 5000+ of their tweets as they appeared on August 1, 2017. I'll be delving into this archive soon to extract valuable insights.

Project Steps Overview

My tasks in this project are as follows:

Step 1: Gathering data

Step 2: Assessing data

Step 3: Cleaning data

Step 4: Storing data

Step 5: Analyzing and visualizing data

Step 6: Reporting on my data wrangling efforts and on my data analyses and visualizations
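To make the gather, assess, clean, store, and analyze steps concrete, here is a minimal end-to-end sketch. It uses only the Python standard library so that it runs anywhere, and it works on a tiny made-up CSV; the column names, the table name `tweets`, and the cleaning rules (drop duplicate IDs and rows without a timestamp) are assumptions for illustration, not the notebook's actual approach.

```python
import csv
import io
import sqlite3

# Hypothetical miniature archive; the real file and its columns may differ.
RAW_CSV = """tweet_id,timestamp,text
1,2017-07-30 12:00:00,This is Bella. 13/10
2,2017-07-31 09:30:00,Meet Rex. 12/10
2,2017-07-31 09:30:00,Meet Rex. 12/10
3,,Timestamp is missing here. 11/10
"""

# Step 1: gather the data (here, from an in-memory CSV string).
rows = list(csv.DictReader(io.StringIO(RAW_CSV)))

# Step 2: assess quality by counting duplicate IDs and missing timestamps.
ids = [row["tweet_id"] for row in rows]
duplicate_count = len(ids) - len(set(ids))
missing_timestamps = sum(1 for row in rows if not row["timestamp"])

# Step 3: clean by keeping the first copy of each ID and dropping
# rows without a timestamp (an assumed rule for this sketch).
seen = set()
clean_rows = []
for row in rows:
    if row["tweet_id"] in seen or not row["timestamp"]:
        continue
    seen.add(row["tweet_id"])
    clean_rows.append(row)

# Step 4: store the cleaned rows in an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, timestamp TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO tweets VALUES (:tweet_id, :timestamp, :text)", clean_rows
)
conn.commit()

# Step 5: analyze with SQL, counting tweets per calendar day.
tweets_per_day = conn.execute(
    "SELECT date(timestamp), COUNT(*) FROM tweets "
    "GROUP BY date(timestamp) ORDER BY 1"
).fetchall()
```

Step 6, reporting, would then summarize figures such as `duplicate_count`, `missing_timestamps`, and `tweets_per_day` in prose and charts.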
