- Assessing the Quality of Data
- Intro to Tabular Formats
- Parsing CSV
- Parsing XLS with xlrd
- Intro to JSON
- Using Web APIs
- Intro to XML
- XML Design Principles
- Parsing XML
- Web Scraping
- Parsing HTML
- What is Data Cleaning?
- Sources of Dirty Data
- Measuring Data Quality
- A Blueprint for Cleaning
- Auditing Validity
- Auditing Accuracy
- Auditing Completeness
- Auditing Consistency
- Auditing Uniformity
- Data Modeling in MongoDB
- Introduction to PyMongo
- Field Queries
- Projection Queries
- Getting Data into MongoDB
- Using mongoimport
- Operators like $gt, $lt, $exists, $regex
- Querying Arrays and using $in and $all Operators
- Changing entries: update with $set and $unset
- Examples of Aggregation Framework
- The Aggregation Pipeline
- Aggregation Operators: $match, $project, $unwind, $group
- Multiple Stages Using a Given Operator
- Using iterative parsing for large data files
- OpenStreetMap XML Overview
- Exercises around OpenStreetMap data
- Use key data munging skills to improve OpenStreetMap data for a part of the world that you care about, and give back to the community.
- You can find the final project grading rubric here. It also contains final project submission instructions for students who are enrolled in the full course experience.