A comprehensive SQL data analysis project examining global deforestation trends from 1990 to 2016 using World Bank data. This project was completed as part of the Udacity Data Analysis curriculum.
ForestQuery, a non-profit organization focused on reducing global deforestation, requires analysis of World Bank data to identify:
- Countries and regions with shrinking forests
- Areas with the most significant forest coverage (by amount and percentage)
- Trends to inform initiatives, communications, and resource allocation
deforestation_exploration/
├── sql/ # SQL query files
│ ├── 01_create_forestation_view.sql # Creates main analytical view
│ ├── 02_global_situation_queries.sql # Global deforestation metrics
│ ├── 03_regional_outlook_queries.sql # Regional forest comparisons
│ └── 04_country_level_detail_queries.sql# Country-specific analysis
├── results/ # Query output files
│ ├── q2_results.csv # Global situation results
│ ├── q3_results.csv # Regional outlook results
│ └── q4_results.csv # Country-level detail results
├── docs/ # Documentation and reports
│ ├── template.md # Original report template
│ └── deforestation_report.html # Final HTML report (print to PDF)
├── sqlwidget/dbs/ # Database files
│ └── forest_query.sql # Source database with all tables
└── README.md # This file
The project uses three primary tables from the World Bank:
country_code- Country identifiercountry_name- Country nameyear- Year (1990-2016)forest_area_sqkm- Forest area in square kilometers
country_code- Country identifiercountry_name- Country nameyear- Year (1990-2016)total_area_sq_mi- Total land area in square miles
country_code- Country identifiercountry_name- Country nameregion- Geographic regionincome_group- World Bank income classification
Run the view creation script first:
sql/01_create_forestation_view.sqlThis creates a unified forestation view that:
- Joins all three tables (forest_area, land_area, regions)
- Converts square miles to square kilometers (1 sq mi = 2.59 sq km)
- Calculates percent of land area designated as forest
Execute each query file in order:
sql/02_global_situation_queries.sql # Save output as results/q2_results.csv
sql/03_regional_outlook_queries.sql # Save output as results/q3_results.csv
sql/04_country_level_detail_queries.sql # Save output as results/q4_results.csvEach query returns comprehensive results needed for that section of the report.
- 1990 Forest Area: 41,282,694.9 sq km
- 2016 Forest Area: 39,958,245.9 sq km
- Loss: 1,324,449 sq km (3.21%)
- Equivalent: Approximately the entire land area of Peru
- Highest Forestation (2016): Latin America & Caribbean (46.16%)
- Lowest Forestation (2016): Middle East & North Africa (2.07%)
- Regions with Decline: Latin America & Caribbean (-4.87%), Sub-Saharan Africa (-3.89%)
Success Stories:
- China: +527,229 sq km increase
- Iceland: +213.66% increase
Greatest Concerns:
- Brazil: -541,510 sq km (largest absolute loss)
- Togo: -75.45% (largest percentage loss)
- Nigeria: Appears in both top 5 lists (absolute and percent loss)
- 0-25% forest: 85 countries
- 25-50% forest: 72 countries
- 50-75% forest: 38 countries
- 75-100% forest: 9 countries
The project demonstrates proficiency in:
- ✅ Views - Creating analytical views with JOIN operations
- ✅ JOINs - INNER JOIN, self-joins for temporal comparisons
- ✅ Aggregations - SUM, COUNT, GROUP BY
- ✅ Window Functions - ROUND, ABS
- ✅ CTEs - Common Table Expressions for complex queries
- ✅ CASE Statements - Quartile categorization
- ✅ Subqueries - Nested SELECT statements
- ✅ Boolean Operators - Complex WHERE conditions with AND/OR
- ✅ ORDER BY & LIMIT - Result sorting and filtering
- ✅ UNION ALL - Combining multiple result sets
The final report includes:
- GLOBAL SITUATION - World forest area changes from 1990-2016
- REGIONAL OUTLOOK - Regional forest percentages and trends
- COUNTRY-LEVEL DETAIL - Success stories, concerns, and quartile analysis
- RECOMMENDATIONS - Strategic recommendations for ForestQuery
- APPENDIX - All SQL queries used in the analysis
Based on the analysis, recommended resource allocation:
- 40% - Sub-Saharan Africa (critical intervention needed)
- 35% - Latin America & Caribbean (large-scale impact opportunity)
- 15% - East Asia & Pacific (knowledge transfer from China's success)
Priority Countries:
- Nigeria (appears in both top 5 concern lists)
- Brazil (largest absolute forest loss)
- Togo, Uganda, Mauritania (highest percentage losses)
Success Story Studies:
- China's reforestation policies (527,229 sq km increase)
- Iceland's forest growth strategies (213.66% increase)
Open the final report in your browser:
docs/deforestation_report.htmlPrint to PDF using your browser's print function (Ctrl+P or Cmd+P).
✅ Created forestation view joining all three tables
✅ Calculated percent forest area with proper unit conversion
✅ Answered all questions in Global Situation section
✅ Answered all questions in Regional Outlook section
✅ Answered all questions in Country-Level Detail section
✅ Provided strategic recommendations
✅ Included all SQL queries in appendix
✅ Followed SQL formatting best practices
✅ Used complete sentences in report
✅ Created professional PDF-ready report
- SQL - PostgreSQL dialect
- Database - World Bank deforestation data (1990-2016)
- Export Format - CSV for results, HTML for final report
Christopher Aaron O'Hara Udacity Data Analysis Nanodegree
This project is part of the Udacity Data Analysis curriculum.
Data Source: World Bank - Forest Area and Land Area Statistics (1990-2016)