Data Engineering Zoomcamp

This repository contains materials, code, and projects from the Data Engineering Zoomcamp.

Overview

This repository tracks my learning journey through data engineering concepts, tools, and best practices. The zoomcamp covers various aspects of modern data engineering, including data ingestion, processing, orchestration, and analytics.

Topics Covered

  • Data Ingestion: Batch and streaming data collection
  • Data Processing: Data transformation and cleaning
  • Workflow Orchestration: Automating data pipelines
  • Data Warehousing: Building and managing data warehouses
  • Analytics Engineering: dbt and data modeling
  • Batch Processing: Large-scale data processing frameworks
  • Streaming: Real-time data processing
  • Infrastructure: Cloud platforms and containerization

Technologies

The zoomcamp typically covers tools such as:

  • Python
  • Docker
  • SQL
  • Apache Spark
  • Apache Kafka
  • dbt
  • Airflow/Prefect
  • Google Cloud Platform / AWS
  • Terraform

Structure

data-engineering-zoomcamp/
├── week-1/          # Module 1 materials
├── week-2/          # Module 2 materials
├── week-3/          # Module 3 materials
└── ...

Setup

Prerequisites

  • Python 3.x
  • Docker and Docker Compose
  • Git
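As a quick sanity check before installing, the prerequisites above can be verified from Python. This is a minimal sketch, not part of the repository: the function name `missing_prerequisites` and the command names `docker` and `git` are assumptions about how the tools are installed on your machine. Docker Compose is deliberately not checked, since it may be available either as the `docker compose` plugin or as a standalone `docker-compose` binary.

```python
import shutil
import sys

# Assumed executable names; adjust them if your system uses different ones.
REQUIRED_TOOLS = ("docker", "git")


def missing_prerequisites(tools=REQUIRED_TOOLS, min_python=(3, 0)):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Compare the running interpreter against the minimum (major, minor) version.
    if sys.version_info < min_python:
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required")
    # shutil.which returns None when an executable is not found on PATH.
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems
```

Calling `missing_prerequisites()` and printing each returned entry gives a short report of anything that still needs to be installed.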

Installation

  1. Clone the repository:
     git clone https://github.com/LeviJesus/data-engineering-zoomcamp.git
     cd data-engineering-zoomcamp
  2. Create a virtual environment:
     python -m venv venv
     source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies (when available):
     pip install -r requirements.txt
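Because `requirements.txt` may not exist yet, the dependency step can be guarded so that it is skipped when the file is absent. The following is a hedged sketch under that assumption; the helper name `pip_install_command` is hypothetical and not something the repository provides.

```python
import sys
from pathlib import Path


def pip_install_command(requirements="requirements.txt"):
    """Return the pip command for the requirements file, or None if the file is absent."""
    path = Path(requirements)
    if not path.is_file():
        return None
    # sys.executable is the running interpreter, so with the virtual
    # environment activated the packages are installed into that environment.
    return [sys.executable, "-m", "pip", "install", "-r", str(path)]
```

If the function returns a command, it can be passed to `subprocess.run(command, check=True)`; if it returns `None`, there is nothing to install yet.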

Usage

Instructions for running specific modules and projects will be added as the course progresses.

Progress

  • Module 1: Introduction & Prerequisites
  • Module 2: Workflow Orchestration
  • Module 3: Data Warehouse
  • Module 4: Analytics Engineering
  • Module 5: Batch Processing
  • Module 6: Streaming
  • Final Project

Resources

Notes

This repository is for educational purposes and tracks personal learning progress through the Data Engineering Zoomcamp.

License

This project is for educational purposes.

Contact

For questions or feedback, please open an issue in this repository.
