MiaZhou112/etl-pipeline

Python Version

Tested with Python 3.10.10

Project Overview

This is a simple Python-based ETL pipeline with three main components:

  • Extractor: Reads data from a source (MySQL database)
  • Transformer: Cleans and transforms the data, including joining multiple tables and updating date types
  • Loader: Loads the data into different destinations (MySQL database, local CSV)
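
To make the three roles concrete, here is a minimal sketch of how such components could fit together. It is not the code in etl.py: the method names, the `customers` and `orders` tables, and their columns are all hypothetical, and an in-memory SQLite database stands in for MySQL so the example runs with the standard library only. "Updating date types" is read here as converting date strings into date objects, which is an assumption.

```python
import csv
import io
import sqlite3
from datetime import datetime


class Extractor:
    """Reads whole tables through a database connection."""

    def __init__(self, conn):
        self.conn = conn

    def extract(self, table):
        # Table names cannot be bound as parameters, so pass trusted names only.
        # conn.execute() is a sqlite3 shortcut; a MySQL driver would use a cursor.
        cur = self.conn.execute(f"SELECT * FROM {table}")
        cols = [c[0] for c in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]


class Transformer:
    """Joins orders to customers and converts date strings to date objects."""

    def transform(self, orders, customers):
        by_id = {c["id"]: c for c in customers}
        out = []
        for o in orders:
            customer = by_id.get(o["customer_id"])
            if customer is None:
                continue  # inner join: drop orders without a matching customer
            out.append({
                "order_id": o["id"],
                "customer": customer["name"],
                "order_date": datetime.strptime(o["order_date"], "%Y-%m-%d").date(),
            })
        return out


class Loader:
    """Writes rows as CSV to any text file object."""

    def load_csv(self, rows, fh):
        writer = csv.DictWriter(fh, fieldnames=["order_id", "customer", "order_date"])
        writer.writeheader()
        writer.writerows(rows)


def run_demo():
    # Hypothetical sample data; SQLite is a stand-in for the MySQL source.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customers (id INTEGER, name TEXT);
        CREATE TABLE orders (id INTEGER, customer_id INTEGER, order_date TEXT);
        INSERT INTO customers VALUES (1, 'Ada');
        INSERT INTO orders VALUES (10, 1, '2024-01-05'), (11, 99, '2024-01-06');
    """)
    extractor = Extractor(conn)
    rows = Transformer().transform(
        extractor.extract("orders"), extractor.extract("customers")
    )
    buf = io.StringIO()
    Loader().load_csv(rows, buf)
    conn.close()
    return rows, buf.getvalue()
```

Keeping each stage behind its own class means a loader for a different destination can be added without touching extraction or transformation.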

File Structure

  • etl.py: Contains the ETL classes
  • config.loader.py: Contains the function for loading database credentials from YAML
  • main.py: Main script to run the ETL process
  • data/: Sample input/output files
  • configs/db_config.yaml: Stores database connection info like host, user, password
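
As an illustration of the credential-loading step, the sketch below parses the YAML file and checks that the expected keys are present. It is an assumption-laden example, not the repository's loader: the function names are hypothetical, the file is assumed to hold `host`, `user`, and `password` as flat top-level keys, and parsing relies on PyYAML's `yaml.safe_load`.

```python
REQUIRED_KEYS = ("host", "user", "password")  # assumed flat top-level keys


def validate_db_config(cfg):
    """Return cfg unchanged if it is a mapping with all required keys set."""
    if not isinstance(cfg, dict):
        raise ValueError("database config must be a mapping")
    missing = [k for k in REQUIRED_KEYS if not cfg.get(k)]
    if missing:
        raise ValueError(f"missing database config keys: {', '.join(missing)}")
    return cfg


def load_db_config(path="configs/db_config.yaml"):
    """Parse the YAML file and validate it. Requires PyYAML."""
    import yaml  # imported lazily so validate_db_config works without PyYAML

    with open(path, encoding="utf-8") as fh:
        return validate_db_config(yaml.safe_load(fh))
```

Failing early on a missing key gives a clearer error than a connection failure later in the run.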

How to Run

pip install -r requirements.txt
python main.py
