Financial Data ETL Pipeline

An end-to-end Data Engineering pipeline that extracts live financial data, performs automated transformations, and loads it into a persistent SQL database.

Project Overview

As a 3rd-year Computer Engineering student at Istanbul Aydin University, I developed this project to demonstrate core Data Engineering principles. The system automates the flow of market data while ensuring data integrity and persistence, skills that are essential in data-driven sectors like banking and finance.

Tech Stack

Language: Python 3.x
Libraries: Pandas, SQLAlchemy, yfinance
Database: SQLite (Relational Storage)
Workflow: ETL (Extract, Transform, Load)

How It Works

1. Extract

Using the yfinance API, the system retrieves 1-minute interval market data (e.g., BTC-USD) for the last 24 hours.

Error Handling: try-except blocks wrap the API call so that connection issues do not crash the pipeline.
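
The extract step could be sketched roughly as follows. This is a minimal illustration, not the code in data_ingestion.py: the function names `extract` and `fetch_from_yfinance`, the injectable `fetch` parameter, and the choice to return an empty DataFrame on failure are all assumptions made for the example. The only yfinance call used is `Ticker(...).history(period=..., interval=...)`.

```python
import logging

import pandas as pd

logger = logging.getLogger("extract")


def fetch_from_yfinance(ticker, period, interval):
    """Default fetcher; yfinance is imported lazily so the module loads without it."""
    import yfinance as yf

    return yf.Ticker(ticker).history(period=period, interval=interval)


def extract(ticker="BTC-USD", period="1d", interval="1m", fetch=fetch_from_yfinance):
    """Return raw rows for `ticker`, or an empty DataFrame if the fetch fails.

    `fetch` is a hypothetical hook (not part of yfinance) that makes the
    error-handling path easy to exercise without a network connection.
    """
    try:
        raw = fetch(ticker, period, interval)
    except Exception as exc:  # network errors, rate limits, bad symbols, ...
        logger.warning("Extract failed for %s: %s", ticker, exc)
        return pd.DataFrame()

    if raw is None or raw.empty:
        logger.warning("No rows returned for %s", ticker)
        return pd.DataFrame()
    return raw
```

Returning an empty DataFrame lets the caller decide whether to skip the run or retry, instead of letting the exception end the process.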

2. Transform

Raw data is processed using Pandas to meet production-ready standards:

Cleaning: Automatically removes unnecessary columns like Stock Splits and Dividends.
Normalization: Converts column names to lowercase for seamless SQL compatibility.
Metadata: Adds ticker symbols and ingested_at timestamps to maintain a clear Audit Trail.
Timezone Handling: Standardizes timestamps by removing UTC offsets to ensure database consistency.
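
The four transformations above might look like this in Pandas. It is a sketch under a few assumptions: the function name `transform` is hypothetical, spaces in column names are replaced with underscores in addition to lowercasing, and timestamps are converted to UTC before the offset is dropped (the project may simply drop it).

```python
from datetime import datetime, timezone

import pandas as pd


def transform(raw: pd.DataFrame, ticker: str) -> pd.DataFrame:
    """Clean a raw price frame so it can be appended to a SQL table."""
    # Cleaning: drop columns that carry no information for this dataset.
    df = raw.drop(columns=["Dividends", "Stock Splits"], errors="ignore")

    # Move the timestamp index into a regular column so it is stored too.
    df = df.reset_index()

    # Normalization: lowercase names (underscores for spaces is an assumption).
    df.columns = [str(col).lower().replace(" ", "_") for col in df.columns]

    # Timezone handling: convert tz-aware columns to UTC, then drop the offset.
    for col in df.columns:
        if isinstance(df[col].dtype, pd.DatetimeTZDtype):
            df[col] = df[col].dt.tz_convert("UTC").dt.tz_localize(None)

    # Metadata: record which ticker the rows belong to and when they were ingested.
    df["ticker"] = ticker
    df["ingested_at"] = datetime.now(timezone.utc).replace(tzinfo=None)
    return df
```

Storing naive UTC timestamps keeps every row comparable in SQLite, which has no native timezone-aware datetime type.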

3. Load

The cleaned data is written to a SQLite database (market_data.db) using SQLAlchemy.

Persistence: Uses if_exists='append' so each run adds rows to the existing table instead of overwriting it, building a continuous time-series dataset across multiple runs.
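
A minimal sketch of the load step, assuming the rows are written with Pandas' `DataFrame.to_sql` through a SQLAlchemy engine. The function name `load` and the table name `market_data` are hypothetical; only the database file name comes from this README.

```python
import pandas as pd
from sqlalchemy import create_engine


def load(df: pd.DataFrame, db_url: str = "sqlite:///market_data.db",
         table: str = "market_data") -> int:
    """Append `df` to `table` and return the number of rows passed in."""
    engine = create_engine(db_url)
    try:
        # if_exists="append" creates the table on the first run and
        # adds rows to it on every later run.
        df.to_sql(table, engine, if_exists="append", index=False)
    finally:
        engine.dispose()  # release the connection pool
    return len(df)
```

Because append never checks for existing rows, two runs that cover overlapping time windows will store the overlapping minutes twice. Deduplicating on (ticker, timestamp) before or after loading is one way to handle that.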

Results & Validation

During testing, the pipeline successfully processed and stored more than 550 records with 100% data integrity. This architecture serves as a foundation for more complex projects, such as my current work on Hybrid Energy Potential Assessment.
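
One way to check figures like these is to query the database directly. The sketch below is illustrative and is not the contents of check_db.py: the function name `validate`, the table name `market_data`, and the column names `datetime`, `close`, and `ticker` are assumptions.

```python
import sqlite3


def validate(db_path: str = "market_data.db", table: str = "market_data") -> dict:
    """Return simple integrity counts for the loaded table."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        # Rows with a missing closing price.
        null_close = con.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE close IS NULL'
        ).fetchone()[0]
        # (ticker, timestamp) pairs that appear more than once.
        duplicate_keys = con.execute(
            f'SELECT COUNT(*) FROM (SELECT 1 FROM "{table}" '
            f'GROUP BY ticker, "datetime" HAVING COUNT(*) > 1)'
        ).fetchone()[0]
    finally:
        con.close()
    return {"rows": rows, "null_close": null_close, "duplicate_keys": duplicate_keys}
```

A run could then be considered clean when `null_close` and `duplicate_keys` are both zero and `rows` matches the number of records extracted.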

Getting Started

Clone the repository:

git clone https://github.com/silapeksen/financial-data-pipeline.git

Install dependencies:
```
pip install -r requirements.txt
```
Run the pipeline:
```
python data_ingestion.py
```
Verify the database:
```
python check_db.py
```

👤 Author

3rd Year Computer Engineering Student
Passionate about Data Engineering, Automation, and Backend Systems.

"Turning raw data into meaningful insights, one pipeline at a time."
