Hi, I'm Naveed Mohiuddin 👋

Data Engineer · AWS & Azure Certified · Chicago, IL
Building scalable cloud-native data platforms, ETL pipelines, and lakehouse architectures.

LinkedIn · Portfolio · Email


About

Data Engineer with experience building production data platforms across AWS and Azure. I design serverless ETL pipelines, optimize distributed processing with PySpark, model dimensional data, and automate everything from orchestration to deployment.

Currently: Data Engineer at Benda Infotech — building serverless AWS data pipelines (S3 → Lambda → Glue → Redshift) processing 1M+ records daily.
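The Lambda step in a pipeline like this is typically a thin trigger: parse the S3 event notification and hand the new object off to a Glue job. A minimal sketch (the `daily-ingest` job name and the argument keys are illustrative assumptions, and the actual `boto3` Glue call is stubbed out so the example stays self-contained):

```python
import json

def extract_s3_objects(event):
    """Pull (bucket, key) pairs from an S3 event notification payload."""
    return [
        (r["s3"]["bucket"]["name"], r["s3"]["object"]["key"])
        for r in event.get("Records", [])
    ]

def handler(event, context):
    """Lambda entry point: queue one Glue run per newly arrived object.

    A real deployment would call boto3.client("glue").start_job_run(**run)
    for each entry; here the run specs are just returned for inspection.
    """
    runs = []
    for bucket, key in extract_s3_objects(event):
        runs.append({
            "JobName": "daily-ingest",  # hypothetical Glue job name
            "Arguments": {"--source_path": f"s3://{bucket}/{key}"},
        })
    return {"statusCode": 200, "body": json.dumps(runs)}
```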

Previously: Software Engineer (Data Engineering) at Applied Information Sciences, building Azure-based data infrastructure for GEICO — ADF, Databricks, Delta Lake, Kafka, and Synapse across 10M+ records.

Education: MS in Computer Science from Illinois Institute of Technology · BE in Computer Science from Osmania University.


Certifications

  • AWS Certified Data Engineer – Associate
  • AWS Certified Solutions Architect – Associate
  • Microsoft Certified: Azure Fundamentals

Tech Stack

| Area | Technologies |
| --- | --- |
| Data Engineering | ETL/ELT · Data Modeling · Star Schema · SCD Type 2 · Medallion Architecture · Batch & Streaming |
| Cloud | AWS (S3, Glue, Redshift, Lambda, Athena, EventBridge) · Azure (ADF, Databricks, Synapse, ADLS Gen2) |
| Big Data | Apache Spark · PySpark · Kafka · Delta Lake · Iceberg · HDFS · Hadoop · Hive |
| Orchestration | Apache Airflow · CI/CD · Azure DevOps · Docker · Git |
| Languages | Python · SQL · Java · Linux |
| Analytics | Power BI · Athena · Synapse · Redshift |
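As an illustration of the SCD Type 2 pattern listed above, here is a minimal in-memory sketch (the column names and single-attribute key are illustrative; in practice this runs as a MERGE against Redshift or Delta Lake):

```python
from datetime import date

def scd2_upsert(dimension, incoming, today=None):
    """Apply Slowly Changing Dimension Type 2 updates.

    dimension: list of row dicts with 'key', 'value', 'valid_from',
               'valid_to' (None = still current), 'is_current'.
    incoming:  list of {'key', 'value'} rows from the source system.
    Changed rows expire the current version and append a new one,
    preserving full history instead of overwriting.
    """
    today = today or date.today()
    current = {r["key"]: r for r in dimension if r["is_current"]}
    for row in incoming:
        live = current.get(row["key"])
        if live and live["value"] == row["value"]:
            continue  # unchanged: nothing to do
        if live:  # expire the outgoing version
            live["valid_to"] = today
            live["is_current"] = False
        dimension.append({"key": row["key"], "value": row["value"],
                          "valid_from": today, "valid_to": None,
                          "is_current": True})
    return dimension
```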

Featured Projects

AWS Lakehouse & Analytics Platform
Serverless data lakehouse on AWS analyzing 100K+ Chicago crime records. Glue PySpark ETL, Athena analytics with Apache Iceberg, dimensional models in Redshift, Airflow orchestration.
AWS · S3 · Glue · Athena · Redshift · Airflow · PySpark · Iceberg · Lake Formation

ETL Weather Pipeline with Airflow
End-to-end ETL pipeline ingesting weather data from Open-Meteo API, transforming and loading into PostgreSQL using Airflow DAGs in Docker.
Apache Airflow · PostgreSQL · Docker · Python · REST API
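The transform step of a pipeline like this can be sketched as a plain function that flattens the API response into a row ready for the Postgres load (the field names follow Open-Meteo's `current_weather` response block, but treat them as an assumption; the DAG wiring and database insert are omitted):

```python
import json

def transform_weather(raw: str) -> dict:
    """Flatten an Open-Meteo current-weather JSON payload into a flat
    row dict suitable for a relational INSERT."""
    payload = json.loads(raw)
    cw = payload["current_weather"]
    return {
        "latitude": payload["latitude"],
        "longitude": payload["longitude"],
        "observed_at": cw["time"],
        "temperature_c": cw["temperature"],
        "windspeed_kmh": cw["windspeed"],
        "winddirection_deg": cw["winddirection"],
    }
```

A function like this slots into a PythonOperator task between the extract and load steps of the DAG, which also keeps the transform logic unit-testable outside Airflow.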

AWS Multi-AZ Disaster Recovery
Production-style e-commerce order system on AWS with Multi-AZ RDS, custom VPC networking, and automated failover.
RDS · EC2 · VPC · NAT Gateway · Multi-AZ

Big Data Processing with Spark
Large-scale data processing on GCP Dataproc using Spark DataFrames with optimized joins, schema validation, and aggregation.
Apache Spark · HDFS · GCP Dataproc · SQL


What I'm Looking For

Actively open to Data Engineer, Cloud Data Engineer, and AWS Data Engineer roles. If my background fits your team's needs, I'd love to connect.

📧 [email protected] · LinkedIn · Portfolio

Pinned Repositories

  1. chicago-crime-lakehouse — 🚔 Serverless data lakehouse on AWS analyzing 100K+ Chicago crimes with Apache Iceberg, Lake Formation governance, and sub-second Athena queries for $6/month. (Python)

  2. bigdata-spark-dataproc — Apache Spark big data processing on Google Cloud Dataproc using Scala, Spark SQL, and HDFS. (Scala)

  3. real-time-stream-processing-kafka-spark-gcp — Real-time stream processing using Apache Kafka and Spark Streaming on Google Cloud Dataproc. Includes Python producers/consumers, a Spark DStream word count, and full deployment with screenshots. (Python)

  4. ETL_Airflow — End-to-end ETL pipeline using Apache Airflow to automate extraction of real-time weather data from the Open-Meteo API, transform it into structured form, and load it into a PostgreSQL database. (Python)

  5. Multi-AZ-Disaster-Recovery — Multi-AZ disaster recovery system with <2 second RTO and zero data loss, featuring automated failover, a real-time monitoring dashboard, and secure VPC architecture across Availability Zones. (HTML)