About Me
Hi, I am a Data Engineering leader with 18+ years of experience designing and operating large-scale distributed data platforms across telecom and enterprise environments.
At Jio Platforms Limited, I have led data platforms operating at 100PB+ scale across 800+ nodes, managing 100+ pipelines and leading a team of 10 engineers.
I have driven measurable impact by improving platform SLA from 65% to 87% through Spark optimization and platform reliability initiatives.
My expertise lies in Spark, the Hadoop ecosystem, enterprise data architecture, and large-scale data ingestion systems.
I am currently focused on evolving toward modern data platforms, including:
- Lakehouse architectures (Delta Lake, Databricks)
- Real-time streaming systems (Kafka, Spark Structured Streaming)
- Scalable and governed data platforms: metadata, lineage, and policy-driven controls
I stay current with evolving data platform practices and remain committed to continuous learning and professional development.
When I’m not working on exciting data projects, you can find me reading books. I believe that balance is key to a fulfilling life, both personally and professionally.
Skills
Core Expertise
- Data Engineering Leadership
- Big Data Platforms
- Distributed Data Systems
- Spark / PySpark
- Hadoop Ecosystem (Hive, HDFS, etc.)
- Object Storage Systems (Apache Ozone)
- Data Architecture
- Enterprise Data Modeling
- Batch Data Pipelines
- Large-Scale Data Ingestion
- Data Processing Optimization
- Data Governance
- Consent & Privacy Controls
- Platform Operations
- Technical Roadmapping
- Cross-Functional Leadership
- BI & Analytics Enablement
- Lakehouse Concepts
- Databricks Ecosystem
- Streaming Platform Fundamentals
- Real-Time Data Processing Concepts
Technical Stack
- Big Data & Processing: Hadoop, HDFS, Apache Ozone, Spark, PySpark
- Programming & Querying: Python, SQL, PL/SQL, Shell Scripting
- Data Architecture: Enterprise Data Modeling, Data Integration, Batch Processing, Data Lifecycle Management
- Analytics & Reporting: BI Analytics, Data Transformation, Data Correlation, Dashboarding Support
- Platform & Operations: Cluster Management, Data Platform Operations, Scheduling with Azkaban, Airflow, and NiFi
- Modern Platform Alignment: Databricks concepts, Delta Lake concepts, streaming architecture fundamentals, lakehouse architecture fundamentals
- Databases / Warehousing: Oracle, MySQL, Hive
Modern Platform/Stack Alignment (Learning)
- Streaming Architecture fundamentals including Kafka and Spark Structured Streaming
- Lakehouse Architecture fundamentals including Databricks and Delta Lake
- ClickHouse
- DuckDB
Certifications
- Data Scientist with Python - DataCamp
- Machine Learning - Stanford University (Coursera)
- Big Data Modeling - UC San Diego
- DeepLearning.AI TensorFlow Developer
Experience
Jio Platforms Limited
Deputy General Manager - Data Engineering / BI Analytics
May 2016 - Present
- Built and operated an enterprise data platform supporting 100PB+ of data across 800+ nodes, ensuring scalable, secure, and highly available infrastructure.
- Owned 100+ production data pipelines across analytics and regulatory domains, delivering reliable ETL, reporting, and downstream inputs.
- Led and mentored a team of 10 data engineers, setting objectives, running execution reviews, and driving on-time delivery and operational excellence.
- Improved platform SLA from 65% to 87% (a 22-percentage-point gain) through targeted Spark performance tuning, job optimization, and operational process changes.
- Designed and implemented large-scale ingestion and transformation pipelines using PySpark, Hive, HDFS, and Hadoop ecosystem tools for structured, semi-structured, and unstructured data.
- Defined enterprise data architecture and domain data models, standardizing schemas and metadata to enable cross-functional analytics and reuse.
- Architected backup and historical retrieval systems for long-term analytics, improving data accessibility and platform resilience.
- Partnered with business and regulatory stakeholders to translate requirements into actionable analytics, dashboards, and compliance reports.
- Established and enforced data governance and privacy controls, ensuring consent, lineage, and regulatory compliance across data flows.
- Managed application operations for Business Analytics and Regulatory teams, maintaining uninterrupted service for critical analytics applications.
IBM India Pvt Ltd
Senior Technical Services Specialist - BSS Operations
Dec 2012 - May 2016
- Managed telecom business service operations and SLA-driven issue resolution.
- Handled customer complaints and service workflows across enterprise systems.
- Supported reconciliation and business service operations.
- Escalated critical issues and ensured service continuity.
Earlier Experience
- Sistema Shyam Teleservices Limited - Senior Specialist
- Tech Mahindra Ltd. - Technical Associate
- IBM India Pvt Ltd - Operation Support Executive
- Ma Foi Management Consultants Ltd - FM Engineer
- Echo Mirror - Patent Research Analyst
Key Achievements
- Operated and scaled a 100PB+ enterprise data platform
- Improved SLA from 65% to 87%
- Automated reporting workflows, improving operational efficiency
- Received IBM Thanks Award and multiple performance recognitions
- Certificate of Recognition for Active Collaboration and Outstanding Performance in Technical Operations at the Jio All Hands Meet, Dec 2020 and Jan 2026
Education
B.Tech - Electronics & Communication
2003 - 2007
- Bachelor of Technology in Electronics and Communication Engineering from Punjab Technical University, India.
Blog & Writing
Experimenting in Python: Building a React Better-Auth Inspired Authentication Library
Medium Article
Explores building an authentication library in Python inspired by modern frontend patterns like Better Auth. Focuses on system design, modular architecture, and bridging backend engineering with developer experience.