Code examples from the article Efficient Data Processing in Python: Batch vs Streaming Pipelines Explained.
- batch_pipeline.py: ETL pipeline that reads a daily order CSV, aggregates revenue by region, and writes to a destination file
- streaming_pipeline.py: Generator-based pipeline that validates and enriches order events one at a time as they arrive
Requirements: pandas
Batch:
python batch_pipeline.py
Expects orders_2024_06_01.csv with columns: order_id, order_timestamp, status, region, order_value_gbp.
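For orientation, the batch flow (read the CSV, aggregate revenue by region, write the result) might look roughly like the sketch below. It is not the code in batch_pipeline.py: the function name `run_batch`, the `completed_status` filter, the "completed" status value, and the output column `total_revenue_gbp` are assumptions made for illustration. In-memory buffers stand in for the real files.

```python
import io

import pandas as pd


def run_batch(source, destination, completed_status="completed"):
    """Read orders, total revenue per region, and write the summary as CSV."""
    orders = pd.read_csv(source, parse_dates=["order_timestamp"])

    # Assumption: only orders with this status count towards revenue.
    completed = orders[orders["status"] == completed_status]

    summary = (
        completed.groupby("region", as_index=False)["order_value_gbp"]
        .sum()
        .rename(columns={"order_value_gbp": "total_revenue_gbp"})
    )
    summary.to_csv(destination, index=False)
    return summary


# Small in-memory sample using the column names listed above.
sample = io.StringIO(
    "order_id,order_timestamp,status,region,order_value_gbp\n"
    "1,2024-06-01T09:00:00,completed,north,10.0\n"
    "2,2024-06-01T09:05:00,completed,south,20.0\n"
    "3,2024-06-01T09:10:00,cancelled,north,99.0\n"
    "4,2024-06-01T09:15:00,completed,north,5.0\n"
)
output = io.StringIO()
summary = run_batch(sample, output)
print(output.getvalue())
```

Passing file paths instead of buffers works the same way, since `pd.read_csv` and `DataFrame.to_csv` accept both.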
Streaming:
python streaming_pipeline.py
Expects order_events.jsonl, with one JSON object per line and fields: order_id, customer_id, order_value_gbp, region.
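As a rough illustration of the generator-based approach, the sketch below chains three generators so that each event is parsed, validated, and enriched one at a time. It is not the code in streaming_pipeline.py: the stage names, the validation rule (all four fields present and a positive numeric value), and the `is_high_value` enrichment with its threshold are assumptions chosen for the example.

```python
import io
import json

REQUIRED_FIELDS = ("order_id", "customer_id", "order_value_gbp", "region")


def read_events(lines):
    """Yield one parsed event per non-empty JSONL line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def validate(events):
    """Pass through only events with all required fields and a positive value."""
    for event in events:
        if not all(field in event for field in REQUIRED_FIELDS):
            continue
        value = event["order_value_gbp"]
        if isinstance(value, (int, float)) and value > 0:
            yield event


def enrich(events, high_value_threshold=100.0):
    """Add a hypothetical is_high_value flag to each event."""
    for event in events:
        yield {**event, "is_high_value": event["order_value_gbp"] >= high_value_threshold}


# In-memory stand-in for order_events.jsonl; the second event lacks a value.
sample = io.StringIO(
    '{"order_id": "A1", "customer_id": "C1", "order_value_gbp": 150.0, "region": "north"}\n'
    '{"order_id": "A2", "customer_id": "C2", "region": "south"}\n'
    '{"order_id": "A3", "customer_id": "C3", "order_value_gbp": 20.0, "region": "south"}\n'
)

processed = []
for event in enrich(validate(read_events(sample))):
    processed.append(event)
    print(event)
```

Because each stage is a generator, only one event is held in memory at a time, and the same chain works on an open file handle or any other iterable of lines.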