Operational analytics pipeline for diagnosing time leakage in IT service workflows.
This project simulates Jira/ServiceNow ticket lifecycles, computes operational metrics, identifies workflow bottlenecks, and estimates cost savings from process improvements.
- Detect workflow inefficiencies across teams and statuses
- Quantify operational time leakage in ticket lifecycles
- Simulate intervention strategies before implementation
- Generate executive-ready narratives from operational metrics
- Provide an interactive analytics dashboard via Streamlit
The pipeline transforms raw IT service ticket data into actionable operational insights.
Instead of manually analyzing ticket exports in spreadsheets, operations teams can:
• Detect workflow bottlenecks automatically
• Quantify time leakage across teams and priorities
• Estimate financial impact of delays
• Simulate process improvements before implementation
This enables faster operational decision making and measurable efficiency improvements in IT service organizations.
Time Leak Detector is an operational analytics pipeline designed to identify workflow inefficiencies in IT Service Management (ITSM) ticket lifecycles.
The system:
- Simulates realistic Jira / ServiceNow ticket data
- Computes time-based performance metrics
- Attributes operational leakage across teams and statuses
- Estimates financial impact of inefficiencies
- Enables scenario-based intervention modeling
The platform also includes an LLM-assisted executive narrative generator with structured fact validation that converts analytical results into executive-ready summaries.
The goal of this project is to demonstrate how structured analytics pipelines can surface operational inefficiencies and quantify savings opportunities in a repeatable, automated workflow.
Note: The dataset used in this project is synthetic and generated by synthetic_generator.py. ITSM operational data is rarely publicly available, so the generator simulates realistic Jira/ServiceNow ticket lifecycles including handoffs, waiting states, and resolution patterns.
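The real generator is more elaborate, but the core idea can be sketched as follows (the statuses, dwell-time distribution, and field names here are illustrative assumptions, not the generator's actual schema):

```python
import random
from datetime import datetime, timedelta

random.seed(42)

# Illustrative lifecycle stages; the real generator's statuses may differ.
STATUSES = ["NEW", "TRIAGE", "IN_PROGRESS", "WAITING", "RESOLVED"]

def simulate_ticket(ticket_id):
    """Emit one status event per stage with exponentially distributed dwell times."""
    ts = datetime(2024, 1, 1) + timedelta(hours=random.uniform(0, 24 * 30))
    events = []
    for status in STATUSES:
        events.append({"ticket_id": ticket_id, "status": status, "entered_at": ts})
        ts += timedelta(hours=random.expovariate(1 / 6.0))  # mean 6 h per status
    return events

events = [e for tid in range(2000) for e in simulate_ticket(tid)]
```

Randomized dwell times per status are what later produce realistic-looking queue and idle delays downstream.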
IT operations teams frequently rely on manual exports and spreadsheet analysis to diagnose workflow inefficiencies.
Typical workflow today:
- Export tickets from Jira or ServiceNow
- Clean and normalize timestamp data
- Manually compute cycle time and queue delays
- Build pivot tables by team, category, or priority
- Estimate cost exposure from delays
This process is:
- time-consuming
- inconsistent
- difficult to scale
Time Leak Detector automates this workflow end-to-end.
This project shows how operational analytics can be used to:
- Diagnose workflow inefficiencies in IT service management systems
- Quantify operational time leakage across teams and processes
- Simulate operational interventions before implementation
- Translate operational metrics into financial impact estimates
The system combines:
- Data engineering pipelines
- Operational analytics
- Simulation modeling
- LLM-assisted reporting
into a single reproducible workflow.
Generates realistic ITSM ticket lifecycles including:
- priorities
- handoffs
- waiting states
- resolution patterns
Validates required columns, timestamp formats, and data types before processing.
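As a rough sketch of what this validation step can look like (column names are assumptions here, and the project uses Pandera/Pydantic schemas rather than this hand-rolled check):

```python
import pandas as pd

# Assumed event-log columns; the project's actual schema may differ.
REQUIRED = {"ticket_id", "status", "entered_at"}

def validate_events(df):
    """Fail fast on missing columns or unparseable timestamps."""
    missing = REQUIRED - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    out = df.copy()
    out["entered_at"] = pd.to_datetime(out["entered_at"], errors="raise")
    return out

clean = validate_events(pd.DataFrame({
    "ticket_id": [1, 1],
    "status": ["NEW", "TRIAGE"],
    "entered_at": ["2024-01-01 09:00", "2024-01-01 11:30"],
}))
```

Failing loudly before any metric computation keeps bad timestamps from silently corrupting cycle-time results.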
Standardizes timestamps, identifiers, and canonical dataframe structures.
Computes operational metrics including:
- cycle time
- queue time (first-touch delay)
- idle time (waiting states)
- time-in-status breakdown
- handoff count
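For example, cycle time and first-touch queue time can be derived from per-ticket status events roughly like this (toy data; the pipeline's actual column names are assumptions here):

```python
import pandas as pd

events = pd.DataFrame({
    "ticket_id": [1, 1, 1, 1],
    "status": ["NEW", "TRIAGE", "IN_PROGRESS", "RESOLVED"],
    "entered_at": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 10:00",
        "2024-01-01 14:00", "2024-01-02 09:00",
    ]),
})

def ticket_metrics(g):
    g = g.sort_values("entered_at")
    cycle = g["entered_at"].iloc[-1] - g["entered_at"].iloc[0]  # open -> resolved
    queue = g["entered_at"].iloc[1] - g["entered_at"].iloc[0]   # first-touch delay
    return pd.Series({
        "cycle_hours": cycle.total_seconds() / 3600,
        "queue_hours": queue.total_seconds() / 3600,
    })

metrics = events.groupby("ticket_id").apply(ticket_metrics)
```

Idle time and time-in-status follow the same pattern, summing the gaps between consecutive status events instead of endpoints.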
Ranks time leakage by:
- team
- ticket category
- priority
- workflow status
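The ranking itself is a straightforward group-and-sum; a minimal sketch, with leakage assumed here to be queue plus idle time:

```python
import pandas as pd

tickets = pd.DataFrame({
    "team":        ["Tier1", "Tier1", "Tier2"],
    "priority":    ["P3", "P3", "P1"],
    "queue_hours": [4.0, 2.0, 1.0],
    "idle_hours":  [10.0, 6.0, 2.0],
})
tickets["leakage_hours"] = tickets["queue_hours"] + tickets["idle_hours"]

# Rank teams (or priorities, categories, statuses) by total leakage.
by_team = (tickets.groupby("team")["leakage_hours"]
           .sum()
           .sort_values(ascending=False))
```

Swapping the grouping key between `team`, `priority`, `category`, and `status` yields each of the attribution views listed above.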
Runs scenario modeling such as:
- reducing queue time
- reducing idle time
- reducing handoffs
and estimates cost savings ranges.
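A scenario can be modeled by scaling each leakage component by its reduction factor and pricing the hours saved; the hourly rate and the low/base/high spread below are illustrative assumptions, not the project's actual parameters:

```python
HOURLY_COST = 45.0  # assumed blended cost per leaked hour

def simulate_savings(queue_h, idle_h, handoff_h,
                     queue_cut=0.20, idle_cut=0.15, handoff_cut=0.25):
    """Hours saved by each intervention, converted to a dollar range."""
    saved_h = queue_h * queue_cut + idle_h * idle_cut + handoff_h * handoff_cut
    base = saved_h * HOURLY_COST
    return {"low": base * 0.95, "base": base, "high": base * 1.05}

estimate = simulate_savings(queue_h=1000, idle_h=2000, handoff_h=500)
```

Reporting a range rather than a point estimate keeps the savings claim honest about uncertainty in the cost assumptions.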
Compares predicted vs observed improvements to measure simulation calibration accuracy.
Produces structured executive summaries from computed metrics with fact validation to prevent hallucinations.
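One simple way to implement such fact validation is to extract every numeric claim from the generated text and check it against the computed metrics; this is a hypothetical sketch, not the project's actual validator:

```python
import re

def validate_narrative(text, facts):
    """Return any dollar figure in the narrative that is not a computed fact."""
    claimed = {float(m.replace(",", "")) for m in re.findall(r"\$([\d,]+)", text)}
    allowed = set(facts.values())
    return sorted(f"${v:,.0f}" for v in claimed - allowed)

facts = {"base_savings": 929664.0}
assert validate_narrative("Estimated savings of $929,664 annually.", facts) == []
assert validate_narrative("We project $950,000 in savings.", facts) == ["$950,000"]
```

Rejecting or regenerating a summary whose figures fail this check is what keeps the LLM output grounded in the pipeline's numbers.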
Interactive interface for:
- overview metrics
- scoped team analysis
- intervention simulation
- executive summary generation
Output metric columns: `cycle_hours`, `queue_hours`, `idle_hours`, `handoff_count`, `time_in_*_hours`
- Total leakage hours by team
- Leakage distribution by priority
- Status-level bottlenecks
- Estimated annualized cost impact
Diagnosing operational inefficiencies in IT systems often requires manual data analysis.
Typical manual workflow:
- Export ticket and status data
- Clean timestamps
- Build spreadsheet pivots
- Identify bottlenecks
- Estimate cost impact
This typically requires 2–3 hours per week for a mid-sized operations team.
Using the Time Leak Detector pipeline:
Dataset size:
- 2,000 tickets
- 13,022 status events

Full pipeline runtime: ~4 seconds
Operational insights that previously required hours of manual analysis can be produced in seconds.
Using 2,000 simulated tickets:
- P3 tickets accounted for the majority of total leakage hours
- Tier1 handled the largest ticket volume and accumulated the highest leakage
- `IN_PROGRESS` and `TRIAGE` represented the largest share of lifecycle time
Scenario:
- Reduce queue delays by 20%
- Reduce idle time by 15%
- Reduce handoffs by 25%
Estimated savings:
| Scenario | Estimated Savings |
|---|---|
| Low | $924,849 |
| Base | $929,664 |
| High | $934,426 |
Simulation calibration:
- Predicted savings: $929,664
- Observed savings: $930,998
- Prediction error: −0.14%
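The calibration check above is just the relative error between predicted and observed savings:

```python
predicted = 929_664
observed = 930_998

# Negative error means the simulator slightly under-predicts observed savings.
error_pct = (predicted - observed) / observed * 100
```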
In this simulated dataset of 2,000 tickets, the largest leakage was concentrated in Tier1, which handles high-volume P3 requests.
A recurring pattern shows tickets spending extended time in:
TRIAGE → IN_PROGRESS
Additional delays were introduced through:
- Tier1 → Tier2 handoffs
- waiting states between transitions
Using the simulator, we modeled an operational improvement plan:
- reduce queue delays by 20%
- reduce idle time by 15%
- reduce handoffs by 25%
The model estimates $924K–$934K in annualized savings.
The feedback loop shows stable prediction accuracy, with an error of ≈ −0.14% versus observed results.
- Python
- Pandas
- NumPy
- Streamlit
- Pandera / Pydantic (schema validation)
- OpenAI API (structured narrative generation)
```bash
# Create a virtual environment and install dependencies
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

# Generate the synthetic dataset
python data/synthetic_generator.py

# Run the pipeline
./run.sh

# Launch the Streamlit dashboard
PYTHONPATH=. streamlit run app/streamlit_app.py

# Required for LLM-assisted narrative generation
export OPENAI_API_KEY=your_key_here
```
• High-volume P3 tickets handled by Tier1 generated the largest share of total operational leakage.
• The workflow transition TRIAGE → IN_PROGRESS accounted for the largest average queue delay, indicating that tickets spend significant time waiting before active work begins.
• Tickets involving cross-team handoffs (Tier1 → Tier2) showed significantly longer cycle times, suggesting coordination overhead between operational teams.
These patterns suggest that operational improvements should focus on:
- Faster first-touch response times in TRIAGE
- Reducing unnecessary handoffs between teams
- Improving prioritization of high-volume P3 requests
This project demonstrates:
- operational analytics design
- time-based workflow diagnostics
- cost impact modeling
- scenario-based intervention analysis
- controlled LLM integration with validation
- end-to-end analytics product thinking
It reflects a business analyst's approach to diagnosing inefficiency, quantifying impact, and communicating findings in an executive-ready format.
