This project analyzes hospital operations to understand resource utilization and identify operational bottlenecks. Using Python for data cleaning, SQL for analysis, and Power BI for visualization, raw hospital data is transformed into actionable insights.
Hospitals manage a large number of patients daily. Inefficient use of resources such as beds, staff, and time can lead to overcrowding, longer stays, and poor service.
Key Question:
How efficiently is a hospital utilizing its resources, and where are operational bottlenecks occurring?
- Understand patient admissions and treatments
- Analyze patient length of stay
- Identify inefficiencies in hospital operations
- Present insights in clear visual dashboards
This project uses two hospital datasets:
| Dataset | Key Columns / Information |
|---|---|
| hospital_admissions.csv | Name, Age, Gender, Blood Type, Medical Condition, Admission Date, Doctor, Hospital, Discharge Date, and other relevant admission details |
| patient_stays.csv | Patient ID, Name, Age, Arrival Date, Departure Date, Service, Satisfaction, and other stay-related details |
Folder Structure for Datasets:
- Raw datasets: `raw_data/`
- Cleaned datasets: `cleaned_data/`
These datasets are used together to analyze hospital operations and patient stay patterns, and to identify operational bottlenecks.
- Analyze the problem statement
- Identify key questions and metrics (admissions, stay duration, efficiency)
- Raw datasets stored in `raw_data/`
- Data may contain missing values, duplicates, and formatting issues
- Cleaning tasks include:
- Handling missing values
- Correcting data types
- Removing duplicates
- Standardizing column names
- Jupyter Notebooks: `01_patient_stays_cleaning.ipynb`, `02_hospital_admissions_cleaning.ipynb`
- Cleaned datasets saved in `cleaned_data/`
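The cleaning tasks listed above can be sketched in Pandas roughly as follows. This is a minimal illustration, not the code from the notebooks: the inline sample rows are made up, the `clean_stays` helper is hypothetical, and the choices of median imputation for age and dropping rows with unparseable dates are assumptions.

```python
import io

import pandas as pd

# Made-up sample standing in for raw_data/patient_stays.csv.
RAW_CSV = """Patient ID,Name,Age,Arrival Date,Departure Date,Service,Satisfaction
P001,Ann Lee,34,2024-01-03,2024-01-07,Cardiology,4
P001,Ann Lee,34,2024-01-03,2024-01-07,Cardiology,4
P002,Raj Patel,,2024-01-05,2024-01-06,Emergency,
P003,Mia Chen,61,2024-01-08,not recorded,Surgery,5
"""


def clean_stays(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper applying the four cleaning tasks in order."""
    # Standardize column names: "Arrival Date" -> "arrival_date".
    df = df.rename(columns=lambda c: c.strip().lower().replace(" ", "_"))

    # Remove exact duplicate rows.
    df = df.drop_duplicates().copy()

    # Correct data types; values that cannot be parsed become NaT/NaN.
    for col in ("arrival_date", "departure_date"):
        df[col] = pd.to_datetime(df[col], errors="coerce")
    df["age"] = pd.to_numeric(df["age"], errors="coerce")

    # Handle missing values (assumed policy): fill age with the median,
    # drop rows lacking a usable arrival or departure date.
    # Missing satisfaction scores are left as NaN here.
    df["age"] = df["age"].fillna(df["age"].median())
    df = df.dropna(subset=["arrival_date", "departure_date"])

    return df.reset_index(drop=True)


cleaned = clean_stays(pd.read_csv(io.StringIO(RAW_CSV)))
print(cleaned)
```

In a real run, the input would come from `pd.read_csv` on a file in `raw_data/` and the result would be written to `cleaned_data/` with `DataFrame.to_csv`.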
- SQL queries analyze operational patterns such as:
- Patient length of stay
- Admission trends
- High operational load areas
- SQL file: `hospital_operations_analysis.sql`
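As an illustration of the kind of length-of-stay query described above, the sketch below runs one against an in-memory SQLite database from Python. The SQL engine used by the project is not stated, so the SQLite dialect (including `julianday`), the table layout, the standardized column names, and the sample rows are all assumptions rather than the contents of `hospital_operations_analysis.sql`.

```python
import sqlite3

# In-memory database with a made-up slice of the cleaned patient stays data.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE patient_stays (
        patient_id     TEXT,
        service        TEXT,
        arrival_date   TEXT,  -- ISO dates stored as text
        departure_date TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO patient_stays VALUES (?, ?, ?, ?)",
    [
        ("P001", "Cardiology", "2024-01-03", "2024-01-07"),
        ("P002", "Emergency", "2024-01-05", "2024-01-06"),
        ("P003", "Cardiology", "2024-01-10", "2024-01-16"),
        ("P004", "Emergency", "2024-01-11", "2024-01-14"),
    ],
)

# Average length of stay per service, longest first. Services with long
# average stays and many stays are candidates for high operational load.
QUERY = """
SELECT service,
       COUNT(*) AS stays,
       AVG(julianday(departure_date) - julianday(arrival_date)) AS avg_stay_days
FROM patient_stays
GROUP BY service
ORDER BY avg_stay_days DESC
"""

rows = conn.execute(QUERY).fetchall()
for service, stays, avg_stay_days in rows:
    print(f"{service}: {stays} stays, {avg_stay_days:.1f} days on average")
```

Other engines express the date difference differently (for example `DATEDIFF` in MySQL or SQL Server), so the expression inside `AVG` would need adapting.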
- Dashboards visualize:
- Admission trends
- Length of stay patterns
- Operational bottlenecks
- Power BI dashboard exported as PDF (stored in `reports/`)
- Dashboard screenshot stored in `assets/dashboard.png` and displayed below:
- Combine SQL results and dashboard visuals
- Identify inefficiencies and document actionable insights
hospital-operations-efficiency-analysis/
│
├── raw_data/ # Original hospital datasets (hospital_admissions.csv, patient_stays.csv)
├── cleaned_data/ # Cleaned datasets
├── reports/ # Power BI dashboard PDF
├── assets/ # Screenshots of dashboards
├── sql/ # SQL queries (hospital_operations_analysis.sql)
│
├── 01_patient_stays_cleaning.ipynb
├── 02_hospital_admissions_cleaning.ipynb
├── problem_statement.txt
├── README.md
└── LICENSE
- Python (Pandas, NumPy) – Data cleaning
- SQL – Data querying and analysis
- Power BI – Dashboard visualization
- Jupyter Notebook – Exploratory data analysis
- Git & GitHub – Version control
- Patterns in patient admissions identified
- Hospital stay durations analyzed
- Operational bottlenecks highlighted
- Clear dashboards built for insights
- Clone the repository
- Open the Jupyter Notebooks prefixed `01_` and `02_` to review data cleaning
- Review SQL queries in the `sql/` folder
- View the Power BI dashboard PDF in `reports/` or the screenshot in `assets/`
This project demonstrates how data analytics can improve hospital operations, reduce bottlenecks, and support informed decision-making using Python, SQL, and Power BI.
