Quantum vs Classical SVM Credit Risk Classification: Empirical Benchmark Study

Academic Context

  • Course: Business Intelligence II, 6th Semester
  • Institution: IU International University of Applied Sciences
  • Supervisor: Dr. Stefan Nisch
  • Student: Gregor Kobilarov
  • Dataset: German Credit Risk Dataset (OpenML, n=1,000)
  • Primary Contribution: Production-ready QML benchmark with modern tooling (pixi) comparing quantum and classical SVM performance on structured financial data

Research Question

"To what extent can Quantum Machine Learning (QML) approaches, specifically Quantum Support Vector Machines (QSVM), deliver comparable or better classification results on structured financial data than classical methods today?"

Hypothesis

Null Hypotheses (H0)

H0₁ (Performance): There is no significant difference in classification performance (F1-score) between Quantum SVM and Classical SVM on the German Credit Risk dataset.

  • Formally: μ_F1(QSVM) = μ_F1(Classical SVM)

H0₂ (Computational Efficiency): Quantum SVM requires equal or less computational time compared to Classical SVM for training and prediction.

  • Formally: T_total(QSVM) ≤ T_total(Classical SVM)

Alternative Hypotheses (H1)

H1₁ (Performance): Quantum SVM achieves significantly different classification performance compared to Classical SVM.

  • Formally: μ_F1(QSVM) ≠ μ_F1(Classical SVM)

H1₂ (Computational Efficiency): Quantum SVM requires significantly more computational time than Classical SVM due to quantum state simulation overhead.

  • Formally: T_total(QSVM) > T_total(Classical SVM)

Expected Outcome

QSVM is expected to achieve similar accuracy in high-dimensional quantum feature spaces, but to require exponentially more computational time when simulated classically, since state-vector simulation scales as O(2^n) in the number of qubits.
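The 2^n scaling can be made concrete with a short sketch: an n-qubit state vector holds 2^n complex amplitudes, so memory (and work per circuit) grows exponentially with the qubit count.

```python
# Why classical simulation of n qubits scales as 2^n: a state vector over
# n qubits stores 2**n complex amplitudes (complex128 = 16 bytes each).

def statevector_size(n_qubits: int, bytes_per_amplitude: int = 16) -> tuple[int, int]:
    """Return (number of amplitudes, memory in bytes) for an n-qubit state."""
    amplitudes = 2 ** n_qubits
    return amplitudes, amplitudes * bytes_per_amplitude

for n in (4, 8, 20):
    amps, mem = statevector_size(n)
    print(f"{n:2d} qubits -> {amps:>9,d} amplitudes, {mem / 1024:10.1f} KiB")
```

At 4 qubits (this project's default) the state is trivially small; at 20 qubits it already holds over a million amplitudes, which is why the simulation cost, not the dataset size, dominates the benchmark.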

Dataset Characteristics

German Credit Risk Dataset

  • Source: OpenML (credit-g, dataset version 1)
  • Samples: 1,000 credit applications
  • Features: 20 attributes (7 numeric, 13 categorical)
  • Target: Binary classification (Good Credit: 700, Bad Credit: 300)
  • Task: Predict creditworthiness based on applicant attributes

Project Architecture

qml-credit-risk-benchmark/
├── src/
│   ├── __init__.py
│   ├── data_loader.py          # Data loading from OpenML/CSV
│   ├── preprocessing.py        # Cleaning, encoding, scaling, PCA
│   ├── classical_svm.py        # Classical SVM implementation
│   └── quantum_svm.py          # QSVM implementation
├── data/
│   ├── raw/                    # Raw data files
│   └── processed/              # Preprocessed data
├── models/                     # Saved models and preprocessors
├── results/                    # Plots and result files
├── notebooks/                  # Jupyter notebooks for exploration
├── main.py                     # Main execution script
├── pixi.toml                   # Pixi dependency configuration
├── pixi.lock                   # Locked dependency versions
└── README.md

Key Features

Modular Design

  • Data Loader: Fetches data from OpenML or loads from CSV
  • Preprocessor: Handles missing values, encoding, scaling, and PCA
  • Classical SVM: Scikit-learn based with multiple kernel options
  • Quantum SVM: Qiskit-based quantum kernel with caching support

Critical Pre-processing Pipeline

  1. Missing Value Handling

    • Numeric: Median imputation
    • Categorical: Mode imputation
  2. Categorical Encoding

    • One-hot encoding with drop_first=True
  3. Feature Scaling

    • StandardScaler (critical for SVM performance)
  4. Dimensionality Reduction (PCA)

    • Reduces features to match available qubits
    • Default: 4 components (4-qubit QSVM)
    • Configurable: 2-20 components
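The four steps above can be sketched as a scikit-learn pipeline. This is an illustrative stand-in for `src/preprocessing.py`, not the project's actual code; the toy columns (`duration`, `purpose`) are hypothetical examples of the dataset's numeric and categorical attributes.

```python
# Sketch of the preprocessing pipeline: imputation -> one-hot encoding
# (drop_first) -> standard scaling -> PCA down to the qubit budget.
# Illustrative only; not the project's actual src/preprocessing.py.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame: one numeric and one categorical column, both with missing values
X = pd.DataFrame({
    "duration": [6.0, 48.0, np.nan, 12.0, 24.0, 36.0],
    "purpose": ["radio/tv", "education", "car", None, "car", "education"],
})

numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                    ("scale", StandardScaler())])
categorical = Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                        ("onehot", OneHotEncoder(drop="first"))])

pipeline = Pipeline([
    ("columns", ColumnTransformer(
        [("num", numeric, ["duration"]), ("cat", categorical, ["purpose"])],
        sparse_threshold=0.0)),          # force dense output for PCA
    ("pca", PCA(n_components=2)),        # reduce to the available qubits
])

X_reduced = pipeline.fit_transform(X)
print(X_reduced.shape)  # (6, 2): one row per sample, one column per component
```

The same `n_components` value then sets the qubit count of the quantum feature map.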

Why PCA is Critical:

  • Quantum simulators are limited by qubit count
  • Each feature requires one qubit in the quantum feature map
  • PCA preserves maximum variance while reducing dimensions

Installation

Prerequisites

  • pixi package manager (recommended)
  • OR Python 3.11+ with pip (alternative)

Setup with Pixi (Recommended)

# Install pixi if not already installed
curl -fsSL https://pixi.sh/install.sh | bash

# Clone the repository
git clone <repository-url>
cd qml-credit-risk-benchmark

# Install all dependencies automatically
pixi install

# Run commands using pixi
pixi run python main.py --mode classical

Why pixi? Pixi provides reproducible dependency management, cross-platform compatibility, and automatic environment handling without manual virtual environment setup.

Alternative Setup (pip)

# Clone the repository
git clone <repository-url>
cd qml-credit-risk-benchmark

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies manually
pip install scikit-learn qiskit qiskit-machine-learning pandas numpy matplotlib seaborn

Usage

Quick Start

# Run classical SVM with default settings (4 PCA components)
pixi run python main.py --mode classical

# Run quantum SVM with 4 qubits (full dataset)
pixi run python main.py --mode quantum --n-components 4

# Compare classical vs quantum (full analysis)
pixi run python main.py --mode compare --n-components 4

# Compare different classical kernel types
pixi run python main.py --mode classical --compare-kernels

Scalability Testing (Subset Mode)

For testing with higher qubit counts where full dataset simulation is infeasible:

# Test 8-qubit quantum circuit with reduced dataset
pixi run python main.py --mode quantum --n-components 8 --subset-size 200

# Compare classical vs quantum with subset (stratified sampling)
pixi run python main.py --mode compare --n-components 8 --subset-size 250

The --subset-size parameter draws a stratified subsample that preserves the class distribution. This is useful for proof-of-concept experiments with higher-dimensional quantum circuits that would otherwise be computationally infeasible on consumer hardware.
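Stratified subsampling as described above can be sketched with scikit-learn's `train_test_split` (an illustration of the idea, not the project's actual implementation):

```python
# Draw a fixed-size subset while keeping the 70/30 good/bad class ratio
# intact, as the --subset-size option is described. Illustrative only.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))              # placeholder feature matrix
y = np.array([1] * 700 + [0] * 300)          # 70% good, 30% bad credit

subset_size = 200
X_sub, _, y_sub, _ = train_test_split(
    X, y, train_size=subset_size, stratify=y, random_state=42)

print(len(y_sub), y_sub.mean())  # 200 samples, class-1 fraction ~= 0.70
```

Because `stratify=y` splits each class proportionally, the 200-sample subset keeps exactly the original 70/30 balance.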

Interactive Exploration (Jupyter Notebook)

For interactive data exploration and classical SVM experimentation:

# Launch Jupyter notebook with pixi
pixi run jupyter notebook notebooks/01_classical_svm_exploration.ipynb

# Alternative: open directly in VS Code with the Jupyter extension
code notebooks/01_classical_svm_exploration.ipynb

What the notebook provides:

  • Interactive data visualization and PCA analysis
  • Kernel comparison experiments (RBF, linear, poly)
  • Hyperparameter tuning (C values, component counts)
  • Step-by-step walkthrough of the preprocessing pipeline
  • Real-time plotting of confusion matrices, ROC curves, and performance metrics

When to use it:

  • Exploring the dataset characteristics before running experiments
  • Testing different preprocessing configurations interactively
  • Understanding how PCA component selection affects model performance
  • Experimenting with classical SVM kernels without waiting for full pipeline runs

Advanced Usage

Using Individual Modules

Data Loading:

from src.data_loader import load_credit_data

# Load from OpenML
X, y = load_credit_data("openml")

# Load from CSV
X, y = load_credit_data("path/to/data.csv")

Preprocessing:

from src.preprocessing import CreditDataPreprocessor

preprocessor = CreditDataPreprocessor(n_components=4)
X_train, X_test, y_train, y_test = preprocessor.preprocess_data(X, y)

# Save preprocessor for later use
preprocessor.save_preprocessor("models/preprocessor.pkl")

Classical SVM:

from src.classical_svm import ClassicalSVM

# Train model
svm = ClassicalSVM(kernel='rbf', C=1.0)
svm.train(X_train, y_train)

# Evaluate
metrics = svm.evaluate(X_test, y_test)

# Generate visualizations
svm.plot_confusion_matrix(X_test, y_test)
svm.plot_roc_curve(X_test, y_test)

# Save model
svm.save_model("models/classical_svm.pkl")

Glossary for Beginners

If you're new to machine learning or quantum computing, here are the key terms explained:

Machine Learning Concepts

Classification

  • Task of predicting which category something belongs to (e.g., "good credit" vs "bad credit")
  • The model learns patterns from labeled examples (training data) and applies them to new cases

Support Vector Machine (SVM)

  • A classification algorithm that finds the best boundary (hyperplane) to separate different categories
  • Works by maximizing the margin (distance) between the boundary and the nearest data points from each class
  • Can handle non-linear patterns using "kernel tricks"

Kernel

  • A mathematical function that transforms data into a higher-dimensional space
  • Allows SVMs to find complex, non-linear decision boundaries
  • Common kernels: Linear (straight line), RBF (curved boundary), Polynomial (curved with specific shape)

Training vs Testing

  • Training data: Examples the model learns from (80% of dataset in this project)
  • Testing data: Examples used to evaluate the model's performance on unseen data (20% of dataset)
  • This split ensures the model can generalize, not just memorize

Feature

  • An individual measurable property used for prediction (e.g., age, income, loan amount)
  • Original dataset has 20 features; we reduce to 4 using PCA for quantum compatibility

Principal Component Analysis (PCA)

  • A technique to reduce the number of features while keeping the most important information
  • Combines correlated features into fewer "principal components"
  • Example: Instead of tracking "height" and "weight" separately, create a single "size" component

Performance Metrics Explained

Confusion Matrix Terms:

  • True Positive (TP): Correctly predicted "good credit"
  • True Negative (TN): Correctly predicted "bad credit"
  • False Positive (FP): Predicted "good" but actually "bad" (approved a risky loan)
  • False Negative (FN): Predicted "bad" but actually "good" (rejected a safe loan)

Accuracy

  • Formula: (TP + TN) / Total predictions
  • What it means: Percentage of all predictions that were correct
  • Limitation: Can be misleading with imbalanced datasets (e.g., if 90% are "good credit", predicting "good" for everything gives 90% accuracy)

Precision

  • Formula: TP / (TP + FP)
  • What it means: Of all loans we approved, what percentage were actually good?
  • High precision = Few false positives = Conservative lending (reject doubtful cases)

Recall

  • Formula: TP / (TP + FN)
  • What it means: Of all actual good credits, what percentage did we correctly identify?
  • High recall = Few false negatives = Aggressive lending (approve most cases)

F1-Score

  • Formula: 2 × (Precision × Recall) / (Precision + Recall)
  • What it means: Balanced metric that considers both precision and recall
  • Useful when you care equally about false positives and false negatives
  • Range: 0 (worst) to 1 (perfect)
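The four formulas above can be checked with a worked example. The confusion-matrix counts below are made up for illustration, not this project's actual results:

```python
# Worked example of the metric formulas, using a hypothetical confusion
# matrix: 50 TP, 15 TN, 25 FP, 10 FN (100 predictions total).
tp, tn, fp, fn = 50, 15, 25, 10

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 65 / 100 = 0.65
precision = tp / (tp + fp)                           # 50 / 75
recall = tp / (tp + fn)                              # 50 / 60
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note how F1 (≈0.74) sits between precision (≈0.67) and recall (≈0.83), penalizing whichever of the two is lower.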

ROC AUC (Area Under Curve)

  • Measures the model's ability to distinguish between classes across all threshold settings
  • Range: 0.5 (random guessing) to 1.0 (perfect classification)
  • Higher is better

Quantum Computing Concepts

Qubit

  • The quantum equivalent of a classical bit
  • Unlike classical bits (0 or 1), qubits can be in superposition (both 0 and 1 simultaneously)
  • This allows quantum computers to explore multiple possibilities at once

Quantum Circuit

  • A sequence of quantum operations (gates) applied to qubits
  • Analogous to a classical computer program but for quantum hardware
  • In this project, circuits encode credit risk data into quantum states

Quantum Feature Map

  • Encodes classical data (credit features) into quantum states
  • Creates a high-dimensional quantum representation of the data
  • Allows quantum algorithms to find patterns classical algorithms might miss

Quantum Kernel

  • Measures similarity between data points in quantum feature space
  • Computed by running quantum circuits and measuring overlap between quantum states
  • Replaces classical kernel computation in quantum SVM
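The idea of a quantum kernel as state overlap can be shown with a toy single-qubit example. Assumption: a one-qubit RY-angle feature map, far simpler than the multi-qubit feature map used in this project; it is only meant to make the squared-overlap definition k(x, y) = |⟨φ(x)|φ(y)⟩|² concrete.

```python
# Toy quantum kernel: encode a scalar as a 1-qubit state via an RY rotation,
# then measure similarity as the squared overlap of the two states.
import numpy as np

def feature_map(x: float) -> np.ndarray:
    """RY(x)|0> = [cos(x/2), sin(x/2)]: a single-qubit state for scalar x."""
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def quantum_kernel(x: float, y: float) -> float:
    """Squared fidelity |<phi(x)|phi(y)>|^2 between the encoded states."""
    return abs(np.vdot(feature_map(x), feature_map(y))) ** 2

print(quantum_kernel(0.3, 0.3))    # 1.0: identical states overlap perfectly
print(quantum_kernel(0.0, np.pi))  # 0.0: orthogonal states, no overlap
```

On real hardware or a simulator this overlap is estimated by running circuits, which is exactly the step that makes the QSVM's kernel matrix so expensive to compute.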

Quantum Simulation

  • Running quantum algorithms on classical computers by explicitly tracking all quantum states
  • Exponentially expensive: 4 qubits = 16 states, 8 qubits = 256 states, 20 qubits = 1 million states
  • Why real quantum hardware is needed for practical applications

Hilbert Space

  • The mathematical space where quantum states exist
  • Exponentially large compared to classical state space
  • Quantum advantage comes from exploring this massive space efficiently

This Project's Approach

Classical SVM: Uses traditional RBF kernel on 4 PCA-reduced features

  • Fast (0.04 seconds training)
  • Well-understood and proven
  • Good baseline performance

Quantum SVM: Uses quantum kernel with 4-qubit quantum circuits

  • Slow in simulation (396 seconds training)
  • Explores quantum feature space
  • Marginal performance improvement in this experiment

The Comparison: Tests whether quantum provides practical advantages for credit risk classification on current (simulated) quantum hardware.

Evaluation Metrics

The project tracks the following metrics for comparison:

| Metric | Description | Importance |
|---|---|---|
| Accuracy | Overall correctness | Primary metric |
| Precision | Positive predictive value | Important for credit risk |
| Recall | True positive rate | Critical for identifying good credits |
| F1-Score | Harmonic mean of precision/recall | Balanced performance |
| ROC AUC | Area under ROC curve | Model discrimination ability |
| Training Time | Time to fit model | Computational cost |
| Prediction Time | Time for inference | Deployment feasibility |

Experimental Results

Performance Comparison

| Metric | Classical SVM | Quantum SVM | Winner |
|---|---|---|---|
| Accuracy | 70.00% | 70.50% | Quantum (+0.5%) |
| Precision | 75.00% | 70.77% | Classical |
| Recall | 85.71% | 98.57% | Quantum |
| F1-Score | 80.00% | 82.39% | Quantum (+2.4%) |

Computational Efficiency

| Operation | Classical SVM | Quantum SVM | Speedup |
|---|---|---|---|
| Training | 0.041s | 385.95s | Classical 9,413x faster |
| Prediction | 0.003s | 252.97s | Classical 81,603x faster |
| Total Time | 0.044s | 638.92s | Classical 14,498x faster |

Methodology Note: Quantum timing results reflect first-run performance without kernel caching. The quantum implementation includes a caching mechanism for kernel matrices (stored in data/processed/), which can speed up repeated experiments with identical parameters. However, all reported benchmarks use fresh kernel computation to ensure fair comparison with classical methods and represent realistic first-run performance.

Key Findings

Hypothesis Testing Results:

  • H0₁ (Performance): REJECTED - Quantum achieves marginally better F1-score (0.8239 vs 0.8000, +2.99% improvement), though difference is small and may not be statistically significant without repeated trials
  • H0₂ (Computational Efficiency): REJECTED - Quantum is 14,498x slower (638.92s vs 0.044s), strongly supporting H1₂
  • Overall: Expected outcome validated - similar accuracy (~0.5% difference) but exponentially higher computational cost

Detailed Results:

  • Performance: Quantum achieves marginally better F1-score (2.99% improvement)
  • Accuracy: Near-identical performance validates hypothesis (~0.5% difference)
  • Computational Cost: Quantum is 14,498x slower due to simulation overhead
  • Practical Conclusion: Quantum simulation provides no practical advantage for production use

Trade-offs:

  • Quantum: Exceptional recall (98.57%) - catches almost all good credits but with more false positives
  • Classical: Higher precision (75.00%) - more conservative, fewer false positives

Scalability Analysis

8-Qubit Limitation (Exponential Barrier):

Attempts to scale to 8 qubits revealed fundamental computational limits of classical quantum simulation:

  • State Vector Complexity: 2^8 = 256 complex amplitudes per quantum state
  • Kernel Matrix Computation: 800×800 = 640,000 quantum circuit simulations required
  • Resource Exhaustion: System freeze after >60 minutes on consumer hardware (Intel i5, 32GB RAM)
  • Subset Requirement: Even with stratified subsampling (n=200, reducing to 25,600 simulations), runtime exceeded feasibility threshold

Scientific Implication:

This empirical barrier confirms the exponential scaling problem of classical quantum simulation and demonstrates why real quantum hardware is necessary for practical QML applications beyond proof-of-concept demonstrations. The --subset-size parameter was implemented to enable controlled experiments, but the exponential growth of the state space limits classical simulation regardless of engineering optimizations.
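The circuit counts quoted above follow from simple arithmetic: a quantum kernel over m training samples needs an m×m Gram matrix. With the project's 80/20 split, the full dataset gives m = 800 and a subset of 200 gives m = 160 (the sketch below ignores symmetry savings, matching the figures in this section):

```python
# Back-of-the-envelope circuit counts for the quantum kernel matrix:
# m training samples -> m x m kernel entries, one circuit evaluation each.
def kernel_evaluations(n_samples: int, train_fraction: float = 0.8) -> int:
    m = int(n_samples * train_fraction)
    return m * m

print(kernel_evaluations(1000))  # 800 x 800 = 640,000 (full dataset)
print(kernel_evaluations(200))   # 160 x 160 = 25,600 (subset of 200)
```

Even the 25x reduction from subsampling leaves tens of thousands of circuit simulations, each of which grows as 2^n in the qubit count.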

Visualizations

Comprehensive Comparison Summary

Comparison Summary

The comprehensive comparison includes:

  • Performance metrics bar chart
  • Computational efficiency comparison (log scale)
  • Performance heatmap
  • Summary analysis for Business Intelligence II project

ROC Curve Comparison

ROC Curve Comparison

The ROC (Receiver Operating Characteristic) curve comparison shows:

  • Classical SVM: Better AUC (Area Under Curve) indicating superior discrimination ability
  • Quantum SVM: Lower AUC due to extreme recall-precision trade-off
  • Both curves significantly outperform the random classifier baseline

Precision-Recall Curve Comparison

Precision-Recall Curve

The Precision-Recall curve reveals different operating characteristics:

  • Classical SVM: More balanced precision-recall trade-off
  • Quantum SVM: Higher recall at the cost of precision
  • Baseline shows the no-skill classifier (class proportion)

Deep Error Analysis

Error Analysis

Error Pattern Analysis reveals critical business insights:

  • 90% reduction in false negatives (20 → 2 bad credits approved)
  • Hypothetical business impact: Using industry-typical assumptions (€10k avg loan, 80% default loss rate, 5% opportunity cost), this error reduction translates to ~€135k cost savings per 200 applications
  • Trade-off: 43% increase in false positives (more conservative lending)
  • Risk profile comparison: Quantum optimizes for recall (minimizing missed good credits), Classical balances precision/recall

This visualization demonstrates that the quantum SVM's value is not merely its marginal F1 gain: it has a fundamentally different error profile, which may suit risk-averse institutions.

Note: Business impact figures are illustrative examples using representative industry parameters, not actual financial data from the dataset.

Technical Notes

PCA Component Selection

| Components | Explained Variance | Use Case |
|---|---|---|
| 2 | ~40-50% | Minimal quantum circuit |
| 4 | ~60-70% | Balanced (recommended) |
| 8 | ~80-90% | Maximum information retention |
| 16+ | ~95%+ | Near-original performance |
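How such explained-variance figures are obtained can be sketched with a numpy-only PCA via the covariance eigen-spectrum. The matrix below is a random toy stand-in for the preprocessed credit data, so its percentages will differ from the approximate table values above:

```python
# Cumulative explained variance per component count, computed from the
# eigenvalues of the covariance matrix (numpy-only PCA sketch; toy data).
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for a 20-feature preprocessed matrix with correlated columns
X = rng.normal(size=(1000, 20)) @ rng.normal(size=(20, 20))

X_centered = X - X.mean(axis=0)
eigvals = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]
cumulative = np.cumsum(eigvals) / eigvals.sum()

for k in (2, 4, 8, 16):
    print(f"{k:2d} components -> {cumulative[k - 1]:.1%} variance explained")
```

In the project itself, `sklearn.decomposition.PCA` exposes the same quantity via its `explained_variance_ratio_` attribute.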

Kernel Comparison

Linear Kernel:

  • Fast, interpretable
  • Good for linearly separable data
  • Lower computational cost

RBF Kernel:

  • Most flexible
  • Good default choice
  • Handles non-linear patterns

Polynomial Kernel:

  • Captures specific feature interactions
  • Can overfit with high degree

Quantum Kernel:

  • Uses quantum feature map
  • Explores exponentially large Hilbert space
  • Computationally expensive in simulation

Troubleshooting

Common Issues

Issue: ModuleNotFoundError: No module named 'sklearn' or similar dependency errors
Solution: Ensure you're using pixi (pixi install) or install dependencies manually with pip

Issue: Memory error during PCA
Solution: Reduce n_components or use incremental PCA

Issue: Poor model performance
Solution: Try different kernels with the --compare-kernels flag

Issue: Quantum implementation not working
Solution: Verify the Qiskit installation (pixi list | grep qiskit) or reinstall with pixi install

Issue: System freezes or becomes unresponsive with high qubit counts
Solution: Use the --subset-size parameter to reduce dataset size (e.g. --subset-size 200 for 8+ qubits). Note that classical quantum simulation has fundamental exponential scaling limits.

Development

Running Analysis Scripts

Generate thesis-ready analysis and visualizations:

# Run comprehensive analysis (confusion matrix, PCA, business impact)
pixi run python analysis.py

# Generate error analysis visualization
pixi run python create_error_analysis_plot.py

# Regenerate visualizations from cached metrics (without re-simulation)
pixi run python regenerate_visualizations.py

Output files:

  • results/thesis_summary_table.csv - Ready for thesis tables
  • results/confusion_matrix_comparison.csv - Detailed error breakdown
  • results/error_analysis_comprehensive.png - Publication-quality visualization
  • results/comparison_metrics.json - Cached metrics for visualizations

Note: The regenerate_visualizations.py script loads metrics from results/comparison_metrics.json and regenerates visualizations without re-running the time-consuming quantum simulation. Useful for adjusting plot aesthetics or text.

Running Tests

# Test individual modules
pixi run python src/data_loader.py
pixi run python src/preprocessing.py
pixi run python src/classical_svm.py

Code Style

  • Type hints for all function parameters
  • Docstrings in Google style
  • English comments
  • PEP 8 compliant

Author

Gregor Kobilarov

License

This project is licensed under the MIT License - see the LICENSE file for details.

This project was created for educational purposes as part of a university course.
