Quantum vs Classical SVM Credit Risk Classification: Empirical Benchmark Study

Academic Context

  • Course: Business Intelligence II, 6th Semester
  • Institution: IU International University of Applied Sciences
  • Supervisor: Dr. Stefan Nisch
  • Student: Gregor Kobilarov
  • Dataset: German Credit Risk Dataset (OpenML, n=1,000)
  • Primary Contribution: Production-ready QML benchmark with modern tooling (pixi) comparing quantum and classical SVM performance on structured financial data

Research Question

"To what extent can Quantum Machine Learning (QML) approaches, specifically Quantum Support Vector Machines (QSVM), deliver comparable or better classification results on structured financial data than classical methods today?"

Hypothesis

Null Hypotheses (H0)

H0₁ (Performance): There is no significant difference in classification performance (F1-score) between Quantum SVM and Classical SVM on the German Credit Risk dataset.

  • Formally: μ_F1(QSVM) = μ_F1(Classical SVM)

H0₂ (Computational Efficiency): Quantum SVM requires equal or less computational time compared to Classical SVM for training and prediction.

  • Formally: T_total(QSVM) ≤ T_total(Classical SVM)

Alternative Hypotheses (H1)

H1₁ (Performance): Quantum SVM achieves significantly different classification performance compared to Classical SVM.

  • Formally: μ_F1(QSVM) ≠ μ_F1(Classical SVM)

H1₂ (Computational Efficiency): Quantum SVM requires significantly more computational time than Classical SVM due to quantum state simulation overhead.

  • Formally: T_total(QSVM) > T_total(Classical SVM)

Expected Outcome

QSVM is expected to achieve similar accuracy in high-dimensional quantum feature spaces, but to require exponentially more computational time when simulated classically, since state-vector simulation scales as O(2^n) in the number of qubits.
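The 2^n scaling can be made concrete with a short sketch: an n-qubit state vector holds 2^n complex amplitudes, so memory (and work per circuit) grows exponentially with the qubit count.

```python
# Why classical simulation of n qubits scales as 2^n: a state vector over
# n qubits stores 2**n complex amplitudes (complex128 = 16 bytes each).

def statevector_size(n_qubits: int, bytes_per_amplitude: int = 16) -> tuple[int, int]:
    """Return (number of amplitudes, memory in bytes) for an n-qubit state."""
    amplitudes = 2 ** n_qubits
    return amplitudes, amplitudes * bytes_per_amplitude

for n in (4, 8, 20):
    amps, mem = statevector_size(n)
    print(f"{n:2d} qubits -> {amps:>9,d} amplitudes, {mem / 1024:10.1f} KiB")
```

At 4 qubits (this project's default) the state is trivially small; at 20 qubits it already holds over a million amplitudes, which is why the simulation cost, not the dataset size, dominates the benchmark.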

Dataset Characteristics

German Credit Risk Dataset

  • Source: OpenML (credit-g, dataset version 1)
  • Samples: 1,000 credit applications
  • Features: 20 attributes (7 numeric, 13 categorical)
  • Target: Binary classification (Good Credit: 700, Bad Credit: 300)
  • Task: Predict creditworthiness based on applicant attributes

Project Architecture

qml-credit-risk-benchmark/
├── src/
│   ├── __init__.py
│   ├── data_loader.py          # Data loading from OpenML/CSV
│   ├── preprocessing.py        # Cleaning, encoding, scaling, PCA
│   ├── classical_svm.py        # Classical SVM implementation
│   └── quantum_svm.py          # QSVM implementation
├── data/
│   ├── raw/                    # Raw data files
│   └── processed/              # Preprocessed data
├── models/                     # Saved models and preprocessors
├── results/                    # Plots and result files
├── notebooks/                  # Jupyter notebooks for exploration
├── main.py                     # Main execution script
├── pixi.toml                   # Pixi dependency configuration
├── pixi.lock                   # Locked dependency versions
└── README.md

Key Features

Modular Design

  • Data Loader: Fetches data from OpenML or loads from CSV
  • Preprocessor: Handles missing values, encoding, scaling, and PCA
  • Classical SVM: Scikit-learn based with multiple kernel options
  • Quantum SVM: Qiskit-based quantum kernel with caching support

Critical Pre-processing Pipeline

  1. Missing Value Handling

    • Numeric: Median imputation
    • Categorical: Mode imputation
  2. Categorical Encoding

    • One-hot encoding with drop_first=True
  3. Feature Scaling

    • StandardScaler (critical for SVM performance)
  4. Dimensionality Reduction (PCA)

    • Reduces features to match available qubits
    • Default: 4 components (4-qubit QSVM)
    • Configurable: 2-20 components
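The four steps above can be sketched as a scikit-learn pipeline. This is an illustrative stand-in for `src/preprocessing.py`, not the project's actual code; the toy columns (`duration`, `purpose`) are hypothetical examples of the dataset's numeric and categorical attributes.

```python
# Sketch of the preprocessing pipeline: imputation -> one-hot encoding
# (drop_first) -> standard scaling -> PCA down to the qubit budget.
# Illustrative only; not the project's actual src/preprocessing.py.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy frame: one numeric and one categorical column, both with missing values
X = pd.DataFrame({
    "duration": [6.0, 48.0, np.nan, 12.0, 24.0, 36.0],
    "purpose": ["radio/tv", "education", "car", None, "car", "education"],
})

numeric = Pipeline([("impute", SimpleImputer(strategy="median")),
                    ("scale", StandardScaler())])
categorical = Pipeline([("impute", SimpleImputer(strategy="most_frequent")),
                        ("onehot", OneHotEncoder(drop="first"))])

pipeline = Pipeline([
    ("columns", ColumnTransformer(
        [("num", numeric, ["duration"]), ("cat", categorical, ["purpose"])],
        sparse_threshold=0.0)),          # force dense output for PCA
    ("pca", PCA(n_components=2)),        # reduce to the available qubits
])

X_reduced = pipeline.fit_transform(X)
print(X_reduced.shape)  # (6, 2): one row per sample, one column per component
```

The same `n_components` value then sets the qubit count of the quantum feature map.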

Why PCA is Critical:

  • Quantum simulators are limited by qubit count
  • Each feature requires one qubit in the quantum feature map
  • PCA preserves maximum variance while reducing dimensions

Installation

Prerequisites

  • pixi package manager (recommended)
  • OR Python 3.11+ with pip (alternative)

Setup with Pixi (Recommended)

# Install pixi if not already installed
curl -fsSL https://pixi.sh/install.sh | bash

# Clone the repository
git clone <repository-url>
cd qml-credit-risk-benchmark

# Install all dependencies automatically
pixi install

# Run commands using pixi
pixi run python main.py --mode classical

Why pixi? Pixi provides reproducible dependency management, cross-platform compatibility, and automatic environment handling without manual virtual environment setup.

Alternative Setup (pip)

# Clone the repository
git clone <repository-url>
cd qml-credit-risk-benchmark

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies manually
pip install scikit-learn qiskit qiskit-machine-learning pandas numpy matplotlib seaborn

Usage

Quick Start

# Run classical SVM with default settings (4 PCA components)
pixi run python main.py --mode classical

# Run quantum SVM with 4 qubits (full dataset)
pixi run python main.py --mode quantum --n-components 4

# Compare classical vs quantum (full analysis)
pixi run python main.py --mode compare --n-components 4

# Compare different classical kernel types
pixi run python main.py --mode classical --compare-kernels

Scalability Testing (Subset Mode)

For testing with higher qubit counts where full dataset simulation is infeasible:

# Test 8-qubit quantum circuit with reduced dataset
pixi run python main.py --mode quantum --n-components 8 --subset-size 200

# Compare classical vs quantum with subset (stratified sampling)
pixi run python main.py --mode compare --n-components 8 --subset-size 250

The --subset-size parameter draws a stratified subsample that preserves the class distribution. This is useful for proof-of-concept experiments with higher-dimensional quantum circuits that would otherwise be computationally infeasible on consumer hardware.
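Stratified subsampling as described above can be sketched with scikit-learn's `train_test_split` (an illustration of the idea, not the project's actual implementation):

```python
# Draw a fixed-size subset while keeping the 70/30 good/bad class ratio
# intact, as the --subset-size option is described. Illustrative only.
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 20))              # placeholder feature matrix
y = np.array([1] * 700 + [0] * 300)          # 70% good, 30% bad credit

subset_size = 200
X_sub, _, y_sub, _ = train_test_split(
    X, y, train_size=subset_size, stratify=y, random_state=42)

print(len(y_sub), y_sub.mean())  # 200 samples, class-1 fraction ~= 0.70
```

Because `stratify=y` splits each class proportionally, the 200-sample subset keeps exactly the original 70/30 balance.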

Interactive Exploration (Jupyter Notebook)

For interactive data exploration and classical SVM experimentation:

# Launch Jupyter notebook with pixi
pixi run jupyter notebook notebooks/01_classical_svm_exploration.ipynb

# Alternative: open directly in VS Code with the Jupyter extension
code notebooks/01_classical_svm_exploration.ipynb

What the notebook provides:

  • Interactive data visualization and PCA analysis
  • Kernel comparison experiments (RBF, linear, poly)
  • Hyperparameter tuning (C values, component counts)
  • Step-by-step walkthrough of the preprocessing pipeline
  • Real-time plotting of confusion matrices, ROC curves, and performance metrics

When to use it:

  • Exploring the dataset characteristics before running experiments
  • Testing different preprocessing configurations interactively
  • Understanding how PCA component selection affects model performance
  • Experimenting with classical SVM kernels without waiting for full pipeline runs

Advanced Usage

Using Individual Modules

Data Loading:

from src.data_loader import load_credit_data

# Load from OpenML
X, y = load_credit_data("openml")

# Load from CSV
X, y = load_credit_data("path/to/data.csv")

Preprocessing:

from src.preprocessing import CreditDataPreprocessor

preprocessor = CreditDataPreprocessor(n_components=4)
X_train, X_test, y_train, y_test = preprocessor.preprocess_data(X, y)

# Save preprocessor for later use
preprocessor.save_preprocessor("models/preprocessor.pkl")

Classical SVM:

from src.classical_svm import ClassicalSVM

# Train model
svm = ClassicalSVM(kernel='rbf', C=1.0)
svm.train(X_train, y_train)

# Evaluate
metrics = svm.evaluate(X_test, y_test)

# Generate visualizations
svm.plot_confusion_matrix(X_test, y_test)
svm.plot_roc_curve(X_test, y_test)

# Save model
svm.save_model("models/classical_svm.pkl")

Glossary for Beginners

If you're new to machine learning or quantum computing, here are the key terms explained:

Machine Learning Concepts

Classification

  • Task of predicting which category something belongs to (e.g., "good credit" vs "bad credit")
  • The model learns patterns from labeled examples (training data) and applies them to new cases

Support Vector Machine (SVM)

  • A classification algorithm that finds the best boundary (hyperplane) to separate different categories
  • Works by maximizing the margin (distance) between the boundary and the nearest data points from each class
  • Can handle non-linear patterns using "kernel tricks"

Kernel

  • A mathematical function that transforms data into a higher-dimensional space
  • Allows SVMs to find complex, non-linear decision boundaries
  • Common kernels: Linear (straight line), RBF (curved boundary), Polynomial (curved with specific shape)

Training vs Testing

  • Training data: Examples the model learns from (80% of dataset in this project)
  • Testing data: Examples used to evaluate the model's performance on unseen data (20% of dataset)
  • This split ensures the model can generalize, not just memorize

Feature

  • An individual measurable property used for prediction (e.g., age, income, loan amount)
  • Original dataset has 20 features; we reduce to 4 using PCA for quantum compatibility

Principal Component Analysis (PCA)

  • A technique to reduce the number of features while keeping the most important information
  • Combines correlated features into fewer "principal components"
  • Example: Instead of tracking "height" and "weight" separately, create a single "size" component

Performance Metrics Explained

Confusion Matrix Terms:

  • True Positive (TP): Correctly predicted "good credit"
  • True Negative (TN): Correctly predicted "bad credit"
  • False Positive (FP): Predicted "good" but actually "bad" (approved a risky loan)
  • False Negative (FN): Predicted "bad" but actually "good" (rejected a safe loan)

Accuracy

  • Formula: (TP + TN) / Total predictions
  • What it means: Percentage of all predictions that were correct
  • Limitation: Can be misleading with imbalanced datasets (e.g., if 90% are "good credit", predicting "good" for everything gives 90% accuracy)

Precision

  • Formula: TP / (TP + FP)
  • What it means: Of all loans we approved, what percentage were actually good?
  • High precision = Few false positives = Conservative lending (reject doubtful cases)

Recall

  • Formula: TP / (TP + FN)
  • What it means: Of all actual good credits, what percentage did we correctly identify?
  • High recall = Few false negatives = Aggressive lending (approve most cases)

F1-Score

  • Formula: 2 × (Precision × Recall) / (Precision + Recall)
  • What it means: Balanced metric that considers both precision and recall
  • Useful when you care equally about false positives and false negatives
  • Range: 0 (worst) to 1 (perfect)
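The four formulas above can be checked with a worked example. The confusion-matrix counts below are made up for illustration, not this project's actual results:

```python
# Worked example of the metric formulas, using a hypothetical confusion
# matrix: 50 TP, 15 TN, 25 FP, 10 FN (100 predictions total).
tp, tn, fp, fn = 50, 15, 25, 10

accuracy = (tp + tn) / (tp + tn + fp + fn)          # 65 / 100 = 0.65
precision = tp / (tp + fp)                           # 50 / 75
recall = tp / (tp + fn)                              # 50 / 60
f1 = 2 * precision * recall / (precision + recall)   # harmonic mean

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

Note how F1 (≈0.74) sits between precision (≈0.67) and recall (≈0.83), penalizing whichever of the two is lower.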

ROC AUC (Area Under Curve)

  • Measures the model's ability to distinguish between classes across all threshold settings
  • Range: 0.5 (random guessing) to 1.0 (perfect classification)
  • Higher is better

Quantum Computing Concepts

Qubit

  • The quantum equivalent of a classical bit
  • Unlike classical bits (0 or 1), qubits can be in superposition (both 0 and 1 simultaneously)
  • This allows quantum computers to explore multiple possibilities at once

Quantum Circuit

  • A sequence of quantum operations (gates) applied to qubits
  • Analogous to a classical computer program but for quantum hardware
  • In this project, circuits encode credit risk data into quantum states

Quantum Feature Map

  • Encodes classical data (credit features) into quantum states
  • Creates a high-dimensional quantum representation of the data
  • Allows quantum algorithms to find patterns classical algorithms might miss

Quantum Kernel

  • Measures similarity between data points in quantum feature space
  • Computed by running quantum circuits and measuring overlap between quantum states
  • Replaces classical kernel computation in quantum SVM
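The idea of a quantum kernel as state overlap can be shown with a toy single-qubit example. Assumption: a one-qubit RY-angle feature map, far simpler than the multi-qubit feature map used in this project; it is only meant to make the squared-overlap definition k(x, y) = |⟨φ(x)|φ(y)⟩|² concrete.

```python
# Toy quantum kernel: encode a scalar as a 1-qubit state via an RY rotation,
# then measure similarity as the squared overlap of the two states.
import numpy as np

def feature_map(x: float) -> np.ndarray:
    """RY(x)|0> = [cos(x/2), sin(x/2)]: a single-qubit state for scalar x."""
    return np.array([np.cos(x / 2), np.sin(x / 2)])

def quantum_kernel(x: float, y: float) -> float:
    """Squared fidelity |<phi(x)|phi(y)>|^2 between the encoded states."""
    return abs(np.vdot(feature_map(x), feature_map(y))) ** 2

print(quantum_kernel(0.3, 0.3))    # 1.0: identical states overlap perfectly
print(quantum_kernel(0.0, np.pi))  # 0.0: orthogonal states, no overlap
```

On real hardware or a simulator this overlap is estimated by running circuits, which is exactly the step that makes the QSVM's kernel matrix so expensive to compute.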

Quantum Simulation

  • Running quantum algorithms on classical computers by explicitly tracking all quantum states
  • Exponentially expensive: 4 qubits = 16 states, 8 qubits = 256 states, 20 qubits = 1 million states
  • Why real quantum hardware is needed for practical applications

Hilbert Space

  • The mathematical space where quantum states exist
  • Exponentially large compared to classical state space
  • Quantum advantage comes from exploring this massive space efficiently

This Project's Approach

Classical SVM: Uses traditional RBF kernel on 4 PCA-reduced features

  • Fast (0.04 seconds training)
  • Well-understood and proven
  • Good baseline performance

Quantum SVM: Uses quantum kernel with 4-qubit quantum circuits

  • Slow in simulation (396 seconds training)
  • Explores quantum feature space
  • Marginal performance improvement in this experiment

The Comparison: Tests whether quantum provides practical advantages for credit risk classification on current (simulated) quantum hardware.

Evaluation Metrics

The project tracks the following metrics for comparison:

| Metric | Description | Importance |
|---|---|---|
| Accuracy | Overall correctness | Primary metric |
| Precision | Positive predictive value | Important for credit risk |
| Recall | True positive rate | Critical for identifying good credits |
| F1-Score | Harmonic mean of precision/recall | Balanced performance |
| ROC AUC | Area under ROC curve | Model discrimination ability |
| Training Time | Time to fit model | Computational cost |
| Prediction Time | Time for inference | Deployment feasibility |

Experimental Results

Performance Comparison

| Metric | Classical SVM | Quantum SVM | Winner |
|---|---|---|---|
| Accuracy | 70.00% | 70.50% | Quantum (+0.5%) |
| Precision | 75.00% | 70.77% | Classical |
| Recall | 85.71% | 98.57% | Quantum |
| F1-Score | 80.00% | 82.39% | Quantum (+2.4%) |

Computational Efficiency

| Operation | Classical SVM | Quantum SVM | Speedup |
|---|---|---|---|
| Training | 0.041s | 385.95s | Classical 9,413x faster |
| Prediction | 0.003s | 252.97s | Classical 81,603x faster |
| Total Time | 0.044s | 638.92s | Classical 14,498x faster |

Methodology Note: Quantum timing results reflect first-run performance without kernel caching. The quantum implementation includes a caching mechanism for kernel matrices (stored in data/processed/), which can speed up repeated experiments with identical parameters. However, all reported benchmarks use fresh kernel computation to ensure fair comparison with classical methods and represent realistic first-run performance.

Key Findings

Hypothesis Testing Results:

  • H0₁ (Performance): REJECTED - Quantum achieves marginally better F1-score (0.8239 vs 0.8000, +2.99% improvement), though difference is small and may not be statistically significant without repeated trials
  • H0₂ (Computational Efficiency): REJECTED - Quantum is 14,498x slower (638.92s vs 0.044s), strongly supporting H1₂
  • Overall: Expected outcome validated - similar accuracy (~0.5% difference) but exponentially higher computational cost

Detailed Results:

  • Performance: Quantum achieves marginally better F1-score (2.99% improvement)
  • Accuracy: Near-identical performance validates hypothesis (~0.5% difference)
  • Computational Cost: Quantum is 14,498x slower due to simulation overhead
  • Practical Conclusion: Quantum simulation provides no practical advantage for production use

Trade-offs:

  • Quantum: Exceptional recall (98.57%) - catches almost all good credits but with more false positives
  • Classical: Higher precision (75.00%) - more conservative, fewer false positives

Scalability Analysis

8-Qubit Limitation (Exponential Barrier):

Attempts to scale to 8 qubits revealed fundamental computational limits of classical quantum simulation:

  • State Vector Complexity: 2^8 = 256 complex amplitudes per quantum state
  • Kernel Matrix Computation: 800×800 = 640,000 quantum circuit simulations required
  • Resource Exhaustion: System freeze after >60 minutes on consumer hardware (Intel i5, 32GB RAM)
  • Subset Requirement: Even with stratified subsampling (n=200, reducing to 25,600 simulations), runtime exceeded feasibility threshold

Scientific Implication:

This empirical barrier confirms the exponential scaling problem of classical quantum simulation and demonstrates why real quantum hardware is necessary for practical QML applications beyond proof-of-concept demonstrations. The --subset-size parameter was implemented to enable controlled experiments, but the exponential growth of the state space limits classical simulation regardless of engineering optimizations.
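The circuit counts quoted above follow from simple arithmetic: a quantum kernel over m training samples needs an m×m Gram matrix. With the project's 80/20 split, the full dataset gives m = 800 and a subset of 200 gives m = 160 (the sketch below ignores symmetry savings, matching the figures in this section):

```python
# Back-of-the-envelope circuit counts for the quantum kernel matrix:
# m training samples -> m x m kernel entries, one circuit evaluation each.
def kernel_evaluations(n_samples: int, train_fraction: float = 0.8) -> int:
    m = int(n_samples * train_fraction)
    return m * m

print(kernel_evaluations(1000))  # 800 x 800 = 640,000 (full dataset)
print(kernel_evaluations(200))   # 160 x 160 = 25,600 (subset of 200)
```

Even the 25x reduction from subsampling leaves tens of thousands of circuit simulations, each of which grows as 2^n in the qubit count.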

Visualizations

Comprehensive Comparison Summary

Comparison Summary

The comprehensive comparison includes:

  • Performance metrics bar chart
  • Computational efficiency comparison (log scale)
  • Performance heatmap
  • Summary analysis for Business Intelligence II project

ROC Curve Comparison

ROC Curve Comparison

The ROC (Receiver Operating Characteristic) curve comparison shows:

  • Classical SVM: Better AUC (Area Under Curve) indicating superior discrimination ability
  • Quantum SVM: Lower AUC due to extreme recall-precision trade-off
  • Both curves significantly outperform the random classifier baseline

Precision-Recall Curve Comparison

Precision-Recall Curve

The Precision-Recall curve reveals different operating characteristics:

  • Classical SVM: More balanced precision-recall trade-off
  • Quantum SVM: Higher recall at the cost of precision
  • Baseline shows the no-skill classifier (class proportion)

Deep Error Analysis

Error Analysis

Error Pattern Analysis reveals critical business insights:

  • 90% reduction in false negatives (20 → 2 bad credits approved)
  • Hypothetical business impact: Using industry-typical assumptions (€10k avg loan, 80% default loss rate, 5% opportunity cost), this error reduction translates to ~€135k cost savings per 200 applications
  • Trade-off: 43% increase in false positives (more conservative lending)
  • Risk profile comparison: Quantum optimizes for recall (minimizing missed good credits), Classical balances precision/recall

This visualization demonstrates that the quantum SVM's value is not merely its marginal F1 gain: it has a fundamentally different error profile, which may suit risk-averse institutions.

Note: Business impact figures are illustrative examples using representative industry parameters, not actual financial data from the dataset.

Technical Notes

PCA Component Selection

| Components | Explained Variance | Use Case |
|---|---|---|
| 2 | ~40-50% | Minimal quantum circuit |
| 4 | ~60-70% | Balanced (recommended) |
| 8 | ~80-90% | Maximum information retention |
| 16+ | ~95%+ | Near-original performance |
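How such explained-variance figures are obtained can be sketched with a numpy-only PCA via the covariance eigen-spectrum. The matrix below is a random toy stand-in for the preprocessed credit data, so its percentages will differ from the approximate table values above:

```python
# Cumulative explained variance per component count, computed from the
# eigenvalues of the covariance matrix (numpy-only PCA sketch; toy data).
import numpy as np

rng = np.random.default_rng(0)
# Toy stand-in for a 20-feature preprocessed matrix with correlated columns
X = rng.normal(size=(1000, 20)) @ rng.normal(size=(20, 20))

X_centered = X - X.mean(axis=0)
eigvals = np.linalg.eigvalsh(np.cov(X_centered, rowvar=False))[::-1]
cumulative = np.cumsum(eigvals) / eigvals.sum()

for k in (2, 4, 8, 16):
    print(f"{k:2d} components -> {cumulative[k - 1]:.1%} variance explained")
```

In the project itself, `sklearn.decomposition.PCA` exposes the same quantity via its `explained_variance_ratio_` attribute.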

Kernel Comparison

Linear Kernel:

  • Fast, interpretable
  • Good for linearly separable data
  • Lower computational cost

RBF Kernel:

  • Most flexible
  • Good default choice
  • Handles non-linear patterns

Polynomial Kernel:

  • Captures specific feature interactions
  • Can overfit with high degree

Quantum Kernel:

  • Uses quantum feature map
  • Explores exponentially large Hilbert space
  • Computationally expensive in simulation

Troubleshooting

Common Issues

Issue: ModuleNotFoundError: No module named 'sklearn' or similar dependency errors
Solution: Ensure you're using pixi (pixi install) or install dependencies manually with pip

Issue: Memory error during PCA
Solution: Reduce n_components or use incremental PCA

Issue: Poor model performance
Solution: Try different kernels with the --compare-kernels flag

Issue: Quantum implementation not working
Solution: Verify the Qiskit installation (pixi list | grep qiskit) or reinstall with pixi install

Issue: System freezes or becomes unresponsive with high qubit counts
Solution: Use the --subset-size parameter to reduce dataset size (e.g. --subset-size 200 for 8+ qubits). Note that classical quantum simulation has fundamental exponential scaling limits.

Development

Running Analysis Scripts

Generate thesis-ready analysis and visualizations:

# Run comprehensive analysis (confusion matrix, PCA, business impact)
pixi run python analysis.py

# Generate error analysis visualization
pixi run python create_error_analysis_plot.py

# Regenerate visualizations from cached metrics (without re-simulation)
pixi run python regenerate_visualizations.py

Output files:

  • results/thesis_summary_table.csv - Ready for thesis tables
  • results/confusion_matrix_comparison.csv - Detailed error breakdown
  • results/error_analysis_comprehensive.png - Publication-quality visualization
  • results/comparison_metrics.json - Cached metrics for visualizations

Note: The regenerate_visualizations.py script loads metrics from results/comparison_metrics.json and regenerates visualizations without re-running the time-consuming quantum simulation. Useful for adjusting plot aesthetics or text.

Running Tests

# Test individual modules
pixi run python src/data_loader.py
pixi run python src/preprocessing.py
pixi run python src/classical_svm.py

Code Style

  • Type hints for all function parameters
  • Docstrings in Google style
  • English comments
  • PEP 8 compliant

Author

Gregor Kobilarov

License

This project is licensed under the MIT License - see the LICENSE file for details.

This project was created for educational purposes as part of a university course.
