A comparative study of Classical SVM vs Quantum SVM (QSVM) for detecting fraudulent credit card transactions, implemented using Qiskit and scikit-learn.
This project explores the potential of Quantum Machine Learning (QML) in a real-world binary classification task — credit card fraud detection. A ZZFeatureMap-based Quantum Kernel is used to train a QSVM on a PCA-reduced feature space, and its performance is compared against a tuned classical RBF-SVM.
| Model | Accuracy | F1 Score |
|---|---|---|
| Classical SVM | 90.00% | 0.8947 |
| Quantum SVM | 57.50% | 0.6304 |
PCA variance retained: 61.56% across 4 components
Test set size: 80 samples
The classical SVM significantly outperforms the QSVM in this experiment. The QSVM's lower performance is expected at this scale — quantum advantage typically emerges with higher-dimensional, more complex feature spaces that are difficult for classical kernels to separate.
The quantum feature map used is a ZZFeatureMap with reps=2 and linear entanglement across 4 qubits (one per PCA component). It encodes classical data into quantum states via parameterized rotation and entangling gates, enabling the quantum kernel to capture non-linear relationships in the feature space.
- Python 3.13
- `qiskit` — Quantum circuit construction
- `qiskit-machine-learning` — `FidelityQuantumKernel`
- `qiskit-aer` — Local quantum simulator
- `scikit-learn` — SVM, PCA, GridSearchCV, metrics
- `pandas`, `numpy` — Data handling
- `matplotlib`, `seaborn` — Visualization
```
qsvm-fraud-detection/
├── asset/
│   ├── classicalvsquantum_confusionmatrix.png
│   └── Quantum Kernel.png
├── data/
│   └── creditcard.csv        # Not included (see below)
├── qsvm_fraud.ipynb          # Main notebook
├── .gitignore
└── README.md
```
```bash
git clone https://github.com/your-username/qsvm-fraud-detection.git
cd qsvm-fraud-detection
python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install qiskit qiskit-machine-learning qiskit-aer scikit-learn pandas numpy matplotlib seaborn jupyter
```

Download the Credit Card Fraud Detection dataset from Kaggle and place `creditcard.csv` inside the `data/` folder.
```bash
jupyter notebook qsvm_fraud.ipynb
```

- Data Loading & Balancing — Undersample the majority class (valid transactions) to match the minority class (fraud), resulting in a balanced 50/50 split.
- Subset Sampling — Use 400 samples to keep quantum simulation tractable.
- Train/Test Split — Stratified 80/20 split before any preprocessing (preventing data leakage).
- Standardization — `StandardScaler` fit on training data only, then applied to test data.
- PCA — Dimensionality reduction to 4 components, fit on training data only.
- Classical SVM — Tuned RBF-SVM via `GridSearchCV` (C, gamma) with 5-fold cross-validation.
- Quantum SVM — `ZZFeatureMap` (reps=2, linear entanglement) + `FidelityQuantumKernel` via the Qiskit Aer simulator.
- Evaluation — Accuracy, F1 Score, and confusion matrices for both models.
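The classical side of this pipeline can be sketched as follows. The `make_classification` data is a synthetic stand-in for the balanced 400-sample subset of the Kaggle CSV, and the parameter grid is illustrative, not the notebook's exact one:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.svm import SVC

# Synthetic stand-in for the balanced 400-sample fraud subset
X, y = make_classification(n_samples=400, n_features=30, random_state=42)

# Stratified 80/20 split before any preprocessing (prevents leakage)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaler and PCA are fit on the training split only
scaler = StandardScaler().fit(X_train)
pca = PCA(n_components=4).fit(scaler.transform(X_train))
X_train_p = pca.transform(scaler.transform(X_train))
X_test_p = pca.transform(scaler.transform(X_test))

# Classical baseline: RBF-SVM tuned over C and gamma with 5-fold CV
grid = GridSearchCV(
    SVC(kernel="rbf"),
    param_grid={"C": [0.1, 1, 10, 100], "gamma": ["scale", 0.01, 0.1, 1]},
    cv=5,
)
grid.fit(X_train_p, y_train)
accuracy = grid.score(X_test_p, y_test)
```

The 4-component `X_train_p` / `X_test_p` arrays are the same inputs the quantum kernel consumes, so the two models are compared on an identical feature space.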
- Quantum simulation is slow — The QSVM uses a classical Aer simulator, not real quantum hardware. Kernel matrix computation scales as O(n²) in the number of training samples.
- Low PCA variance retained — Only ~61.56% of variance is preserved with 4 components, which impacts QSVM accuracy.
- Small dataset — 400 samples is a necessary trade-off for simulation feasibility.
This project is open-source and available under the MIT License.
Built with curiosity about the intersection of quantum computing and machine learning.
Feel free to open an issue or submit a PR!

