The missing preprocessing layer between classical datasets and quantum computing frameworks.
QuPrep converts classical datasets into quantum-circuit-ready format. It is not a quantum computing framework, simulator, or training tool — it is the preprocessing step that feeds into Qiskit, PennyLane, Cirq, TKET, and any other quantum workflow.
CSV / DataFrame / NumPy / images / text / graphs → QuPrep → circuit-ready output
- Ingest tabular data, time series, images, text, and graphs — all in the same pipeline API
- Clean, normalize, and reduce dimensionality to fit your hardware qubit budget
- Encode data into circuits using 13 encoding methods (Angle, Amplitude, IQP, ZZFeatureMap, GraphState, and more)
- Recommend, compare, and auto-select the best encoding for your dataset and task
- Export circuits to 8 frameworks: OpenQASM 3.0, Qiskit, PennyLane, Cirq, TKET, Braket, Q#, IQM
- Formulate combinatorial optimization problems as QUBO / Ising models; export as QAOA circuit templates for your quantum framework
QuPrep does not train models, simulate circuits, run on quantum hardware, or optimize variational parameters.
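To make the "circuit-ready" idea concrete, here is a minimal sketch in plain NumPy (deliberately not the QuPrep API, whose names may differ) of the data side of angle encoding: rescale each feature into [0, π] so it can be consumed directly as an Ry rotation angle, one qubit per feature.

```python
import numpy as np

def to_rotation_angles(X: np.ndarray) -> np.ndarray:
    """Min-max scale each feature column into [0, pi].

    Each resulting value can be used directly as an Ry rotation
    angle, one qubit per feature (n_qubits == n_features).
    """
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # constant columns map to angle 0
    return (X - lo) / span * np.pi

X = np.array([[0.0, 10.0],
              [5.0, 20.0],
              [10.0, 30.0]])
angles = to_rotation_angles(X)
print(angles.shape)  # one angle per feature per sample → (3, 2)
```

This is the kind of transformation that happens between cleaning/reduction and circuit export; the real pipeline also handles imputation, dimensionality reduction, and framework-specific output.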
```bash
pip install quprep
```

With optional extras:

```bash
# Framework exporters
pip install quprep[qiskit]     # Qiskit QuantumCircuit
pip install quprep[pennylane]  # PennyLane QNode
pip install quprep[cirq]       # Cirq Circuit
pip install quprep[tket]       # TKET/pytket Circuit
pip install quprep[braket]     # Amazon Braket Circuit
pip install quprep[qsharp]     # Q# / Azure Quantum
pip install quprep[iqm]        # IQM native format
pip install quprep[frameworks] # all framework exporters at once

# Data modalities
pip install quprep[image]      # image ingestion (Pillow)
pip install quprep[text]       # text embeddings (sentence-transformers, ~2 GB)
pip install quprep[modalities] # image + text at once

# Other
pip install quprep[umap]       # UMAP dimensionality reduction
pip install quprep[viz]        # matplotlib circuit diagrams
pip install quprep[all]        # everything
```

Requirements: Python ≥ 3.10. Core dependencies: numpy, scipy, pandas, scikit-learn.
```python
import quprep as qd

result = qd.prepare("data.csv", encoding="angle", framework="qasm")
print(result.circuit)
```

```python
import quprep as qd

pipeline = qd.Pipeline(
    cleaner=qd.Imputer(),
    reducer=qd.PCAReducer(n_components=8),
    encoder=qd.IQPEncoder(reps=2),
    exporter=qd.PennyLaneExporter(),  # pip install quprep[pennylane]
)
result = pipeline.fit_transform("data.csv")
qnode = result.circuit  # callable qml.QNode
```

```python
import quprep as qd

# Time series — sliding window then encode
from quprep.ingest.time_series_ingester import TimeSeriesIngester
from quprep.clean.window_transformer import WindowTransformer

result = qd.Pipeline(
    preprocessor=WindowTransformer(window_size=5, step=1),
    encoder=qd.AngleEncoder(),
).fit_transform(TimeSeriesIngester(time_column="date").load("sensor.csv"))

# Images — pip install quprep[image]
from quprep.ingest.image_ingester import ImageIngester

result = qd.prepare("images/", encoding="angle", ingester=ImageIngester(size=(8, 8), grayscale=True))

# Text — TF-IDF (no deps) or sentence-transformers (pip install quprep[text])
from quprep.ingest.text_ingester import TextIngester

texts = ["quantum computing is powerful", "machine learning meets QML", ...]
result = qd.prepare(texts, encoding="angle", ingester=TextIngester(method="tfidf", max_features=16))

# Graphs — lossless graph state encoding
from quprep.ingest.graph_ingester import GraphIngester
from quprep.encode.graph_state import GraphStateEncoder
import numpy as np

graph_list = [np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]], dtype=float), ...]  # adjacency matrices
result = qd.Pipeline(encoder=GraphStateEncoder()).fit_transform(
    GraphIngester(features="adjacency").load(graph_list)
)
```

| Feature | Docs |
|---|---|
| Encoding recommendation — ranked by dataset profile and task | guide |
| Qubit budget suggestion — NISQ-safe ceiling with reasoning | API |
| Side-by-side encoder comparison — depth, gates, NISQ safety | API |
| Data drift detection — warn when new data leaves training distribution | API |
| Pipeline save / load — serialize fitted pipelines, no re-fitting | API |
| Schema validation & cost estimation — gate count before encoding | guide |
| QUBO / Ising formulation — Max-Cut, TSP, Knapsack, QAOA circuits, D-Wave export | guide |
| Plugin system — register custom encoders and exporters | guide |
| Circuit visualization — ASCII (no deps) or matplotlib | API |
| Batch QASM export — save all samples to disk as individual files | API |
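The QUBO formulation feature can be illustrated without QuPrep at all. For Max-Cut on a graph with adjacency matrix W, a standard QUBO sets Q_ij = W_ij off the diagonal and Q_ii = -Σ_j W_ij on the diagonal, so that minimizing xᵀQx over binary x maximizes the cut weight. A hedged sketch in plain NumPy (QuPrep's own API for this may differ):

```python
import numpy as np
from itertools import product

def maxcut_qubo(W: np.ndarray) -> np.ndarray:
    """QUBO matrix Q for Max-Cut: minimizing x^T Q x over
    binary x maximizes the weight of edges cut by the partition."""
    W = np.asarray(W, dtype=float)
    Q = W.copy()
    np.fill_diagonal(Q, -W.sum(axis=1))
    return Q

# Triangle graph: every nontrivial partition cuts exactly 2 edges.
W = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)
Q = maxcut_qubo(W)

# Brute-force check (fine for tiny n; QAOA takes over at scale).
best = min(product([0, 1], repeat=3),
           key=lambda x: np.array(x) @ Q @ np.array(x))
cut_value = -(np.array(best) @ Q @ np.array(best))
print(cut_value)  # 2.0
```

A matrix like this is what would then be handed off as a QAOA circuit template or a D-Wave submission.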
| Encoding | Qubits | Depth | NISQ-safe | Best for |
|---|---|---|---|---|
| Angle (Ry/Rx/Rz) | n = d | O(1) | ✅ Excellent | Most QML tasks |
| Amplitude | ⌈log₂ d⌉ | O(2ⁿ) | ❌ Poor | Qubit-limited scenarios |
| Basis | n = d | O(1) | ✅ Excellent | Binary features / QAOA |
| Entangled Angle | n = d | O(d · layers) | ✅ Good | Feature correlations |
| IQP | n = d | O(d² · reps) | ⚠️ Moderate | Kernel methods |
| Re-uploading | n = d | O(d · layers) | ✅ Good | High-expressivity QNNs |
| Hamiltonian | n = d | O(d · steps) | ⚠️ Moderate | Physics simulation / VQE |
| ZZ Feature Map | n = d | O(d² · reps) | ⚠️ Moderate | Quantum kernel methods |
| Pauli Feature Map | n = d | O(d² · reps) | ⚠️ Moderate | Configurable kernel methods |
| Random Fourier | n_components | O(1) | ✅ Excellent | RBF kernel approximation |
| Tensor Product | ⌈d/2⌉ | O(1) | ✅ Excellent | Qubit-efficient encoding |
| QAOA Problem | n = d | O(p) | ✅ Good | QAOA warm-start, problem-inspired maps |
| Graph State | n = nodes | O(edges) | ✅ Good | Graph-structured data (lossless) |
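The qubit columns above translate into very different hardware budgets. As an illustrative sketch in plain NumPy (not QuPrep's API): angle encoding uses one qubit per feature, while amplitude encoding packs a d-dimensional vector into ⌈log₂ d⌉ qubits by storing the L2-normalized vector as the statevector, zero-padded up to the next power of two.

```python
import numpy as np

def amplitude_statevector(x: np.ndarray) -> tuple[np.ndarray, int]:
    """L2-normalize x and zero-pad to the next power of two.

    Returns (statevector, n_qubits) with n_qubits = ceil(log2(d)).
    """
    x = np.asarray(x, dtype=float)
    n_qubits = max(1, int(np.ceil(np.log2(len(x)))))
    padded = np.zeros(2 ** n_qubits)
    padded[: len(x)] = x
    return padded / np.linalg.norm(padded), n_qubits

x = np.arange(1.0, 7.0)  # 6 features
state, n_qubits = amplitude_statevector(x)
print(n_qubits)                 # 3 qubits for d = 6 (vs 6 qubits for angle encoding)
print(float(state @ state))     # unit norm: a valid quantum state
```

The trade-off in the table follows directly: the qubit savings are exponential, but preparing an arbitrary statevector generally costs O(2ⁿ) depth, which is why amplitude encoding is marked NISQ-unsafe.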
| Framework | Install | Output |
|---|---|---|
| OpenQASM 3.0 | (included) | `str` |
| Qiskit | `pip install quprep[qiskit]` | `QuantumCircuit` |
| PennyLane | `pip install quprep[pennylane]` | `qml.QNode` |
| Cirq | `pip install quprep[cirq]` | `cirq.Circuit` |
| TKET | `pip install quprep[tket]` | `pytket.Circuit` |
| Amazon Braket | `pip install quprep[braket]` | `braket.Circuit` |
| Q# | `pip install quprep[qsharp]` | Q# operation string |
| IQM | `pip install quprep[iqm]` | IQM circuit JSON |
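For orientation, an OpenQASM 3.0 export of a three-feature angle encoding would look roughly like the following (an illustrative hand-written snippet; QuPrep's actual output formatting may differ):

```qasm
OPENQASM 3.0;
include "stdgates.inc";

// One qubit per feature; each feature rescaled to an Ry angle.
qubit[3] q;
ry(1.5708) q[0];
ry(0.7854) q[1];
ry(2.3562) q[2];
```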
Full documentation at docs.quprep.org
Contributions are welcome. Please read CONTRIBUTING.md before opening a pull request.
- Open an issue for bugs or feature requests
- Start a discussion for questions or ideas
Apache 2.0 — see LICENSE.
If you use QuPrep in your research, please cite:
```bibtex
@software{quprep2026,
  author    = {Perera, Hasarindu},
  title     = {QuPrep: Quantum Data Preparation},
  year      = {2026},
  publisher = {Zenodo},
  version   = {0.7.0},
  doi       = {10.5281/zenodo.19286258},
  url       = {https://doi.org/10.5281/zenodo.19286258},
  license   = {Apache-2.0},
}
```