Polars TS

Documentation · Source Code · PyPI


polars-ts is a batteries-included time series toolkit built on Polars. It gives you Rust-accelerated distance metrics, 10+ clustering algorithms, a full forecasting stack, and diagnostics — all from a single pip install, no heavyweight frameworks required.

Why polars-ts?

| Pain point | How polars-ts helps |
| --- | --- |
| "I need DTW but scipy is slow" | 12 distance metrics compiled to native code via Rust + Rayon, orders of magnitude faster on large panels |
| "I want to cluster time series but tslearn/sktime have too many deps" | K-Medoids, K-Shape, HDBSCAN, Spectral, Hierarchical, K-Means DBA, CLARA/CLARANS, U-Shapelets — all built-in; scikit-learn is needed only as an optional extra for the density-based methods |
| "Setting up a forecast pipeline takes too long" | ForecastPipeline wires up lags, rolling stats, calendar features, target transforms, and any sklearn model in 5 lines |
| "I don't know which clustering method to pick" | auto_cluster sweeps methods × distances × k values and returns the best result with evaluation scores |
| "Polars doesn't have time series functions" | Mann-Kendall, Sen's slope, CUSUM, PELT, decomposition, ACF/PACF — all group-aware and Polars-native |

TL;DR — what you can do in 3 lines

import polars_ts as pts

# Cluster 1 000 series by shape similarity
labels = pts.auto_cluster(df, methods=["kmedoids", "spectral"], distances=["sbd", "dtw"])

# Forecast with a full ML pipeline
pipe = pts.ForecastPipeline(model, lags=[1,7,14], rolling_windows=[7], calendar=["day_of_week"])
pipe.fit(train); forecasts = pipe.predict(train, h=7)

# Detect changepoints
breaks = pts.pelt(df, cost="meanvar", pen=10)

Installation

pip install polars-timeseries

Extras for optional features:

pip install "polars-timeseries[clustering]"     # HDBSCAN, DBSCAN, spectral (sklearn + scipy)
pip install "polars-timeseries[forecast]"       # SCUM, auto_arima (statsforecast)
pip install "polars-timeseries[decomposition]"  # Fourier decomposition (polars-ds)
pip install "polars-timeseries[all]"            # Everything

Requires Python 3.12+ and Polars 1.30+.


Quick start

Pairwise DTW distance

import polars as pl
import polars_ts as pts

df = pl.DataFrame({
    "unique_id": ["A"] * 5 + ["B"] * 5,
    "y": [1.0, 2.0, 3.0, 2.0, 1.0,
          1.0, 3.0, 5.0, 3.0, 1.0],
})

result = pts.compute_pairwise_dtw(df, df)

Auto-cluster time series

result = pts.auto_cluster(
    df,
    methods=["kmedoids", "spectral", "kshape"],
    distances=["sbd", "dtw"],
    k_range=range(2, 6),
)
print(result.best_method, result.best_k, result.best_score)
print(result.best_labels)  # DataFrame[unique_id, cluster]

End-to-end forecast pipeline

from sklearn.ensemble import GradientBoostingRegressor
import polars_ts as pts

pipe = pts.ForecastPipeline(
    GradientBoostingRegressor(),
    lags=[1, 2, 7],
    rolling_windows=[7],
    calendar=["day_of_week", "month"],
    target_transform="log",
)
pipe.fit(train_df)
forecasts = pipe.predict(train_df, h=7)

ARIMA forecasting

import polars_ts as pts

# Fit ARIMA(1,1,1) and forecast 12 steps ahead
fitted = pts.arima_fit(df, order=(1, 1, 1))
forecast = pts.arima_forecast(fitted, h=12)

# Or use automatic order selection
forecast = pts.auto_arima(df, h=12, season_length=12)

Exponential smoothing

import polars_ts as pts

# Holt-Winters seasonal forecast
result = pts.holt_winters_forecast(df, h=12, season_length=12, seasonal="additive")

Conformal prediction intervals

import polars_ts as pts

# Distribution-free prediction intervals
result = pts.conformal_interval(cal_residuals, predictions, coverage=0.9)

Weighted ensemble

import polars_ts as pts

ens = pts.WeightedEnsemble(weights="inverse_error")
combined = ens.combine([forecast_a, forecast_b], validation_dfs=[val_a, val_b])

Mann-Kendall trend test

import polars as pl
import polars_ts as pts

df = pl.DataFrame({
    "group": ["A"] * 10 + ["B"] * 10,
    "y": list(range(10)) + [10 - x for x in range(10)],
})

result = df.group_by("group").agg(
    pts.mann_kendall(pl.col("y")).alias("trend"),
    pts.sens_slope(pl.col("y")).alias("slope"),
)

Seasonal decomposition

import polars as pl
import polars_ts as pts

df = pl.DataFrame({
    "unique_id": ["A"] * 48,
    "ds": list(range(48)),
    "y": [10 + 5 * (i % 12 > 5) + 0.5 * i for i in range(48)],
})

result = pts.seasonal_decomposition(df, freq=12, method="additive")

Features

Distance metrics (Rust, parallelized via Rayon)

All distance functions return a tidy DataFrame with columns [id_1, id_2, <metric>]. A unified compute_pairwise_distance(method=...) API lets you swap metrics with a single string.
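
For example, reusing the df from the Quick start, swapping metrics could look like the minimal sketch below. It assumes compute_pairwise_distance accepts the same two frames as compute_pairwise_dtw plus a method string; the exact keyword names may differ, so check the API docs.

import polars_ts as pts

# Unified API sketch: keyword names are assumptions, not the documented signature.
dtw = pts.compute_pairwise_distance(df, df, method="dtw")
sbd = pts.compute_pairwise_distance(df, df, method="sbd")
# Both return tidy frames with columns [id_1, id_2, <metric>].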

| Metric | Function | Key Parameters |
| --- | --- | --- |
| Dynamic Time Warping | compute_pairwise_dtw | method: standard, sakoe_chiba, itakura, fast |
| Derivative DTW | compute_pairwise_ddtw | Shape-sensitive comparison |
| Weighted DTW | compute_pairwise_wdtw | g: weight sharpness |
| Move-Split-Merge | compute_pairwise_msm | c: move cost |
| Edit Distance (Real Penalty) | compute_pairwise_erp | g: gap value |
| Longest Common Subsequence | compute_pairwise_lcss | epsilon: matching threshold |
| Time Warp Edit Distance | compute_pairwise_twe | nu: stiffness, lambda_: deletion cost |
| Shape-Based Distance | compute_pairwise_sbd | Cross-correlation based |
| Frechet Distance | compute_pairwise_frechet | Geometric coupling distance |
| Edit Distance on Real Sequences | compute_pairwise_edr | Edit-operation cost |
| Multivariate DTW | compute_pairwise_dtw_multi | metric: manhattan, euclidean |
| Multivariate MSM | compute_pairwise_msm_multi | c: move cost |

Clustering & classification

| Method | Function | When to use |
| --- | --- | --- |
| K-Medoids (PAM) | kmedoids | Known k, any distance metric, interpretable medoids |
| K-Shape | KShape | Shape-based grouping via cross-correlation centroids |
| Spectral (KSC) | spectral_cluster | Non-convex clusters, graph Laplacian structure |
| HDBSCAN | hdbscan_cluster | Unknown k, varying density, noise detection |
| DBSCAN | dbscan_cluster | Fixed-radius neighbourhood, noise detection |
| Hierarchical | agglomerative_cluster | Dendrogram visualization, flexible linkage |
| K-Means DBA | kmeans_dba | DTW Barycenter Averaging centroids |
| CLARA | clara | Scalable k-medoids via sampling |
| CLARANS | clarans | Randomized k-medoids neighbourhood search |
| U-Shapelets | shapelet_cluster | Interpretable sub-sequence patterns |
| ROCKET / MiniRocket | rocket_features, minirocket_features | Random convolutional kernel feature extraction |
| Auto-cluster | auto_cluster | Sweep methods × distances × k, pick the best |

Evaluation: silhouette_score, davies_bouldin_score, calinski_harabasz_score

Classification: knn_classify (distance-based k-NN), TimeSeriesKNNClassifier (OOP), KShapeClassifier (centroid-based)
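
To see how these pieces compose, here is a rough sketch: the function names come from the table and lists above, but every keyword argument is an assumption for illustration, not the documented signature.

import polars_ts as pts

# Hypothetical arguments throughout — consult the API reference for real signatures.
labels = pts.kmedoids(df, k=3, distance="dtw")              # cluster with a chosen metric
score = pts.silhouette_score(df, labels, distance="dtw")    # evaluate the partition
# new_df is an illustrative frame of unseen series; the call shape is assumed.
preds = pts.knn_classify(labels, new_df, k=5, distance="sbd")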

Trend & changepoint detection

  • Mann-Kendall test — non-parametric trend detection (Rust)
  • Sen's slope — robust trend magnitude estimation (Rust)
  • CUSUM — cumulative sum changepoint detection (Rust)
  • PELT — multiple changepoints with mean/variance/meanvar cost functions
  • BOCPD — Bayesian Online Changepoint Detection
  • Regime detection — Hidden Markov Model state inference

Decomposition

  • Seasonal decomposition — additive or multiplicative (classical)
  • Fourier decomposition — harmonic decomposition with configurable frequencies
  • Decomposition features — trend/seasonal strength extraction (simple or MSTL)
  • Anomaly flagging — residual-based anomaly detection from any decomposition

Feature engineering

  • Lag features — create lagged versions of a target column per group
  • Rolling features — rolling window aggregations (mean, std, min, max, sum, median, var)
  • Calendar features — extract day_of_week, month, quarter, is_weekend, etc.
  • Fourier features — sin/cos pairs for seasonal modelling
  • Target encoding — smoothed categorical encoding by target mean
  • Holiday features — binary holiday indicators + distance-to-holiday (requires holidays package)
  • Interaction features — cross-term column generation
  • Time embeddings — cyclical sin/cos encoding for time components

Target transforms

  • Log transform — log1p / expm1 with automatic validation and lossless inversion
  • Box-Cox transform — parametric power transform with configurable lambda
  • Differencing — configurable order and seasonal period with metadata for lossless inversion

All transforms are group-aware, invertible, and accessible via the df.pts namespace.
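
As an illustration of the namespace, a hypothetical round-trip might look like the sketch below; the method names (log_transform, inverse_transform) are assumptions, not the documented API.

import polars_ts as pts  # importing typically registers the df.pts namespace

# Hypothetical method names — see the transforms documentation for the real API.
transformed = df.pts.log_transform("y")
restored = transformed.pts.inverse_transform("y")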

Data preprocessing

  • Missing value imputation — forward/backward fill, linear interpolation, mean, median, seasonal
  • Outlier detection — z-score, IQR, Hampel filter, rolling z-score
  • Outlier treatment — clip (winsorize), median replacement, interpolation, null
  • Temporal resampling — downsample/upsample with configurable aggregation

Validation strategies

  • Expanding window CV — growing training window cross-validation
  • Sliding window CV — fixed-size training window cross-validation
  • Rolling origin CV — general rolling-origin with configurable initial/fixed train size

Forecasting

  • SCUM — ensemble model combining AutoARIMA, AutoETS, AutoCES, and DynamicOptimizedTheta
  • ARIMA/SARIMA — explicit (p,d,q) order via statsmodels (arima_fit/arima_forecast) or automatic selection via statsforecast (auto_arima)
  • Baseline models — naive, seasonal naive, moving average, and FFT-based forecasts
  • Exponential smoothing — SES, Holt's linear, Holt-Winters (additive/multiplicative, Rust-accelerated)
  • Multi-step strategies — RecursiveForecaster and DirectForecaster (sketched below)
  • ForecastPipeline — end-to-end ML pipeline with feature engineering + transforms
  • GlobalForecaster — cross-series panel model with optional ID encoding
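
A minimal sketch of the two multi-step strategies, modelled loosely on the ForecastPipeline example above — the constructor arguments (model, lags) and the predict signature are assumptions, not the documented API.

from sklearn.linear_model import Ridge
import polars_ts as pts

# Recursive strategy: a single model whose predictions are fed back as future lags.
# (Arguments assumed for illustration.)
rec = pts.RecursiveForecaster(Ridge(), lags=[1, 7, 14])
rec.fit(train_df)
recursive_fc = rec.predict(train_df, h=14)

# Direct strategy: one model fitted per horizon step. (Arguments assumed.)
direct = pts.DirectForecaster(Ridge(), lags=[1, 7, 14])
direct.fit(train_df)
direct_fc = direct.predict(train_df, h=14)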

Probabilistic forecasting

  • QuantileRegressor — one model per quantile level with CRPS-compatible output
  • Conformal prediction — distribution-free intervals with coverage guarantees
  • EnbPI — Ensemble Batch Prediction Intervals with adaptive online updates

Ensembling

  • WeightedEnsemble — equal, manual, or inverse-error-optimized weights
  • StackingForecaster — meta-learner trained on out-of-fold predictions

Forecast evaluation & diagnostics

  • Metrics — MAE, RMSE, MAPE, sMAPE, MASE, CRPS
  • Kaboudan metric — model robustness evaluation via block-shuffle backtesting
  • Bias detection & correction — mean, regression, quantile mapping
  • Calibration diagnostics — calibration table, PIT histogram, reliability diagram
  • Residual diagnostics — ACF, PACF, Ljung-Box test
  • Permutation importance — model-agnostic feature importance

Multivariate & hierarchical

  • VAR — Vector Autoregression with OLS fitting and multi-step forecasts
  • Granger causality — F-test for causal relationships between series
  • GARCH — volatility modelling and conditional variance forecasting
  • Forecast reconciliation — bottom-up, top-down, and MinTrace-OLS

Anomaly detection

  • Decomposition-based — residual threshold anomaly flagging
  • Isolation Forest — unsupervised anomaly detection on engineered features

Integration adapters

  • NeuralForecast — convert to/from N-BEATS, PatchTST, N-HiTS format
  • PyTorch Forecasting — convert to/from TFT, DeepAR format
  • HuggingFace — convert to Dataset for Chronos, TimesFM, Lag-Llama
  • Chronos / MOMENT embeddings — foundation model feature extraction for clustering
  • ForecastEnv — Gymnasium-compatible RL environment for decision making

Tutorials

The notebooks/ directory contains 10 end-to-end tutorials:

| # | Topic |
| --- | --- |
| 01 | Data wrangling & exploration |
| 02 | Feature engineering & transforms |
| 03 | Forecasting fundamentals |
| 04 | ML forecasting pipelines |
| 05 | Uncertainty & calibration |
| 06 | Changepoint & anomaly detection |
| 07 | Time series similarity & clustering |
| 08 | Multivariate & volatility |
| 09 | Ensembles & reconciliation |
| 10 | Ecosystem adapters |

Development

git clone https://github.com/drumtorben/polars-ts.git
cd polars-ts
uv sync
uv pip install -e .
uv run pytest

Code quality

Pre-commit hooks run via prek (Rust reimplementation of pre-commit) or standard pre-commit — both read .pre-commit-config.yaml:

# Option A: prek (faster)
uv tool install prek
prek run --all-files

# Option B: standard pre-commit
pre-commit run --all-files

Type checking

# mypy (authoritative)
uv run mypy polars_ts/

# ty (fast, informational — beta)
uvx ty check polars_ts/

License

MIT
