
FLAIR


Japanese version available here

Factored Level And Interleaved Ridge: a single-equation time series forecasting method.

Zero hyperparameters. One SVD. CPU only.

  • #1 on Chronos Benchmark II (25 zero-shot datasets). Agg. Rel. MASE 0.678, Rel. WQL 0.716 — beats AutoARIMA's WQL (0.742) by 3.5%
  • Matches PatchTST on GIFT-Eval (97 configs, 23 datasets). relMASE 0.838 (beats PatchTST 0.849), relCRPS 0.587 (ties PatchTST)
  • ~1000 lines of pure NumPy/SciPy. No deep learning, no foundation models, no GPU.


Pipeline

FLAIR reshapes a time series by its primary period, then separates what happens (level) from how it happens (shape):

y(phase, period) = Level(period) × Shape(phase)

FLAIR Pipeline

Shape is structural (not learned), so it does not overfit. Level is a smooth, compressed series (one value per period instead of P values) forecast by Ridge regression. Two compressions happen simultaneously: summing P phases into one Level value reduces noise by ~√P, and forecasting Level requires only ⌈H/P⌉ recursive steps instead of H.
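The factorization can be sketched in a few lines of NumPy. This is an illustration only: the function name is hypothetical, the matrix orientation is (periods × phases), and FLAIR's internals differ in detail.

```python
import numpy as np

def level_shape_split(y, P):
    """Toy Level x Shape factorization: reshape by period P, take
    per-period totals (Level) and the average within-period profile
    normalized to sum to 1 (Shape)."""
    n_complete = len(y) // P
    M = y[:n_complete * P].reshape(n_complete, P)  # (periods, phases)
    level = M.sum(axis=1)                          # one value per period
    shape = M.mean(axis=0)
    shape = shape / shape.sum()                    # proportions over phases
    return level, shape

# A noisy daily pattern repeated over 10 "days" of 24 steps
rng = np.random.default_rng(0)
base = 50 + 30 * np.sin(2 * np.pi * np.arange(24) / 24)
y = np.tile(base, 10) * (1 + 0.05 * rng.standard_normal(240))

level, shape = level_shape_split(y, P=24)
recon = np.outer(level, shape).ravel()             # rank-1 reconstruction
print(np.abs(recon - y).mean() / y.mean())         # small relative error
```

Note that each row of the reconstruction sums exactly to its Level value, so the Level series alone carries all per-period totals.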

Quick Start

```python
import numpy as np
from flaircast import forecast, FLAIR

y = np.random.rand(500) * 100  # your time series

# ── Functional API ───────────────────────────
samples = forecast(y, horizon=24, freq='H')
point   = samples.mean(axis=0)           # (24,)
lo, hi  = np.percentile(samples, [10, 90], axis=0)

# ── Class API (handy in loops) ───────────────
model   = FLAIR(freq='H')
samples = model.predict(y, horizon=24)

# ── With exogenous variables (weather, prices, holidays, ...) ─
X_hist   = np.column_stack([temperature, humidity, is_holiday])  # (n, 3)
X_future = np.column_stack([temp_fcst,   hum_fcst,   hol_fcst])  # (24, 3)
samples  = forecast(y, horizon=24, freq='H',
                    X_hist=X_hist, X_future=X_future)

# ── From pandas ──────────────────────────────
import pandas as pd
ts = pd.read_csv('data.csv')['value']
samples = forecast(ts.values, horizon=12, freq='M')
```

Installation

```bash
pip install flaircast
```

Or install from source:

```bash
git clone https://github.com/TakatoHonda/FLAIR.git
cd FLAIR
pip install .
```

Supported Frequencies

| Freq string | Period | Meaning | MDL candidates |
|---|---|---|---|
| `S` | 60 | Second | 60 |
| `T` / `min` | 60 | Minute | 60 |
| `5T` | 12 | 5-minute | 12, 288 |
| `10T` | 6 | 10-minute | 6, 144 |
| `15T` | 4 | 15-minute | 4, 96 |
| `30T` / `30min` | 48 | 30-minute | 48, 336 |
| `10S` | 6 | 10-second | 6, 360 |
| `H` / `h` | 24 | Hourly | 24, 168 |
| `D` | 7 | Daily | 7, 365 |
| `W` | 52 | Weekly | 52 |
| `M` / `ME` / `MS` | 12 | Monthly | 12 |
| `Q` / `QE` / `QS` | 4 | Quarterly | 4 |
| `A` / `Y` / `YE` | 1 | Annual | — |

BIC on the SVD spectrum selects the period that best supports a rank-1 structure. A P=1 null model (mean + noise) competes with every periodic candidate under the same BIC, so FLAIR rejects periodicity when the rank-1 fit does not justify the extra Shape parameters.
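As an illustration of the idea (not FLAIR's exact scoring), the sketch below compares rank-1 fits across candidate periods by BIC, with P=1 reduced to a mean-plus-noise null; the parameter count is a crude Shape + Level tally.

```python
import numpy as np

def bic_rank1(y, P):
    """Illustrative BIC for a rank-1 fit at candidate period P.
    P=1 degenerates to a mean + noise null model."""
    n_periods = len(y) // P
    if n_periods < 2:
        return np.inf
    M = y[:n_periods * P].reshape(n_periods, P)
    if P == 1:
        fit, k = np.full_like(M, M.mean()), 1       # null: constant mean
    else:
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        fit = s[0] * np.outer(U[:, 0], Vt[0])       # best rank-1 approximation
        k = P + n_periods                           # crude Shape + Level count
    n = M.size
    rss = ((M - fit) ** 2).sum()
    return n * np.log(rss / n + 1e-12) + k * np.log(n)

rng = np.random.default_rng(1)
y = np.tile(1 + 0.5 * np.sin(2 * np.pi * np.arange(24) / 24), 20)
y = y * (1 + 0.02 * rng.standard_normal(480))
best_P = min([1, 24, 168], key=lambda P: bic_rank1(y, P))
print(best_P)  # 24 for this hourly-like toy series
```

The P=168 candidate also fits nearly rank-1 here, but its extra Shape parameters cost more under BIC than they explain, which is the MDL trade-off the table describes.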

How It Works

  1. MDL Period Selection: BIC on SVD spectrum selects the primary period P from calendar candidates. A P=1 null model (mean + noise) tests whether periodicity exists at all
  2. Reshape the series into a (P × n_complete) matrix. Dynamic DoF guard (n_train >= 2p) ensures the Ridge fit is stable
  3. Shape = frozen global average of within-period proportions from the last K=2 periods
  4. Level = period totals, denoised by Gavish-Donoho 2014 optimal Frobenius shrinkage (reuses the BIC SVD, no extra matrix decomposition)
  5. Shape₂ = secondary periodic pattern in Level, estimated as w × raw + (1−w) × prior, where w = nc₂/(nc₂+cp). The prior is selected by BIC: first harmonic (2 params) when justified, flat (0 params) otherwise. Level is deseasonalized by dividing by Shape₂
  6. Ridge on deseasonalized Level: Box-Cox → prior-centered reparameterization (random-walk prior) → intercept + trend + lags → LOOCV soft-average
  7. Stochastic Level paths: bootstrap of LOOCV residuals, scaled by LWCP leverages per horizon step
  8. Phase noise: scenario-coherent column sampling from the rank-1 residual matrix, with James-Stein per-phase bias shrinkage and horizon-adaptive deflation. Combined with Level paths: sample = Level_path × Shape × (1 + phase_noise)
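The recombination in steps 7–8 can be sketched with toy inputs; every array below is made up purely to illustrate the shapes and the sample = Level_path × Shape × (1 + phase_noise) formula.

```python
import numpy as np

rng = np.random.default_rng(2)
P, H, n_samples = 24, 48, 200                      # period, horizon, paths

shape = np.ones(P) / P                             # flat toy Shape (sums to 1)
# Toy stochastic Level paths: one value per forecast period (H // P steps)
level_paths = 2400 + 50 * rng.standard_normal((n_samples, H // P)).cumsum(axis=1)
phase_noise = 0.02 * rng.standard_normal((n_samples, H))

per_step_level = np.repeat(level_paths, P, axis=1)  # broadcast Level to steps
samples = per_step_level * np.tile(shape, H // P) * (1 + phase_noise)
print(samples.shape)  # (200, 48)
```

Because Level is forecast per period, only H // P = 2 recursive steps are needed for the 48-step horizon, matching the horizon-compression argument above.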

Benchmark Results

Chronos Benchmark II (25 zero-shot datasets)

Evaluated on the Chronos Benchmark II protocol (Ansari et al., 2024). Agg. Relative Score = geometric mean of (method / Seasonal Naive) per dataset. Lower is better.
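Concretely, the aggregate relative score is a geometric mean of per-dataset ratios; the numbers below are hypothetical, for illustration only.

```python
import numpy as np

# Hypothetical per-dataset MASE for a method and for Seasonal Naive
method_mase = np.array([0.62, 0.71, 0.55, 0.80])
snaive_mase = np.array([1.00, 0.95, 0.90, 1.10])

rel = method_mase / snaive_mase                 # per-dataset relative score
agg = np.exp(np.log(rel).mean())                # geometric mean; lower is better
print(round(agg, 3))
```

The geometric mean keeps a single easy dataset from dominating the aggregate, which an arithmetic mean of ratios would not.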

Chronos Benchmark

| Rank | Model | Params | Agg. Rel. MASE | Agg. Rel. WQL | GPU |
|---|---|---|---|---|---|
| 1 | FLAIR | 0 HP | 0.678 | 0.716 | No |
| 2 | Chronos-Bolt-Base | 205M | 0.791 | — | Yes |
| 3 | Moirai-Base | 311M | 0.812 | — | Yes |
| 4 | AutoARIMA | — | 0.865 | 0.742 | No |
| 5 | Chronos-T5-Small | 46M | 0.830 | — | Yes |
| 6 | Seasonal Naive | — | 1.000 | 1.000 | No |

Baseline results from autogluon/fev and amazon-science/chronos-forecasting.

GIFT-Eval (97 configs, 23 datasets)

GIFT-Eval. 7 domains, short/medium/long horizons, 53 non-agentic methods (no test leakage):

GIFT-Eval Benchmark

| Model | Type | relMASE | relCRPS | Params | GPU |
|---|---|---|---|---|---|
| Chronos-Bolt-Base | Foundation | 0.808 | 0.574 | 205M | Yes |
| FLAIR | Statistical | 0.838 | 0.587 | 0 HP | No |
| PatchTST | Deep Learning | 0.849 | 0.587 | ~1M | Yes |
| Chronos-Large | Foundation | 0.870 | 0.647 | 710M | Yes |
| Moirai-Large | Foundation | 0.875 | 0.599 | 311M | Yes |
| TimesFM | Foundation | 0.889 | 0.635 | 200M | Yes |
| Chronos-Small | Foundation | 0.892 | 0.663 | 46M | Yes |
| iTransformer | Deep Learning | 0.893 | 0.620 | ~5M | Yes |
| TFT | Deep Learning | 0.915 | 0.605 | ~10M | Yes |
| N-BEATS | Deep Learning | 0.938 | 0.816 | ~10M | Yes |
| Seasonal Naive | Baseline | 1.000 | 1.000 | 0 | No |
| DLinear | Deep Learning | 1.061 | 0.846 | ~0.1M | Yes |
| AutoARIMA | Statistical | 1.074 | 0.912 | ~5 | No |
| AutoTheta | Statistical | 1.090 | 1.244 | ~5 | No |
| DeepAR | Deep Learning | 1.343 | 0.853 | ~10M | Yes |
| Prophet | Statistical | 1.540 | 1.061 | ~20 | No |

Long-term Forecasting (8 datasets)

Standard benchmark from PatchTST, iTransformer, DLinear, Autoformer. Channel-independent (univariate) evaluation. MSE on StandardScaler-normalized data. Horizons: {96, 192, 336, 720}.

Average MSE across all 4 horizons:

| Dataset | FLAIR | iTransformer | PatchTST | DLinear | Best model needs GPU |
|---|---|---|---|---|---|
| ETTh2 | 0.367 | 0.383 | 0.387 | 0.559 | No |
| ETTm2 | 0.246 | 0.288 | 0.281 | 0.350 | No |
| Weather | 0.258 | 0.258 | 0.259 | 0.265 | No |
| Traffic | 0.426 | 0.428 | 0.481 | 0.625 | No |
| ECL | 0.208 | 0.178 | 0.205 | 0.212 | Yes |
| ETTh1 | 0.579 | 0.454 | 0.469 | 0.456 | Yes |
| ETTm1 | 0.546 | 0.407 | 0.387 | 0.403 | Yes |
| Exchange | 0.522 | 0.360 | 0.366 | 0.354 | Yes |

FLAIR outperforms GPU-trained Transformers on 4 of 8 datasets (ETTh2, ETTm2, Weather, Traffic). Accuracy is higher on datasets with clear periodicity and lower on non-periodic series (Exchange).

Why does FLAIR work?

Three compressions act simultaneously:

  1. Noise reduction: summing P phases into one Level value reduces noise by ~√P
  2. Horizon compression: forecasting Level requires only ⌈H/P⌉ steps instead of H, reducing error accumulation
  3. Shape is frozen: Shape is a structural average, not a learned parameter, so it does not overfit
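Point 1 is easy to verify numerically. The sketch below (toy unit noise) shows that the noise on a per-period mean is ~1/√P of the noise on a single observation, which is the same √P reduction relative to the signal as for a period total.

```python
import numpy as np

rng = np.random.default_rng(3)
P, n_periods = 24, 2000
noise = rng.standard_normal((n_periods, P))   # unit noise on each observation

per_obs_sd = noise.std()                      # ~1.0
per_level_sd = noise.mean(axis=1).std()       # period mean: ~1/sqrt(P)

print(per_obs_sd, per_level_sd, 1 / np.sqrt(P))
```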

API Reference

forecast(y, horizon, freq, n_samples=200, seed=None, X_hist=None, X_future=None)

Generate probabilistic forecasts for a univariate time series.

| Parameter | Type | Description |
|---|---|---|
| `y` | array-like `(n,)` | Historical observations |
| `horizon` | int | Number of steps to forecast |
| `freq` | str | Frequency string (see table) |
| `n_samples` | int | Number of sample paths (default: 200) |
| `seed` | int or None | Random seed for reproducibility (default: None) |
| `X_hist` | array-like `(n, k)`, `(n,)`, or None | Historical exogenous variables aligned with `y`. Must be provided together with `X_future`. |
| `X_future` | array-like `(horizon, k)`, `(horizon,)`, or None | Future exogenous values for the forecast horizon. Must be provided together with `X_hist`. |

Returns: ndarray of shape (n_samples, horizon). Probabilistic forecast sample paths.

```python
from flaircast import forecast
samples = forecast(y, horizon=24, freq='H')
point   = samples.mean(axis=0)
median  = np.median(samples, axis=0)
lo, hi  = np.percentile(samples, [10, 90], axis=0)

# With exogenous variables
samples = forecast(y, horizon=24, freq='H',
                   X_hist=X_hist, X_future=X_future)
```

When `X_hist=None` (the default), the result is bit-identical to a call without the exog arguments.

FLAIR(freq, n_samples=200, seed=None)

Class wrapper. Useful when forecasting multiple series with the same frequency.

| Method | Description |
|---|---|
| `predict(y, horizon, n_samples=None, seed=None, X_hist=None, X_future=None)` | Same as `forecast()`, uses instance defaults |

```python
from flaircast import FLAIR
model = FLAIR(freq='D', n_samples=500)
for series, X_h, X_f in dataset:
    samples = model.predict(series, horizon=7, X_hist=X_h, X_future=X_f)
```

Exogenous variables

FLAIR accepts an arbitrary number of per-step exogenous columns. The columns are z-scored using training-window statistics, aggregated to the per-period (Level) timescale via period mean, and appended directly to the Level Ridge feature matrix. No new hyperparameters, no model selection — the existing LOOCV soft-averaged Ridge inside _ridge_sa handles regularization, so noise covariates are naturally damped without any explicit gating step. "One Ridge" is preserved.

  • Recommended setup: at least a few dozen complete periods of training data (e.g. 60–90 days for daily exog, 60+ days for hourly exog) for stable coefficient estimates.
  • Validated improvements: see validation/ for rolling-origin benchmarks. UCI Bike Sharing daily: MASE −9.4% (9/12 origins win). Jena Climate hourly: MASE −15.5% (19/24 origins win).
  • Graceful degradation: passing pure-noise exog inflates MASE by less than 1% on average, with bounded worst-case behavior.
  • Limitation: exog is coupled to the Level (per-period) factor only. Intra-period variation in X (e.g. hourly temperature within a daily period) is collapsed by the period mean and is not captured.
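A minimal sketch of the preprocessing described above (z-score with training-window stats, then period-mean aggregation to the Level timescale). The helper name and shapes are illustrative, not FLAIR's internal API.

```python
import numpy as np

def exog_to_level(X, mu, sd, P):
    """Z-score with training stats, then per-period mean (Level timescale)."""
    Z = (X - mu) / sd
    n_periods = len(Z) // P
    return Z[:n_periods * P].reshape(n_periods, P, -1).mean(axis=1)

rng = np.random.default_rng(4)
P = 24
X_hist = rng.random((240, 3))                     # 10 periods, 3 covariates
X_future = rng.random((48, 3))                    # 2 future periods

mu, sd = X_hist.mean(axis=0), X_hist.std(axis=0)  # training-window stats only
F_hist = exog_to_level(X_hist, mu, sd, P)         # (10, 3) Level-rate features
F_future = exog_to_level(X_future, mu, sd, P)     # (2, 3)
print(F_hist.shape, F_future.shape)
```

Reusing the training-window `mu` and `sd` for the future block avoids leaking future statistics into the fit; the period mean is also what collapses intra-period covariate variation, per the limitation above.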

End-to-end walkthrough on the UCI Bike Sharing dataset: Open In Colab

Constants

| Name | Description |
|---|---|
| `FREQ_TO_PERIOD` | Maps frequency strings to primary periods |
| `FREQ_TO_PERIODS` | Maps frequency strings to MDL candidate periods |

Design Principles

FLAIR applies the Minimum Description Length principle at every scale:

| Scale | Mechanism | MDL Role |
|---|---|---|
| Period P | BIC on SVD spectrum + P=1 null | Select simplest rank-1 structure or reject periodicity |
| Rank-1 σ₁ | Gavish-Donoho shrinkage | Minimax-optimal denoising of the leading singular value |
| Shape | Frozen K-period average | Structural (not learned), cannot overfit |
| Shape₂ | BIC-gated shrinkage | BIC selects prior: harmonic (2 params) vs flat (0 params) |
| Ridge α | LOOCV soft-average | Select model complexity via cross-validation |
| DoF guard | n_train >= 2p | Ensures LOOCV leverage stability |
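For reference, a sketch of the Frobenius-loss optimal singular value shrinker after Gavish and Donoho, assuming the noise level is known; FLAIR's exact variant (and how it estimates the noise level) may differ.

```python
import numpy as np

def gd_frobenius_shrink(sigma_obs, m, n, noise_sd):
    """Frobenius-optimal singular value shrinkage (after Gavish-Donoho)
    for an m x n matrix (m <= n) with known noise level."""
    beta = m / n
    y = sigma_obs / (np.sqrt(n) * noise_sd)      # normalized singular value
    if y <= 1 + np.sqrt(beta):
        return 0.0                               # below the detection edge
    eta = np.sqrt((y**2 - beta - 1) ** 2 - 4 * beta) / y
    return np.sqrt(n) * noise_sd * eta

# A strong singular value is shrunk only slightly; a weak one is zeroed.
print(gd_frobenius_shrink(50.0, 24, 40, 1.0))   # close to, but below, 50
print(gd_frobenius_shrink(7.0, 24, 40, 1.0))    # 0.0 (inside the noise bulk)
```

Values at or below the edge `1 + sqrt(beta)` are indistinguishable from pure noise, so the shrinker sets them to zero rather than carrying noise energy into the rank-1 reconstruction.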

Limitations

  • Non-periodic series: the Level × Shape decomposition provides no compression benefit when there is no periodicity (e.g., exchange rates). Use a dedicated non-periodic model instead
  • Intermittent demand: series with >30% zeros are poorly served by the multiplicative structure. Croston-type methods are better suited
  • Coarse exogenous resolution: X_hist / X_future are aggregated to the per-period (Level) timescale via period mean. Intra-period variation in covariates (e.g. hourly weather within a daily period) is dropped by design
  • Short series: fewer than 3 complete periods forces P=1 degeneration (plain Ridge on raw series)

Citation

```bibtex
@misc{flair2026,
  title={FLAIR: Factored Level And Interleaved Ridge for Time Series Forecasting},
  year={2026}
}
```

License

Apache License 2.0
