
FLAIR


Japanese version available here

Factored Level And Interleaved Ridge: a single-equation time series forecasting method.

Zero hyperparameters. One SVD. CPU only.

  • #1 on Chronos Benchmark II (25 zero-shot datasets). Agg. Rel. MASE 0.678, Rel. WQL 0.716 — beats AutoARIMA's WQL (0.742) by 3.5%
  • Matches PatchTST on GIFT-Eval (97 configs, 23 datasets). relMASE 0.838 (beats PatchTST 0.849), relCRPS 0.587 (ties PatchTST)
  • ~1000 lines of pure NumPy/SciPy. No deep learning, no foundation models, no GPU.


Pipeline

FLAIR reshapes a time series by its primary period, then separates what happens (level) from how it happens (shape):

y(phase, period) = Level(period) × Shape(phase)

FLAIR Pipeline

Shape is structural (not learned), so it does not overfit. Level is a smooth, compressed series (one value per period instead of P values) forecast by Ridge regression. Two compressions happen simultaneously: summing P phases into one Level value reduces noise by ~√P, and forecasting Level requires only ⌈H/P⌉ recursive steps instead of H.
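The factorization can be sketched in a few lines of NumPy. This is an illustration only: the function name is hypothetical, the matrix orientation is (periods × phases), and FLAIR's internals differ in detail.

```python
import numpy as np

def level_shape_split(y, P):
    """Toy Level x Shape factorization: reshape by period P, take
    per-period totals (Level) and the average within-period profile
    normalized to sum to 1 (Shape)."""
    n_complete = len(y) // P
    M = y[:n_complete * P].reshape(n_complete, P)  # (periods, phases)
    level = M.sum(axis=1)                          # one value per period
    shape = M.mean(axis=0)
    shape = shape / shape.sum()                    # proportions over phases
    return level, shape

# A noisy daily pattern repeated over 10 "days" of 24 steps
rng = np.random.default_rng(0)
base = 50 + 30 * np.sin(2 * np.pi * np.arange(24) / 24)
y = np.tile(base, 10) * (1 + 0.05 * rng.standard_normal(240))

level, shape = level_shape_split(y, P=24)
recon = np.outer(level, shape).ravel()             # rank-1 reconstruction
print(np.abs(recon - y).mean() / y.mean())         # small relative error
```

Note that each row of the reconstruction sums exactly to its Level value, so the Level series alone carries all per-period totals.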

Quick Start

```python
import numpy as np
from flaircast import forecast, FLAIR

y = np.random.rand(500) * 100  # your time series

# ── Functional API ───────────────────────────
samples = forecast(y, horizon=24, freq='H')
point   = samples.mean(axis=0)           # (24,)
lo, hi  = np.percentile(samples, [10, 90], axis=0)

# ── Class API (handy in loops) ───────────────
model   = FLAIR(freq='H')
samples = model.predict(y, horizon=24)

# ── With exogenous variables (weather, prices, holidays, ...) ─
X_hist   = np.column_stack([temperature, humidity, is_holiday])  # (n, 3)
X_future = np.column_stack([temp_fcst,   hum_fcst,   hol_fcst])  # (24, 3)
samples  = forecast(y, horizon=24, freq='H',
                    X_hist=X_hist, X_future=X_future)

# ── From pandas ──────────────────────────────
import pandas as pd
ts = pd.read_csv('data.csv')['value']
samples = forecast(ts.values, horizon=12, freq='M')
```

Installation

```bash
pip install flaircast
```

Or install from source:

```bash
git clone https://github.com/TakatoHonda/FLAIR.git
cd FLAIR
pip install .
```

Supported Frequencies

| Freq string | Period | Meaning | MDL candidates |
|---|---|---|---|
| `S` | 60 | Second | 60 |
| `T` / `min` | 60 | Minute | 60 |
| `5T` | 12 | 5-minute | 12, 288 |
| `10T` | 6 | 10-minute | 6, 144 |
| `15T` | 4 | 15-minute | 4, 96 |
| `30T` / `30min` | 48 | 30-minute | 48, 336 |
| `10S` | 6 | 10-second | 6, 360 |
| `H` / `h` | 24 | Hourly | 24, 168 |
| `D` | 7 | Daily | 7, 365 |
| `W` | 52 | Weekly | 52 |
| `M` / `ME` / `MS` | 12 | Monthly | 12 |
| `Q` / `QE` / `QS` | 4 | Quarterly | 4 |
| `A` / `Y` / `YE` | 1 | Annual | — |

BIC on the SVD spectrum selects the period that best supports a rank-1 structure. A P=1 null model (mean + noise) competes with every periodic candidate under the same BIC, so FLAIR rejects periodicity when the rank-1 fit does not justify the extra Shape parameters.
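As an illustration of the idea (not FLAIR's exact scoring), the sketch below compares rank-1 fits across candidate periods by BIC, with P=1 reduced to a mean-plus-noise null; the parameter count is a crude Shape + Level tally.

```python
import numpy as np

def bic_rank1(y, P):
    """Illustrative BIC for a rank-1 fit at candidate period P.
    P=1 degenerates to a mean + noise null model."""
    n_periods = len(y) // P
    if n_periods < 2:
        return np.inf
    M = y[:n_periods * P].reshape(n_periods, P)
    if P == 1:
        fit, k = np.full_like(M, M.mean()), 1       # null: constant mean
    else:
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        fit = s[0] * np.outer(U[:, 0], Vt[0])       # best rank-1 approximation
        k = P + n_periods                           # crude Shape + Level count
    n = M.size
    rss = ((M - fit) ** 2).sum()
    return n * np.log(rss / n + 1e-12) + k * np.log(n)

rng = np.random.default_rng(1)
y = np.tile(1 + 0.5 * np.sin(2 * np.pi * np.arange(24) / 24), 20)
y = y * (1 + 0.02 * rng.standard_normal(480))
best_P = min([1, 24, 168], key=lambda P: bic_rank1(y, P))
print(best_P)  # 24 for this hourly-like toy series
```

The P=168 candidate also fits nearly rank-1 here, but its extra Shape parameters cost more under BIC than they explain, which is the MDL trade-off the table describes.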

How It Works

  1. MDL Period Selection: BIC on SVD spectrum selects the primary period P from calendar candidates. A P=1 null model (mean + noise) tests whether periodicity exists at all
  2. Reshape the series into a (P × n_complete) matrix. Dynamic DoF guard (n_train >= 2p) ensures the Ridge fit is stable
  3. Shape = frozen global average of within-period proportions from the last K=2 periods
  4. Level = period totals, denoised by Gavish-Donoho 2014 optimal Frobenius shrinkage (reuses the BIC SVD, no extra matrix decomposition)
  5. Shape₂ = secondary periodic pattern in Level, estimated as w × raw + (1−w) × prior, where w = nc₂/(nc₂+cp). The prior is selected by BIC: first harmonic (2 params) when justified, flat (0 params) otherwise. Level is deseasonalized by dividing by Shape₂
  6. Ridge on deseasonalized Level: Box-Cox → prior-centered reparameterization (random-walk prior) → intercept + trend + lags → LOOCV soft-average
  7. Stochastic Level paths: bootstrap of LOOCV residuals, scaled by LWCP leverages per horizon step
  8. Phase noise: scenario-coherent column sampling from the rank-1 residual matrix, with James-Stein per-phase bias shrinkage and horizon-adaptive deflation. Combined with Level paths: sample = Level_path × Shape × (1 + phase_noise)
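The recombination in steps 7–8 can be sketched with toy inputs; every array below is made up purely to illustrate the shapes and the sample = Level_path × Shape × (1 + phase_noise) formula.

```python
import numpy as np

rng = np.random.default_rng(2)
P, H, n_samples = 24, 48, 200                      # period, horizon, paths

shape = np.ones(P) / P                             # flat toy Shape (sums to 1)
# Toy stochastic Level paths: one value per forecast period (H // P steps)
level_paths = 2400 + 50 * rng.standard_normal((n_samples, H // P)).cumsum(axis=1)
phase_noise = 0.02 * rng.standard_normal((n_samples, H))

per_step_level = np.repeat(level_paths, P, axis=1)  # broadcast Level to steps
samples = per_step_level * np.tile(shape, H // P) * (1 + phase_noise)
print(samples.shape)  # (200, 48)
```

Because Level is forecast per period, only H // P = 2 recursive steps are needed for the 48-step horizon, matching the horizon-compression argument above.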

Benchmark Results

Chronos Benchmark II (25 zero-shot datasets)

Evaluated on the Chronos Benchmark II protocol (Ansari et al., 2024). Agg. Relative Score = geometric mean of (method / Seasonal Naive) per dataset. Lower is better.
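Concretely, the aggregate relative score is a geometric mean of per-dataset ratios; the numbers below are hypothetical, for illustration only.

```python
import numpy as np

# Hypothetical per-dataset MASE for a method and for Seasonal Naive
method_mase = np.array([0.62, 0.71, 0.55, 0.80])
snaive_mase = np.array([1.00, 0.95, 0.90, 1.10])

rel = method_mase / snaive_mase                 # per-dataset relative score
agg = np.exp(np.log(rel).mean())                # geometric mean; lower is better
print(round(agg, 3))
```

The geometric mean keeps a single easy dataset from dominating the aggregate, which an arithmetic mean of ratios would not.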

Chronos Benchmark

| Rank | Model | Params | Agg. Rel. MASE | Agg. Rel. WQL | GPU |
|---|---|---|---|---|---|
| 1 | FLAIR | 0 HP | 0.678 | 0.716 | No |
| 2 | Chronos-Bolt-Base | 205M | 0.791 | — | Yes |
| 3 | Moirai-Base | 311M | 0.812 | — | Yes |
| 4 | AutoARIMA | — | 0.865 | 0.742 | No |
| 5 | Chronos-T5-Small | 46M | 0.830 | — | Yes |
| 6 | Seasonal Naive | — | 1.000 | 1.000 | No |

Baseline results from autogluon/fev and amazon-science/chronos-forecasting.

GIFT-Eval (97 configs, 23 datasets)

GIFT-Eval. 7 domains, short/medium/long horizons, 53 non-agentic methods (no test leakage):

GIFT-Eval Benchmark

| Model | Type | relMASE | relCRPS | Params | GPU |
|---|---|---|---|---|---|
| Chronos-Bolt-Base | Foundation | 0.808 | 0.574 | 205M | Yes |
| FLAIR | Statistical | 0.838 | 0.587 | 0 HP | No |
| PatchTST | Deep Learning | 0.849 | 0.587 | ~1M | Yes |
| Chronos-Large | Foundation | 0.870 | 0.647 | 710M | Yes |
| Moirai-Large | Foundation | 0.875 | 0.599 | 311M | Yes |
| TimesFM | Foundation | 0.889 | 0.635 | 200M | Yes |
| Chronos-Small | Foundation | 0.892 | 0.663 | 46M | Yes |
| iTransformer | Deep Learning | 0.893 | 0.620 | ~5M | Yes |
| TFT | Deep Learning | 0.915 | 0.605 | ~10M | Yes |
| N-BEATS | Deep Learning | 0.938 | 0.816 | ~10M | Yes |
| Seasonal Naive | Baseline | 1.000 | 1.000 | 0 | No |
| DLinear | Deep Learning | 1.061 | 0.846 | ~0.1M | Yes |
| AutoARIMA | Statistical | 1.074 | 0.912 | ~5 | No |
| AutoTheta | Statistical | 1.090 | 1.244 | ~5 | No |
| DeepAR | Deep Learning | 1.343 | 0.853 | ~10M | Yes |
| Prophet | Statistical | 1.540 | 1.061 | ~20 | No |

Long-term Forecasting (8 datasets)

Standard benchmark from PatchTST, iTransformer, DLinear, Autoformer. Channel-independent (univariate) evaluation. MSE on StandardScaler-normalized data. Horizons: {96, 192, 336, 720}.

Average MSE across all 4 horizons:

| Dataset | FLAIR | iTransformer | PatchTST | DLinear | Best model needs GPU |
|---|---|---|---|---|---|
| ETTh2 | 0.367 | 0.383 | 0.387 | 0.559 | No |
| ETTm2 | 0.246 | 0.288 | 0.281 | 0.350 | No |
| Weather | 0.258 | 0.258 | 0.259 | 0.265 | No |
| Traffic | 0.426 | 0.428 | 0.481 | 0.625 | No |
| ECL | 0.208 | 0.178 | 0.205 | 0.212 | Yes |
| ETTh1 | 0.579 | 0.454 | 0.469 | 0.456 | Yes |
| ETTm1 | 0.546 | 0.407 | 0.387 | 0.403 | Yes |
| Exchange | 0.522 | 0.360 | 0.366 | 0.354 | Yes |

FLAIR outperforms GPU-trained Transformers on 4 of 8 datasets (ETTh2, ETTm2, Weather, Traffic). Accuracy is higher on datasets with clear periodicity and lower on non-periodic series (Exchange).

Why does FLAIR work?

Three compressions act simultaneously:

  1. Noise reduction: summing P phases into one Level value reduces noise by ~√P
  2. Horizon compression: forecasting Level requires only ⌈H/P⌉ steps instead of H, reducing error accumulation
  3. Shape is frozen: Shape is a structural average, not a learned parameter, so it does not overfit
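Point 1 is easy to verify numerically. The sketch below (toy unit noise) shows that the noise on a per-period mean is ~1/√P of the noise on a single observation, which is the same √P reduction relative to the signal as for a period total.

```python
import numpy as np

rng = np.random.default_rng(3)
P, n_periods = 24, 2000
noise = rng.standard_normal((n_periods, P))   # unit noise on each observation

per_obs_sd = noise.std()                      # ~1.0
per_level_sd = noise.mean(axis=1).std()       # period mean: ~1/sqrt(P)

print(per_obs_sd, per_level_sd, 1 / np.sqrt(P))
```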

API Reference

forecast(y, horizon, freq, n_samples=200, seed=None, X_hist=None, X_future=None)

Generate probabilistic forecasts for a univariate time series.

| Parameter | Type | Description |
|---|---|---|
| `y` | array-like `(n,)` | Historical observations |
| `horizon` | int | Number of steps to forecast |
| `freq` | str | Frequency string (see table) |
| `n_samples` | int | Number of sample paths (default: 200) |
| `seed` | int or None | Random seed for reproducibility (default: None) |
| `X_hist` | array-like `(n, k)`, `(n,)`, or None | Historical exogenous variables aligned with `y`. Must be provided together with `X_future`. |
| `X_future` | array-like `(horizon, k)`, `(horizon,)`, or None | Future exogenous values for the forecast horizon. Must be provided together with `X_hist`. |

Returns: ndarray of shape (n_samples, horizon). Probabilistic forecast sample paths.

```python
from flaircast import forecast
samples = forecast(y, horizon=24, freq='H')
point   = samples.mean(axis=0)
median  = np.median(samples, axis=0)
lo, hi  = np.percentile(samples, [10, 90], axis=0)

# With exogenous variables
samples = forecast(y, horizon=24, freq='H',
                   X_hist=X_hist, X_future=X_future)
```

When `X_hist=None` (the default), the result is bit-identical to a call without the exog arguments.

FLAIR(freq, n_samples=200, seed=None)

Class wrapper. Useful when forecasting multiple series with the same frequency.

| Method | Description |
|---|---|
| `predict(y, horizon, n_samples=None, seed=None, X_hist=None, X_future=None)` | Same as `forecast()`, uses instance defaults |

```python
from flaircast import FLAIR
model = FLAIR(freq='D', n_samples=500)
for series, X_h, X_f in dataset:
    samples = model.predict(series, horizon=7, X_hist=X_h, X_future=X_f)
```

Exogenous variables

FLAIR accepts an arbitrary number of per-step exogenous columns. The columns are z-scored using training-window statistics, aggregated to the per-period (Level) timescale via period mean, and appended directly to the Level Ridge feature matrix. No new hyperparameters, no model selection — the existing LOOCV soft-averaged Ridge inside _ridge_sa handles regularization, so noise covariates are naturally damped without any explicit gating step. "One Ridge" is preserved.

  • Recommended setup: at least a few dozen complete periods of training data (e.g. 60–90 days for daily exog, 60+ days for hourly exog) for stable coefficient estimates.
  • Validated improvements: see validation/ for rolling-origin benchmarks. UCI Bike Sharing daily: MASE −9.4% (9/12 origins win). Jena Climate hourly: MASE −15.5% (19/24 origins win).
  • Graceful degradation: passing pure-noise exog inflates MASE by less than 1% on average, with bounded worst-case behavior.
  • Limitation: exog is coupled to the Level (per-period) factor only. Intra-period variation in X (e.g. hourly temperature within a daily period) is collapsed by the period mean and is not captured.
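A minimal sketch of the preprocessing described above (z-score with training-window stats, then period-mean aggregation to the Level timescale). The helper name and shapes are illustrative, not FLAIR's internal API.

```python
import numpy as np

def exog_to_level(X, mu, sd, P):
    """Z-score with training stats, then per-period mean (Level timescale)."""
    Z = (X - mu) / sd
    n_periods = len(Z) // P
    return Z[:n_periods * P].reshape(n_periods, P, -1).mean(axis=1)

rng = np.random.default_rng(4)
P = 24
X_hist = rng.random((240, 3))                     # 10 periods, 3 covariates
X_future = rng.random((48, 3))                    # 2 future periods

mu, sd = X_hist.mean(axis=0), X_hist.std(axis=0)  # training-window stats only
F_hist = exog_to_level(X_hist, mu, sd, P)         # (10, 3) Level-rate features
F_future = exog_to_level(X_future, mu, sd, P)     # (2, 3)
print(F_hist.shape, F_future.shape)
```

Reusing the training-window `mu` and `sd` for the future block avoids leaking future statistics into the fit; the period mean is also what collapses intra-period covariate variation, per the limitation above.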

End-to-end walkthrough on the UCI Bike Sharing dataset: Open In Colab

Constants

| Name | Description |
|---|---|
| `FREQ_TO_PERIOD` | Maps frequency strings to primary periods |
| `FREQ_TO_PERIODS` | Maps frequency strings to MDL candidate periods |

Design Principles

FLAIR applies the Minimum Description Length principle at every scale:

| Scale | Mechanism | MDL Role |
|---|---|---|
| Period P | BIC on SVD spectrum + P=1 null | Select simplest rank-1 structure or reject periodicity |
| Rank-1 σ₁ | Gavish-Donoho shrinkage | Minimax-optimal denoising of the leading singular value |
| Shape | Frozen K-period average | Structural (not learned), cannot overfit |
| Shape₂ | BIC-gated shrinkage | BIC selects prior: harmonic (2 params) vs flat (0 params) |
| Ridge α | LOOCV soft-average | Select model complexity via cross-validation |
| DoF guard | n_train >= 2p | Ensures LOOCV leverage stability |
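For reference, a sketch of the Frobenius-loss optimal singular value shrinker after Gavish and Donoho, assuming the noise level is known; FLAIR's exact variant (and how it estimates the noise level) may differ.

```python
import numpy as np

def gd_frobenius_shrink(sigma_obs, m, n, noise_sd):
    """Frobenius-optimal singular value shrinkage (after Gavish-Donoho)
    for an m x n matrix (m <= n) with known noise level."""
    beta = m / n
    y = sigma_obs / (np.sqrt(n) * noise_sd)      # normalized singular value
    if y <= 1 + np.sqrt(beta):
        return 0.0                               # below the detection edge
    eta = np.sqrt((y**2 - beta - 1) ** 2 - 4 * beta) / y
    return np.sqrt(n) * noise_sd * eta

# A strong singular value is shrunk only slightly; a weak one is zeroed.
print(gd_frobenius_shrink(50.0, 24, 40, 1.0))   # close to, but below, 50
print(gd_frobenius_shrink(7.0, 24, 40, 1.0))    # 0.0 (inside the noise bulk)
```

Values at or below the edge `1 + sqrt(beta)` are indistinguishable from pure noise, so the shrinker sets them to zero rather than carrying noise energy into the rank-1 reconstruction.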

Limitations

  • Non-periodic series: the Level × Shape decomposition provides no compression benefit when there is no periodicity (e.g., exchange rates). Use a dedicated non-periodic model instead
  • Intermittent demand: series with >30% zeros are poorly served by the multiplicative structure. Croston-type methods are better suited
  • Coarse exogenous resolution: X_hist / X_future are aggregated to the per-period (Level) timescale via period mean. Intra-period variation in covariates (e.g. hourly weather within a daily period) is dropped by design
  • Short series: fewer than 3 complete periods forces P=1 degeneration (plain Ridge on raw series)

Citation

```bibtex
@misc{flair2026,
  title={FLAIR: Factored Level And Interleaved Ridge for Time Series Forecasting},
  year={2026}
}
```

License

Apache License 2.0
