An open-source pipeline that takes raw time-series data (CSV/JSON) and discovers governing equations using SINDy, FFT, power-law fitting, and change-point detection. Built on PySINDy and standard scientific Python.
This is not a new algorithm — it's a reproducible workflow that automates preprocessing, multi-method routing, and interpretation across many domains. All results are rediscoveries of known laws from public datasets.
| # | Dataset | Source | Discovery | Metric |
|---|---|---|---|---|
| E061 | Turbofan engines | NASA C-MAPSS | Degradation laws (Ps30²) | R²=0.38/engine |
| E062 | Exoplanets | NASA Archive | Kepler's Third Law | R²=0.998 |
| E063 | Fireballs | NASA CNEOS | Luminous efficiency | τ=8.2% |
| E064 | Voyager 1 | NASA SPDF* | Heliopause crossing | p=3.3e-20 |
| E065 | Sunspots | SILSO | 11.09-year solar cycle | FFT exact |
| E066 | Gravitational Waves | GWTC | Chirp mass formula | R²=0.998 |
| E067 | Asteroids | JPL SBDB | 5/5 Kirkwood gaps | R²=0.99995 |
| E068 | Mars Weather | MSL REMS | CO₂ pressure cycle | 22% variation |
| E069 | Hubble's Law | NED-D | Universe expanding | H₀=69.7 km/s/Mpc |
| E070 | JWST Galaxies | UNCOVER DR3 | Size evolution | 1,042 at z>10 |
| E071 | Dark Matter | SPARC | Flat rotation curves | 94% flat, 57% DM |
| E072 | TESS Transits | NASA Archive | Transit depth law | R²=0.85 |
| E074 | Dark Energy | Pantheon+ | Accelerating expansion | Ω_Λ=0.651 |
| E079 | CERN Dimuon | CERN CMS | Z boson + J/ψ | M_Z=90.9 GeV |
| E080 | Arctic Ice | NSIDC | Linear decline | -0.76M km²/decade |
| E082 | Inequality | World Bank | Pareto law | α=1.91 |
| E090 | Selection by Inferability | Simulation | Phase transitions in discoverability | width=0.015 |
| E091 | Riemann Gap Repulsion | mpmath | Level repulsion r=-0.354 | 22K zeros |
| — | Bitcoin | — | No law found | R²=0.00 |
*E064 uses synthetic data generated to match published Voyager 1 characteristics, not the raw mission measurements.
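
To illustrate how a power-law entry such as the Kepler's Third Law row can be recovered, here is a minimal sketch of a log-log least-squares fit. It is not taken from the notebooks: the helper name `fit_power_law` is hypothetical, and the input is six well-known solar-system planets rather than the NASA Exoplanet Archive sample used in E062.

```python
import numpy as np

# Approximate orbital elements for six solar-system planets
# (semi-major axis in AU, orbital period in years). Illustrative input only;
# this is NOT the dataset used by the E062 notebook.
semi_major_axis_au = np.array([0.387, 0.723, 1.000, 1.524, 5.203, 9.537])
period_years = np.array([0.241, 0.615, 1.000, 1.881, 11.862, 29.457])


def fit_power_law(x, y):
    """Fit y = c * x**k by linear least squares in log-log space.

    Hypothetical helper for illustration. Returns (c, k, r2), where r2 is
    computed on the log-transformed values.
    """
    log_x, log_y = np.log(x), np.log(y)
    k, log_c = np.polyfit(log_x, log_y, 1)  # slope = exponent, intercept = log(c)
    residuals = log_y - (k * log_x + log_c)
    r2 = 1.0 - np.sum(residuals**2) / np.sum((log_y - log_y.mean()) ** 2)
    return np.exp(log_c), k, r2


c, k, r2 = fit_power_law(semi_major_axis_au, period_years)
print(f"T = {c:.3f} * a^{k:.3f}  (R^2 = {r2:.5f})")
```

With these units, Kepler's Third Law predicts an exponent of 3/2 and a prefactor of 1, so the fitted `k` and `c` can be checked directly against the known law.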
git clone https://github.com/SaulVanCode/protoscience-nasa-experiments.git
cd protoscience-nasa-experiments
pip install -r requirements.txt
jupyter notebook notebooks/

Or run any notebook directly in Google Colab (no install needed): click the Colab badge at the top of each notebook.
The interpreter/ directory contains an LLM-based agent that takes discovered equations and generates plain-language explanations, physical analogies, and testable predictions. See interpreter/README.md for usage.
- No methodological novelty — this is PySINDy + FFT + fitting, well-packaged
- Only rediscoveries — no new scientific insights, only recovery of known laws
- Favorable benchmarks — datasets chosen because they have known compact equations
- No formal comparison against PySINDy, PySR, or AI Feynman baselines
- No uncertainty quantification on discovered coefficients
- LLM interpreter may confabulate — its output is narrative, not verified math
Apart from the generated and simulated inputs noted in the table (E064, E090, E091), all data come from official public sources:
- NASA Exoplanet Archive | NASA CNEOS | SILSO | GWOSC | JPL SBDB | MSL REMS | NED-D | JWST UNCOVER | SPARC | Pantheon+ | CERN Open Data | NSIDC | World Bank
The pipeline combines multiple discovery methods:
- SINDy (Brunton et al., 2016) — sparse regression over candidate function libraries for differential equations
- FFT — periodic signal detection
- Power-law / curve fitting — algebraic relationships
- Change-point detection — phase transitions and regime shifts
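
The sparse-regression idea behind SINDy can be sketched in plain NumPy. The example below is an assumption-laden illustration, not the pipeline's code and not the PySINDy API: it builds a small polynomial candidate library by hand and applies sequentially thresholded least squares (the optimizer described by Brunton et al., 2016) to a noise-free damped oscillator. The function names `build_library` and `stlsq`, the threshold value, and the test system are all hypothetical choices.

```python
import numpy as np

# Synthetic trajectory of a damped linear oscillator (assumed test system):
#   dx/dt = -0.1*x + 2*y
#   dy/dt = -2*x   - 0.1*y
t = np.linspace(0.0, 10.0, 1001)
x = 2.0 * np.exp(-0.1 * t) * np.cos(2.0 * t)
y = -2.0 * np.exp(-0.1 * t) * np.sin(2.0 * t)
X = np.column_stack([x, y])

# Estimate derivatives with finite differences.
dXdt = np.gradient(X, t, axis=0, edge_order=2)


def build_library(X):
    """Candidate functions [1, x, y, x^2, x*y, y^2] evaluated on the data."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])


def stlsq(theta, dxdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for j in range(dxdt.shape[1]):
            keep = ~small[:, j]
            if keep.any():
                # Refit only the surviving terms for state variable j.
                xi[keep, j] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, j], rcond=None
                )[0]
    return xi


names = ["1", "x", "y", "x^2", "x*y", "y^2"]
xi = stlsq(build_library(X), dXdt)
for j, lhs in enumerate(["dx/dt", "dy/dt"]):
    terms = [f"{xi[i, j]:+.3f}*{names[i]}" for i in range(len(names)) if xi[i, j] != 0.0]
    print(lhs, "=", " ".join(terms))
```

The threshold controls sparsity: terms whose coefficients fall below it are dropped, so it has to sit below the smallest true coefficient (0.1 here) and above the numerical noise.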
It does not advance the algorithmic state-of-the-art. The contribution is integration, automation, and reproducibility.
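
The FFT and change-point steps listed above are standard signal-processing operations. As a rough sketch of what each does (on synthetic data, with hypothetical helper names `dominant_period` and `single_mean_shift`; the pipeline's own detectors may differ), the following finds the strongest period in a noisy sinusoid and locates a single shift in the mean of a series.

```python
import numpy as np

rng = np.random.default_rng(0)


def dominant_period(y, dt):
    """Return the period of the strongest non-zero frequency in the FFT."""
    y = y - np.mean(y)
    spectrum = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), d=dt)
    peak = np.argmax(spectrum[1:]) + 1  # skip the zero-frequency bin
    return 1.0 / freqs[peak]


def single_mean_shift(y):
    """Return the index k that best splits y into two constant-mean segments.

    The split minimizes the summed squared error of y[:k] and y[k:] around
    their own means, evaluated for every k using cumulative sums.
    """
    n = len(y)
    csum, csq = np.cumsum(y), np.cumsum(y**2)
    k = np.arange(1, n)
    left_sse = csq[k - 1] - csum[k - 1] ** 2 / k
    right_sum = csum[-1] - csum[k - 1]
    right_sse = (csq[-1] - csq[k - 1]) - right_sum**2 / (n - k)
    return int(k[np.argmin(left_sse + right_sse)])


# Synthetic periodic signal: 11-year cycle, monthly samples over 264 years.
dt = 1.0 / 12.0
t = np.arange(264 * 12) * dt
signal = np.sin(2.0 * np.pi * t / 11.0) + 0.5 * rng.standard_normal(t.size)
period = dominant_period(signal, dt)

# Synthetic regime shift: the mean jumps from 0 to 3 at sample 200.
series = np.concatenate([
    rng.standard_normal(200),
    3.0 + rng.standard_normal(100),
])
change_index = single_mean_shift(series)

print(f"dominant period: {period:.2f} years")
print(f"estimated change point: sample {change_index}")
```

Frequency resolution is limited by the record length, so a real series whose cycle does not fit the window an integer number of times will show a peak spread over neighboring bins.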
A draft paper is in paper/protoscience_paper.md. Feedback welcome.
License: MIT