SaulVanCode/protoscience-nasa-experiments

Repository files navigation

ProtoScience: Automated Equation Discovery from Public Data

An open-source pipeline that takes raw time-series data (CSV/JSON) and discovers governing equations using SINDy, FFT, power-law fitting, and change-point detection. Built on PySINDy and standard scientific Python.

This is not a new algorithm — it's a reproducible workflow that automates preprocessing, multi-method routing, and interpretation across many domains. All results are rediscoveries of known laws from public datasets.

Results

| # | Dataset | Source | Discovery | Metric |
|---|---------|--------|-----------|--------|
| E061 | Turbofan engines | NASA C-MAPSS | Degradation laws (Ps30²) | R²=0.38/engine |
| E062 | Exoplanets | NASA Archive | Kepler's Third Law | R²=0.998 |
| E063 | Fireballs | NASA CNEOS | Luminous efficiency | τ=8.2% |
| E064 | Voyager 1 | NASA SPDF* | Heliopause crossing | p=3.3e-20 |
| E065 | Sunspots | SILSO | 11.09-year solar cycle | FFT exact |
| E066 | Gravitational Waves | GWTC | Chirp mass formula | R²=0.998 |
| E067 | Asteroids | JPL SBDB | 5/5 Kirkwood gaps | R²=0.99995 |
| E068 | Mars Weather | MSL REMS | CO₂ pressure cycle | 22% variation |
| E069 | Hubble's Law | NED-D | Universe expanding | H₀=69.7 |
| E070 | JWST Galaxies | UNCOVER DR3 | Size evolution | 1,042 at z>10 |
| E071 | Dark Matter | SPARC | Flat rotation curves | 94% flat, 57% DM |
| E072 | TESS Transits | NASA Archive | Transit depth law | R²=0.85 |
| E074 | Dark Energy | Pantheon+ | Accelerating expansion | Ω_Λ=0.651 |
| E079 | CERN Dimuon | CERN CMS | Z boson + J/ψ | M_Z=90.9 GeV |
| E080 | Arctic Ice | NSIDC | Linear decline | -0.76M km²/decade |
| E082 | Inequality | World Bank | Pareto law | α=1.91 |
| E090 | Selection by Inferability | Simulation | Phase transitions in discoverability | width=0.015 |
| E091 | Riemann Gap Repulsion | mpmath | Level repulsion | r=-0.354 22K zeros |
| | Bitcoin | | No law found | R²=0.00 |

*E064 uses synthetic data generated to match published Voyager 1 characteristics.

Quick Start

git clone https://github.com/SaulVanCode/protoscience-nasa-experiments.git
cd protoscience-nasa-experiments
pip install -r requirements.txt
jupyter notebook notebooks/

Or run any notebook directly in Google Colab (no install needed) — click the Colab badge at the top of each notebook.

LLM Interpreter

The interpreter/ directory contains an LLM-based agent that takes discovered equations and generates plain-language explanations, physical analogies, and testable predictions. See interpreter/README.md for usage.

Limitations

  • No methodological novelty — this is PySINDy + FFT + fitting, well-packaged
  • Only rediscoveries — no new scientific insights, only recovery of known laws
  • Favorable benchmarks — datasets chosen because they have known compact equations
  • No formal comparison against PySINDy, PySR, or AI Feynman baselines
  • No uncertainty quantification on discovered coefficients
  • LLM interpreter may confabulate — its output is narrative, not verified math

Data Sources

All data from official public sources:

How It Works

The pipeline combines multiple discovery methods:

  1. SINDy (Brunton et al., 2016) — sparse regression over candidate function libraries for differential equations
  2. FFT — periodic signal detection
  3. Power-law / curve fitting — algebraic relationships
  4. Change-point detection — phase transitions and regime shifts
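
The sparse-regression idea behind step 1 can be sketched in a few lines of NumPy. This is a hand-rolled version of sequentially thresholded least squares as described by Brunton et al. (2016), not PySINDy's API and not the pipeline's own code. The `stlsq` helper, the threshold of 0.1, the polynomial library, and the synthetic logistic-growth data are all assumptions chosen for illustration.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.1, max_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        if small.all():
            break
        # Refit using only the library terms that survived the threshold.
        xi[~small] = np.linalg.lstsq(theta[:, ~small], dxdt, rcond=None)[0]
    return xi


# Synthetic logistic growth, dx/dt = r*x - (r/K)*x**2 with r = 1.0, K = 2.0,
# sampled from its closed-form solution (illustrative data only).
r, K, x0, dt = 1.0, 2.0, 0.1, 0.01
t = np.arange(0.0, 10.0, dt)
x = K / (1.0 + (K - x0) / x0 * np.exp(-r * t))

# Estimate the derivative numerically, as one would with measured data.
dxdt = np.gradient(x, dt, edge_order=2)

# Candidate function library: [1, x, x^2, x^3].
names = ["1", "x", "x^2", "x^3"]
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

xi = stlsq(theta, dxdt)
terms = [f"{coef:+.3f}*{name}" for coef, name in zip(xi, names) if coef != 0.0]
print("dx/dt =", " ".join(terms))
```

The generating equation has only an `x` and an `x^2` term, so a successful run should leave the constant and cubic coefficients at exactly zero. On noisy measurements the derivative estimate usually needs smoothing first, and the threshold becomes a tuning parameter.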

It does not advance the algorithmic state-of-the-art. The contribution is integration, automation, and reproducibility.
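
As an illustration of the FFT step listed above, the following sketch finds the dominant period of an evenly sampled series with NumPy's real FFT. The `dominant_period` helper and the synthetic 11-year signal are assumptions for demonstration, not the pipeline's implementation or the SILSO sunspot data.

```python
import numpy as np


def dominant_period(signal, dt):
    """Return the period of the strongest non-zero frequency component.

    signal: evenly sampled values; dt: sampling interval (same unit as the result).
    """
    signal = np.asarray(signal, dtype=float)
    # Remove the mean so the zero-frequency bin does not dominate.
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(signal.size, d=dt)
    peak = np.argmax(spectrum[1:]) + 1  # skip the zero-frequency bin
    return 1.0 / freqs[peak]


# Synthetic monthly series: an 11-year sinusoid plus Gaussian noise,
# spanning 220 years (illustrative data only).
rng = np.random.default_rng(0)
dt = 1.0 / 12.0  # one month, in years
t = np.arange(220 * 12) * dt
signal = 80.0 + 50.0 * np.sin(2.0 * np.pi * t / 11.0) + rng.normal(0.0, 10.0, t.size)

period = dominant_period(signal, dt)
print(f"dominant period: {period:.2f} years")
```

The frequency bins are spaced 1/(N·dt) apart, so the periods a raw peak search can report are limited by the record length. Finer estimates generally require interpolating around the peak or fitting a sinusoid.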

Paper

A draft paper is in paper/protoscience_paper.md. Feedback welcome.

License

MIT

About

Can an AI rediscover physics from raw NASA data? 6 reproducible experiments with Jupyter notebooks.
