A (Bayesian) reanalysis of correlations between nuclear tests, UAP (Unidentified Aerial Phenomena) sightings, and photographic plate transients during the Cold War era. This project extends the work originally published in Scientific Reports (DOI 10.1038/s41598-025-21620-3).
The original study found:
- Transients 45% more likely within ±1 day of nuclear tests (p = .008)
- Each additional UAP report correlates with an 8.5% rise in transients on transient days (p = .015)
- Nuclear tests show small but significant links to UAP counts (p = .008)
This reanalysis uses Bayesian hierarchical models to:
- Model actual transient counts (not just binary occurrence)
- Incorporate temporal autocorrelation via latent random walks
- Use improved center-of-plate data with edge artifacts excluded
In addition, UAP sighting counts enter these models as linear covariates after a dampening (concave) transformation.
The latent random walk is an attempt to control for temporal clustering, which would otherwise artificially inflate the significance (that is, deflate the p-values) of the original analysis. It does not control for more complex, nonlocal forms of temporal dependence, such as calendar-related effects, although initial analyses did not find obvious weekday effects.
... as file names appear in scripts. Note that the data is not included in this repository:
- `Transient_CENTER_of_PLATE_FULL_DATASET_DETAILED.xlsx`: "New data", used to augment the data set from the original paper, and used in the main model reported below. It contains transient counts from plate centers only, with corner/edge artifacts excluded for improved reliability. 306 days with transients recorded (non-zero days only).
- `Transient_Nuclear_Analyzed_Dataset_ScientificReports.xlsx`: Data of the original publication, used in some earlier models. Includes all days (with zeros). Contains nuclear test and UAP predictor data. Date range: 1949-11-19 to 1957-04-28.
- `counts-data.parquet`: Processed, merged dataset with ±3 day lagged predictors, generated by `convert_counts.R`. 2718 days total, ~89% zeros.
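The ±3 day lagged predictors could be built along the following lines. This is a minimal base-R sketch, not the code in `convert_counts.R`: the toy data, the column names and the helper `value_at_offset` are hypothetical, and the sign convention (offset -1 meaning "one day before") is inferred from the results table. It also assumes one row per calendar day with no gaps.

```r
# Toy daily series; column names are hypothetical, not those used in convert_counts.R.
days <- data.frame(
  date    = seq(as.Date("1950-01-01"), by = "day", length.out = 7),
  nuclear = c(0, 0, 1, 0, 0, 0, 0),
  uap     = c(2, 0, 0, 5, 1, 0, 3)
)

# Value of x at day t + offset (offset = -1 means "one day before").
# Days outside the observed range become NA.
value_at_offset <- function(x, offset) {
  idx <- seq_along(x) + offset
  idx[idx < 1 | idx > length(x)] <- NA
  x[idx]
}

# One column per predictor and offset: suffix m3..m1 for days before, p0..p3 for same day and after.
for (offset in -3:3) {
  suffix <- if (offset < 0) paste0("m", -offset) else paste0("p", offset)
  days[[paste0("nuclear_", suffix)]] <- value_at_offset(days$nuclear, offset)
  days[[paste0("uap_", suffix)]]     <- value_at_offset(days$uap, offset)
}
```

Rows near the start and end of the series have NA in some lag columns and would need to be dropped or handled before model fitting.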
The primary analysis uses a hurdle model where both the probability of any transients and the count magnitude share a common latent predictor. See `hurdle_model.md` for the full mathematical specification.
Key features:
- Shared latent predictor: Covariate effects (nuclear tests, UAP reports at ±3 day lags) influence both occurrence probability and count magnitude through a single mechanism
- Random walk component: Captures temporal autocorrelation in underlying "activity state" beyond covariate effects
- Zero-truncated negative binomial: Handles the substantial overdispersion in non-zero counts (variance/mean ≈ 57)
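As a rough orientation, the structure described above might be written as follows. This is a sketch under assumptions, not a copy of `hurdle_model.md`: the symbols $x_t$ (lagged covariates), $\beta$, $\alpha_0$, $\alpha_1$ and $\varepsilon_t$ are introduced here for illustration, and placing the random walk $L_t$ inside the shared predictor is a guess. $a$, $\sigma_L$ and $\phi$ denote the coupling, the random walk scale and the dispersion.

$$
\begin{aligned}
L_t &= L_{t-1} + \sigma_L\,\varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, 1) \\
\eta_t &= x_t^\top \beta + L_t \\
\Pr(y_t > 0) &= \operatorname{logit}^{-1}(\alpha_0 + \eta_t) \\
\log \mu_t &= \alpha_1 + a\,\eta_t \\
\Pr(y_t = k \mid y_t > 0) &= \frac{\mathrm{NB}(k \mid \mu_t, \phi)}{1 - \mathrm{NB}(0 \mid \mu_t, \phi)}, \qquad k = 1, 2, \dots
\end{aligned}
$$

Here $\mathrm{NB}(\cdot \mid \mu, \phi)$ is the negative binomial in its mean-dispersion form, with variance $\mu + \mu^2/\phi$ and $\mathrm{NB}(0 \mid \mu, \phi) = \left(\phi / (\mu + \phi)\right)^{\phi}$. In this form, a coupling $a$ near zero means the covariates move the occurrence probability while leaving the expected count on transient days nearly unchanged.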
| File | Description |
|---|---|
| `hurdle_negbin_shared_latent.stan` | Main Stan model with random walk |
| `hurdle_negbin_shared.stan` | Simpler variant without random walk |
| `count_model.R` | Data preparation, model fitting, diagnostics |
| `convert_counts.R` | Creates `counts-data.parquet` with lagged predictors |
| `hurdle_model.md` | Mathematical specification |
| `notes.md` | Data properties and modeling notes |
Model outputs are saved to results/:
| File | Description |
|---|---|
| `parameter_pvalues.md` | Posterior summaries with Bayesian p-values for all lag coefficients |
| `hurdle_latent_coefficients.png` | Coefficient intervals for nuclear/UAP effects at each lag |
| `hurdle_latent_structural.png` | Structural parameters (coupling, dispersion, RW scale) |
| `latent_rw_trajectory.png` | Estimated latent random walk over time |
| `hurdle_latent_trace.png` | MCMC trace plots for convergence diagnostics |
Additional plots for the simpler (no random walk) model and distribution diagnostics are also saved.
From the hurdle model with latent random walk (full table):
| Predictor | Lag | Mean Effect | Bayesian p-value |
|---|---|---|---|
| Nuclear test | -1 day | +0.72 | 0.037 |
| UAP reports | -2 days | +0.59 | <0.001 |
| UAP reports | 0 days | +0.32 | 0.048 |
- Nuclear tests 1 day before show a significant positive association with transient occurrence (p = 0.037). Same-day and +1 day effects are in the expected direction but not significant.
- UAP reports 2 days before show a strong positive association (p < 0.001, posterior probability 100% positive across 2000 draws).
- Same-day UAP shows a marginally significant effect (p = 0.048).
- Coupling (a ≈ 0.1): The coupling between the shared latent predictor and count magnitude is small, with credible interval barely excluding zero. This suggests covariates primarily influence whether transients occur, not how many — the count magnitude on transient days is largely independent of nuclear/UAP activity.
- Random walk scale (σ_L ≈ 0.8): Substantial temporal autocorrelation exists beyond what covariates explain, justifying the latent random walk component.
- Dispersion (φ ≈ 1.5): Confirms overdispersion in counts relative to Poisson.
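The Bayesian p-values and the "posterior probability positive" figure quoted above are summaries of posterior draws. How `parameter_pvalues.md` defines them is not stated here, so the sketch below assumes one common convention: twice the smaller posterior tail mass on either side of zero. The draws are simulated stand-ins rather than model output, and the function name `bayes_p` is hypothetical.

```r
# Simulated stand-in for 2000 posterior draws of one lag coefficient (not real model output).
set.seed(1)
draws <- rnorm(2000, mean = 0.5, sd = 0.25)

# Assumed definition: twice the smaller posterior tail mass on either side of zero.
bayes_p <- function(x) 2 * min(mean(x > 0), mean(x < 0))

p_positive <- mean(draws > 0)   # posterior probability that the effect is positive
p_value    <- bayes_p(draws)    # two-sided Bayesian p-value under the assumed definition
```

Under this definition and with 2000 draws, the smallest nonzero value is 2/2000 = 0.001, so a coefficient whose draws are all positive falls below that resolution.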
The reanalysis broadly supports the original findings. The effects appear to operate on occurrence probability rather than intensity (counts). When transients do occur, their count is driven by other factors (captured partly by the random walk) rather than nuclear/UAP activity. If the effects themselves are real, the irrelevance of counts can, for example, be a natural result of noise on the plates or from the transient detection algorithm.
My informal, general impression after all these analyses is that the significant associations are not easily erased by model details. As expected, adding parameters to a model gradually weakens its power to separate individual coefficients from zero, and significance wanes.
In the particular model reported here, the UAP coefficient comes with a strong p-value, while the nuclear test coefficients are harder to distinguish from zero. This is, however, the opposite of my overall impression across models: nuclear at T-1 appears consistently and often with a decent p-value, while UAP coefficients are somewhat flaky and raise suspicion of complex, unknown confounding.
Earlier analyses modeled binary transient occurrence rather than counts, using the original dataset:
| File | Description |
|---|---|
| `model.R` | Exploratory brms logistic regression |
| `latent1.stan`, `latent.R` | Latent state-space model with impulse responses |
| `lagged_transient_model.stan` | Hierarchical Student-t lag model |
| `lagged_transient_model_rhs.stan` | Regularized horseshoe variant |
| `latent_model.md` | Mathematical specification |
These models use uap-data-small.parquet generated by convert.R.
Exploratory count models before settling on the hurdle approach:
- `hurdle_negbin.stan`: Hurdle model with separate (non-shared) predictors for hurdle and count components
- `poisson_lognormal.stan`: Poisson-lognormal mixture
- `poisson_lognormal_sp.stan`: Poisson-lognormal with spatial/temporal structure
- R packages: tidyverse, arrow, cmdstanr, brms, bayesplot, ggplot2, posterior, knitr
- Stan: Models compiled via cmdstanr; requires CmdStan installed (typically at `~/cmdstan`)
```r
# Generate count data with ±3 day lag window
source("convert_counts.R")

# Run interactively in count_model.R for:
# - Distribution exploration
# - Model fitting (takes several minutes per model)
# - Diagnostics and plots
```

See `count_model.R` for detailed fitting code.
This work is licensed under CC BY 4.0. You are free to share and adapt the material for any purpose, provided you give appropriate attribution.