Bayesian Reanalysis of Photographic Plate Transients

A (Bayesian) reanalysis of correlations between nuclear tests, UAP (Unidentified Aerial Phenomena) sightings, and photographic plate transients during the Cold War era. This project extends the work originally published in Scientific Reports (DOI 10.1038/s41598-025-21620-3).

Overview

The original study found:

Transients are 45% more likely within ±1 day of nuclear tests (p = .008)
Each additional UAP report correlates with an 8.5% rise in transients on transient days (p = .015)
Nuclear tests show small but significant links to UAP counts (p = .008)

This reanalysis uses Bayesian hierarchical models to:

Model actual transient counts (not just binary occurrence)
Incorporate temporal autocorrelation via latent random walks
Use improved center-of-plate data with edge artifacts excluded

In addition, the counts of UAP sightings appear in these models as linear covariates after the dampening (concave) transformation $\log(n(\textrm{UAP}) + 1)$.
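As a small illustration of why this transformation dampens large counts, here is a minimal Python sketch. It is not part of the repository's R code, and the helper name `dampen_uap` is hypothetical.

```python
import math

def dampen_uap(n_reports: int) -> float:
    """Return log(n + 1) for a non-negative daily UAP report count."""
    if n_reports < 0:
        raise ValueError("report counts must be non-negative")
    return math.log1p(n_reports)

# Toy counts, not real data.
counts = [0, 1, 2, 3, 4, 5]
transformed = [dampen_uap(n) for n in counts]

# Each additional report adds less than the previous one (concavity).
increments = [b - a for a, b in zip(transformed, transformed[1:])]
```

A day with zero reports maps to exactly 0, so the covariate leaves the linear predictor unchanged on such days.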

The latent random walk is an effort to control for temporal clustering, which would artificially inflate the significance (that is, shrink the p-values) of the original analysis. This mechanism does not control for more complex, nonlocal forms of temporal dependence, such as calendar-related effects, although initial analyses did not find obvious weekday effects.
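To make the idea of a latent random walk concrete, the following Python sketch simulates a Gaussian random walk and maps it to a daily occurrence probability. It is an illustration under stated assumptions only: the function names, the intercept of -2.0, and the logit link are hypothetical, and the actual specification is the one in hurdle_model.md.

```python
import math
import random

def simulate_random_walk(n_days, sigma, seed=0):
    """Cumulative sum of independent Normal(0, sigma) steps, starting from 0."""
    rng = random.Random(seed)
    level = 0.0
    path = []
    for _ in range(n_days):
        level += rng.gauss(0.0, sigma)
        path.append(level)
    return path

def inv_logit(x):
    """Numerically stable inverse logit."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

# Hypothetical values: a step scale of 0.8 and an intercept of -2.0.
walk = simulate_random_walk(n_days=30, sigma=0.8)
probs = [inv_logit(-2.0 + level) for level in walk]
```

Because each day's level is the previous level plus a small step, neighbouring days get similar probabilities. This is how such a component can absorb temporal clustering that the covariates do not explain.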

Data

File names are listed as they appear in the scripts. Note that the data is not included in this repository:

Transient_CENTER_of_PLATE_FULL_DATASET_DETAILED.xlsx: "New data", used to augment the data set from the original paper and used in the main model reported below. What is new here is that transient counts come from plate centers only, with corner/edge artifacts excluded for improved reliability. 306 days with transients recorded (non-zero days only).
Transient_Nuclear_Analyzed_Dataset_ScientificReports.xlsx: Data of the original publication, used in some earlier models. Includes all days (with zeros). Contains nuclear test and UAP predictor data. Date range: 1949-11-19 to 1957-04-28.
counts-data.parquet: Processed, merged dataset with ±3 day lagged predictors, generated by convert_counts.R. 2718 days total, ~89% zeros.
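The ±3 day lagged predictors can be pictured with the following Python sketch. It is not a port of convert_counts.R: the function name `add_lags`, the toy dates, and the choice to fill days outside the series with 0 are all assumptions made for illustration. The sign convention follows the results tables, where lag -1 means the predictor value one day before the transient day.

```python
from datetime import date, timedelta

def add_lags(series, max_lag=3, fill=0):
    """Map each day to {lag: predictor value on day + lag} for lag in -max_lag..max_lag."""
    rows = {}
    for day in sorted(series):
        rows[day] = {
            lag: series.get(day + timedelta(days=lag), fill)
            for lag in range(-max_lag, max_lag + 1)
        }
    return rows

# Toy series of seven days with one made-up event day (not real data).
start = date(1951, 1, 1)
nuclear = {start + timedelta(days=i): 0 for i in range(7)}
nuclear[date(1951, 1, 4)] = 1

lagged = add_lags(nuclear)
```

In this toy series the event on 4 January shows up at lag 0 for that day, at lag -1 for 5 January, and at lag +1 for 3 January.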

Main Model: Hurdle Negative Binomial with Shared Latent Structure

The primary analysis uses a hurdle model where both the probability of any transients and the count magnitude share a common latent predictor. See hurdle_model.md for the full mathematical specification.

Key features:

Shared latent predictor: Covariate effects (nuclear tests, UAP reports at ±3 day lags) influence both occurrence probability and count magnitude through a single mechanism
Random walk component: Captures temporal autocorrelation in underlying "activity state" beyond covariate effects
Zero-truncated negative binomial: Handles the substantial overdispersion in non-zero counts (variance/mean ≈ 57)
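For readers unfamiliar with hurdle models, the following Python sketch shows a generic hurdle likelihood with a zero-truncated negative binomial count part. It assumes the mean/dispersion parameterization of Stan's `neg_binomial_2` (variance μ + μ²/φ). The function names and the example values are hypothetical, and the Stan files in this repository remain the authoritative implementation.

```python
import math

def neg_binomial_2_logpmf(y, mu, phi):
    """Log-pmf of a negative binomial with mean mu and variance mu + mu^2 / phi."""
    return (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (mu + phi))
        + y * math.log(mu / (mu + phi))
    )

def hurdle_logpmf(y, p_nonzero, mu, phi):
    """Hurdle log-pmf: zeros have probability 1 - p_nonzero, positive counts are zero-truncated NB."""
    if y == 0:
        return math.log1p(-p_nonzero)
    log_p0 = neg_binomial_2_logpmf(0, mu, phi)
    # Renormalise the NB over y >= 1 by dividing by 1 - P(Y = 0).
    log_norm = math.log1p(-math.exp(log_p0))
    return math.log(p_nonzero) + neg_binomial_2_logpmf(y, mu, phi) - log_norm

# Illustrative values only: about 11% non-zero days, a hypothetical mean of 20, phi = 1.5.
example = [hurdle_logpmf(y, 0.11, 20.0, 1.5) for y in (0, 1, 10)]
```

In the shared-latent model, `p_nonzero` and `mu` would both be functions of the same linear predictor. The sketch takes them as plain inputs because the link functions are defined in hurdle_model.md, not here.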

Files

| File | Description |
| --- | --- |
| `hurdle_negbin_shared_latent.stan` | Main Stan model with random walk |
| `hurdle_negbin_shared.stan` | Simpler variant without random walk |
| `count_model.R` | Data preparation, model fitting, diagnostics |
| `convert_counts.R` | Creates `counts-data.parquet` with lagged predictors |
| `hurdle_model.md` | Mathematical specification |
| `notes.md` | Data properties and modeling notes |

Results

Model outputs are saved to results/:

| File | Description |
| --- | --- |
| `parameter_pvalues.md` | Posterior summaries with Bayesian p-values for all lag coefficients |
| `hurdle_latent_coefficients.png` | Coefficient intervals for nuclear/UAP effects at each lag |
| `hurdle_latent_structural.png` | Structural parameters (coupling, dispersion, RW scale) |
| `latent_rw_trajectory.png` | Estimated latent random walk over time |
| `hurdle_latent_trace.png` | MCMC trace plots for convergence diagnostics |

Additional plots for the simpler (no random walk) model and distribution diagnostics are also saved.

Key Findings

Significant Lagged Effects

From the hurdle model with latent random walk (full table):

| Predictor | Lag | Mean Effect | Bayesian p-value |
| --- | --- | --- | --- |
| Nuclear test | -1 day | +0.72 | 0.037 |
| UAP reports | -2 days | +0.59 | <0.001 |
| UAP reports | 0 days | +0.32 | 0.048 |

Nuclear tests 1 day before show a significant positive association with transient occurrence (p = 0.037). Same-day and +1 day effects are in the expected direction but not significant.
UAP reports 2 days before show a strong positive association (p < 0.001, posterior probability 100% positive across 2000 draws).
Same-day UAP shows a marginally significant effect (p = 0.048).
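One common way to turn posterior draws into a "Bayesian p-value" is twice the smaller of the posterior fractions on either side of zero. The Python sketch below uses that definition as an assumption. The exact definition behind `parameter_pvalues.md` is not stated here, and the function name and the toy draws are hypothetical.

```python
def bayesian_p_value(draws):
    """Two-sided tail summary: 2 * min(fraction of draws > 0, fraction of draws < 0)."""
    n = len(draws)
    frac_pos = sum(d > 0 for d in draws) / n
    frac_neg = sum(d < 0 for d in draws) / n
    return 2 * min(frac_pos, frac_neg)

# Toy draws, not model output.
all_positive = [0.5] * 2000            # every draw above zero
one_negative = [0.5] * 1999 + [-0.1]   # a single draw below zero

p_all = bayesian_p_value(all_positive)  # 0.0
p_one = bayesian_p_value(one_negative)  # 0.001
```

Under this assumed definition, 2000 draws cannot resolve values between 0 and 0.001, so a coefficient whose draws are all positive would be reported as p < 0.001.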

Structural Parameters

Coupling (a ≈ 0.1): The coupling between the shared latent predictor and count magnitude is small, with credible interval barely excluding zero. This suggests covariates primarily influence whether transients occur, not how many — the count magnitude on transient days is largely independent of nuclear/UAP activity.
Random walk scale (σ_L ≈ 0.8): Substantial temporal autocorrelation exists beyond what covariates explain, justifying the latent random walk component.
Dispersion (φ ≈ 1.5): Confirms overdispersion in counts relative to Poisson.
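As a hedged aside on what φ measures: assuming the model uses the mean/dispersion parameterization of Stan's `neg_binomial_2` (hurdle_model.md remains the authoritative specification), the untruncated count distribution with mean $\mu$ satisfies

$$\operatorname{Var}(Y) = \mu + \frac{\mu^2}{\phi}, \qquad \frac{\operatorname{Var}(Y)}{\mathbb{E}[Y]} = 1 + \frac{\mu}{\phi}.$$

A Poisson model corresponds to the limit $\phi \to \infty$, where the ratio equals 1. With $\phi \approx 1.5$ the ratio exceeds 1 for every $\mu > 0$ and grows linearly in $\mu$, which is the sense in which a small φ indicates overdispersion. Zero truncation changes the moments of the observed non-zero counts, so this ratio is not directly comparable to the empirical variance/mean of about 57.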

Interpretation, and hunches

The reanalysis broadly supports the original findings. The effects appear to operate on occurrence probability rather than intensity (counts). When transients do occur, their count is driven by other factors (captured partly by the random walk) rather than nuclear/UAP activity. If the effects themselves are real, the irrelevance of counts can, for example, be a natural result of noise on the plates or from the transient detection algorithm.

My informal, general feeling after all these analyses is that the p-values are not easily erased by model details. Obviously, when one adds parameters to a model, its power to separate individual coefficients from zero gradually weakens, and significances wane.

In the particular model reported here, the UAP coefficient comes with a strong p-value, while the nuclear test coefficients are harder to distinguish from zero. This is, however, opposite to my overall impression. Nuclear at T-1 appears consistently and often with a decent p-value, while UAP coefficients are somewhat flaky and raise suspicion of complex, unknown confounding.

Earlier Work

Logistic Models (`logistic_models/`)

Earlier analyses modeled binary transient occurrence rather than counts, using the original dataset:

| File | Description |
| --- | --- |
| `model.R` | Exploratory brms logistic regression |
| `latent1.stan`, `latent.R` | Latent state-space model with impulse responses |
| `lagged_transient_model.stan` | Hierarchical Student-t lag model |
| `lagged_transient_model_rhs.stan` | Regularized horseshoe variant |
| `latent_model.md` | Mathematical specification |

These models use uap-data-small.parquet generated by convert.R.

Earlier Count Model Efforts (`earlier_count_model_efforts/`)

Exploratory count models before settling on the hurdle approach:

hurdle_negbin.stan: Hurdle model with separate (non-shared) predictors for hurdle and count components
poisson_lognormal.stan: Poisson-lognormal mixture
poisson_lognormal_sp.stan: Poisson-lognormal with spatial/temporal structure

Technical Requirements

R packages: tidyverse, arrow, cmdstanr, brms, bayesplot, ggplot2, posterior, knitr
Stan: Models compiled via cmdstanr; requires CmdStan installed (typically at ~/cmdstan)

Running the Analysis

```r
# Generate count data with ±3 day lag window
source("convert_counts.R")

# Run interactively in count_model.R for:
# - Distribution exploration
# - Model fitting (takes several minutes per model)
# - Diagnostics and plots
```

See count_model.R for detailed fitting code.

License

This work is licensed under CC BY 4.0. You are free to share and adapt the material for any purpose, provided you give appropriate attribution.
