Gallery
- Landing
- Dashboard Page
- Screeners Page for screened opportunities
- Screened Market Details Page
- Helix PCA (50)
- Helix PCA (10)
- Helix PCA Interpretability Help Modal
- 3D Lattice (10)
- 3D Lattice (50)
- Top Drivers & Flow Summary Stats
- Wallet Profile
- Backtest Page (1)
- Backtest Page (2)
- Cutoff Hour Sweep Bar Graphs
- Backtest: Running a single cutoff hour
- Alerts Page
Inspiration
Prediction market prices are useful signals, but they don't tell you who is driving a move. A 62% probability could be the consensus of thousands of well-calibrated expert traders, or it could be a handful of noise-traders pushing prices around. We wanted a second signal that asks: "Are the historically accurate wallets buying or selling here?"
The core insight is that not all traders are equal. Polymarket has millions of trades from thousands of wallets, but a small subset of wallets with persistent conviction and low churn have shown that they can dominate the market. If you could identify those wallets and weight their beliefs proportionally, you'd have a signal that anticipates market moves rather than just reflecting them.
What It Does
Precognition is a wallet-weighted probability engine for prediction markets. It ingests trade-level data from Polymarket, profiles every wallet's historical accuracy, infers what each wallet currently believes based on their trading sequence, and synthesizes a single SmartCrowd Probability: a manipulation-aware aggregate that represents what calibrated traders actually think will happen.
Key outputs for every market:
- Precognition Probability vs. Market Probability: The deviation between smart-money consensus and the market price
- Confidence score: How many trusted wallets are driving this signal
- Disagreement: How split the smart-money cohort is
- Integrity Risk: Manipulation detection from concentration (Herfindahl index) and trade churn patterns
- Cohort Attribution: Which wallet archetypes (timing specialists, informed accumulators, whale conviction) are behind the divergence
- Explanation: A natural language breakdown of what's happening, what would flip the signal, plus live sentiment analysis from news and online sources
How We Built It
- Ingested Polymarket market/trade data (pagination, deduplication by external ID hash)
- Built wallet profiling: Brier score, calibration, churn, persistence, timing edge, and specialization, all sliced by category and horizon
- Inferred wallet beliefs from trade sequences with recency decay and persistence weighting
- Computed shrinkage-blended trust weights with multi-factor style adjustments
- Built manipulation-aware Precognition snapshots with divergence, disagreement, integrity risk, and cohort attribution
- Implemented backtest sweep pipeline and edge bucket analysis
- Designed two Three.js WebGL 3D visualizations (Helix PCA + 3D Lattice)
- Built screener, market detail, backtest, and alert surfaces in Next.js
- Integrated Gemini and Backboard.io for streaming explanation workflows
- Added live sentiment analysis via Google News + Snowflake Cortex
Polymarket Data Ingestion Pipeline
```
Polymarket API / CSV
        ↓
Ingestion (markets, trades, outcomes)
        ↓
compute_wallet_metrics()
 ├── Brier score, log loss, calibration error
 ├── Churn, persistence, timing edge, ROI
 └── Sliced by (wallet × category × horizon)
        ↓
compute_wallet_weights()
 └── Shrinkage-blended trust scores
        ↓
build_snapshots_for_all_markets()
 └── Weighted belief aggregation → precognition_prob
        ↓
API Endpoints → Frontend
```
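The ingestion step's pagination and deduplication can be sketched as follows. This is a minimal illustration, not the actual pipeline code; the `external_id` field name is an assumption about the trade payload.

```python
import hashlib

def trade_key(trade: dict) -> str:
    """Stable hash of a trade's external ID for deduplication.

    `external_id` is an assumed field name; Polymarket's actual
    trade payload may label this differently.
    """
    return hashlib.sha256(str(trade["external_id"]).encode()).hexdigest()

def ingest_pages(pages):
    """Flatten paginated trade responses, skipping duplicates by ID hash."""
    seen, out = set(), []
    for page in pages:
        for trade in page:
            key = trade_key(key := None) if False else trade_key(trade)
            if key not in seen:
                seen.add(key)
                out.append(trade)
    return out
```

Hashing the ID rather than storing it raw keeps the seen-set uniform regardless of how the upstream API formats identifiers across pages.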
Wallet Belief Inference
For each wallet in each market, we infer a belief from their trade sequence using a recency-decayed, persistence-boosted weighted average:
```
recency = exp(-ln(2) * age_hours / half_life)   # default half_life = 48h
direction = +1 if (YES BUY) or (NO SELL) else -1
yes_px = trade_price if side == YES else (1 - trade_price)
vote = (yes_px + 1) / 2 if direction > 0 else yes_px / 2
size_weight = sqrt(trade_size)
persistence_boost = 1.0 + 0.12 * min(consecutive_same_direction - 1, 4)
weight = size_weight * recency * persistence_boost
belief = sum(weight * vote) / sum(weight)
```
Consecutive trades in the same direction get up to a 48% boost, which rewards conviction. The vote mapping converts a market price into an implied belief that the wallet holds about the YES outcome, and the confidence is:
```
signal_mass = total_weight / (total_weight + 6.0)
sample_support = 0.3 + 0.7 * min(1, trade_count / 6)
persistence = 1 - churn
confidence = signal_mass * sample_support * (0.5 + 0.5 * persistence)
```
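The belief-inference formulas above can be collected into one runnable sketch. Field names and the trade-tuple layout are illustrative, not the project's actual schema:

```python
import math

def infer_belief(trades, now_hours, half_life=48.0):
    """Recency-decayed, persistence-boosted belief for one wallet in one market.

    trades: list of (t_hours, side, action, price, size), oldest first.
    A sketch of the writeup's formulas; the tuple layout is an assumption.
    """
    total_w, weighted_vote, streak, prev_dir = 0.0, 0.0, 0, 0
    for t_hours, side, action, price, size in trades:
        age = now_hours - t_hours
        recency = math.exp(-math.log(2) * age / half_life)
        # YES BUY and NO SELL both express belief in the YES outcome
        direction = 1 if (side, action) in {("YES", "BUY"), ("NO", "SELL")} else -1
        yes_px = price if side == "YES" else 1 - price
        vote = (yes_px + 1) / 2 if direction > 0 else yes_px / 2
        streak = streak + 1 if direction == prev_dir else 1
        prev_dir = direction
        boost = 1.0 + 0.12 * min(streak - 1, 4)   # up to +48% for conviction
        w = math.sqrt(size) * recency * boost
        total_w += w
        weighted_vote += w * vote
    return weighted_vote / total_w if total_w else 0.5
```

Note how a lone fresh YES BUY at price 0.6 maps to a belief of 0.8: the vote formula pushes directional trades away from the raw price toward the direction the wallet is expressing.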
Wallet Trust Weights with Shrinkage
We calculated per-wallet trust weights using a hierarchical Bayesian shrinkage approach:
```
local_edge = 0.25 - brier_score                    # baseline random = 0.25
shrinkage = support / (support + prior_strength)   # prior_strength: 22 global, 12 local
blended_edge = shrinkage * local_edge + (1 - shrinkage) * global_edge
base_weight = clamp(1.0 + blended_edge / 0.25, 0.20, 3.00)
```
We then apply penalties and boosts:
```
final_weight = clamp(
    base_weight
    * max(0.45, 1.0 - 0.60 * churn)        # penalize flip-floppers
    * (0.85 + 0.30 * persistence)          # reward conviction
    * max(0.50, 1.0 - calibration_error)   # penalize overconfidence
    * (0.90 + 0.20 * specialization),      # reward domain focus
    0.10, 4.00
)
```
Wallets with poor track records or erratic behavior are therefore pushed down toward the 0.10 floor.
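The shrinkage and adjustment formulas translate directly into a small function. This is a sketch of the formulas above; the parameter names are assumptions:

```python
def trust_weight(brier, support, global_edge, churn, persistence,
                 calibration_error, specialization, prior_strength=22.0):
    """Shrinkage-blended trust weight with multi-factor style adjustments.

    Sketch of the writeup's formulas; argument names are illustrative.
    """
    clamp = lambda x, lo, hi: max(lo, min(hi, x))
    local_edge = 0.25 - brier                         # 0.25 = Brier of a coin flip
    shrinkage = support / (support + prior_strength)  # low support -> lean on prior
    blended = shrinkage * local_edge + (1 - shrinkage) * global_edge
    base = clamp(1.0 + blended / 0.25, 0.20, 3.00)
    return clamp(
        base
        * max(0.45, 1.0 - 0.60 * churn)               # penalize flip-floppers
        * (0.85 + 0.30 * persistence)                 # reward conviction
        * max(0.50, 1.0 - calibration_error)          # penalize overconfidence
        * (0.90 + 0.20 * specialization),             # reward domain focus
        0.10, 4.00,
    )
```

A wallet with no track record (zero support, coin-flip Brier) lands at a neutral weight near 1.0, while a high-churn, miscalibrated wallet bottoms out at the 0.10 floor regardless of its raw edge.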
Precognition Aggregation
The final signal is a confidence-weighted average of trusted beliefs:
$$P_{\text{precognition}}(t) = \frac{\sum_i w_i \cdot c_i \cdot b_i(t)}{\sum_i w_i \cdot c_i}$$
where $b_i$ is the inferred wallet belief, $c_i$ is that wallet's signal confidence, and $w_i$ is their trust weight (adjusted for uncertainty and anti-noise factors).
```
disagreement = sqrt(sum(shares_i * (belief_i - precognition)**2))  # RMS deviation
herfindahl = sum(shares_i**2)                                      # concentration
integrity_risk = 0.55 * herfindahl + 0.45 * avg_churn              # manipulation proxy
confidence = signal_support * agreement * wallet_breadth * (1 - 0.70 * integrity_risk)
```
This ensures that a market dominated by one large wallet with noisy trading gets a low-confidence, high-integrity-risk signal, even if the probability estimate itself looks clean.
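The aggregation and its diagnostics fit in a few lines. This sketch covers the weighted average, disagreement, Herfindahl concentration, and integrity risk; the `signal_support` and `wallet_breadth` factors from the confidence formula are omitted for brevity:

```python
import math

def aggregate(beliefs, confidences, weights, churns):
    """Confidence-weighted belief aggregation with manipulation diagnostics.

    Sketch of the writeup's aggregation step; inputs are parallel lists,
    one entry per trusted wallet.
    """
    mass = [w * c for w, c in zip(weights, confidences)]
    total = sum(mass)
    shares = [m / total for m in mass]                 # each wallet's signal share
    precog = sum(s * b for s, b in zip(shares, beliefs))
    disagreement = math.sqrt(sum(s * (b - precog) ** 2
                                 for s, b in zip(shares, beliefs)))
    herfindahl = sum(s * s for s in shares)            # 1.0 = one wallet dominates
    avg_churn = sum(churns) / len(churns)
    integrity_risk = 0.55 * herfindahl + 0.45 * avg_churn
    return precog, disagreement, herfindahl, integrity_risk
```

With two equally trusted wallets at beliefs 0.6 and 0.8, the aggregate is 0.7, but the Herfindahl index of 0.5 already flags meaningful concentration, which is exactly the case the integrity gate is built for.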
Cohort Attribution
Wallets are classified:
| Cohort | Criteria |
|---|---|
| Timing Specialist | timing_edge > 0.22, churn < 0.45, ≥5 markets |
| Informed Accumulator | persistence > 0.72, specialization > 0.45 |
| Whale Conviction | avg_trade_size > $200, churn < 0.5 |
| Category Specialist | brier < 0.20, specialization > 0.40 |
| Maker/Arb | churn < 0.35, … |
| Noise Churner | churn > 0.65 |
For each cohort, we compare their weight share and net contribution to the divergence. This powers the human-readable explanation: "Top sports cohort is reducing YES exposure while the market hasn't repriced yet."
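The table maps naturally to a first-match-wins classifier. The thresholds below come from the table; the rule ordering, the fallback label, and the Maker/Arb condition (truncated in the table) are assumptions:

```python
def classify_cohort(m):
    """Map a wallet-metrics dict to a cohort label (sketch).

    Thresholds mirror the table above; ordering and the Maker/Arb
    rule are illustrative assumptions.
    """
    if m["timing_edge"] > 0.22 and m["churn"] < 0.45 and m["markets"] >= 5:
        return "Timing Specialist"
    if m["persistence"] > 0.72 and m["specialization"] > 0.45:
        return "Informed Accumulator"
    if m["avg_trade_size"] > 200 and m["churn"] < 0.5:
        return "Whale Conviction"
    if m["brier"] < 0.20 and m["specialization"] > 0.40:
        return "Category Specialist"
    if m["churn"] > 0.65:
        return "Noise Churner"
    if m["churn"] < 0.35:
        return "Maker/Arb"
    return "Unclassified"
```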
3D Visualizations
Probability Helix (PCA-based)
The helix encodes 14 market features (price levels, momentum, divergence velocity, confidence signals, net wallet flow, and more) into a PCA embedding that becomes the spine of a 3D double helix.
- Red strand: Market consensus price over time
- Blue strand: Precognition probability over time
- Green "supports": Divergence (brightness = confidence)
- Spine shape: Straight spine = stable regime; curved spine = regime shift (momentum change, divergence acceleration, or confidence spike)
Sharp changes in the direction of the spine let us directly visualize when a regime shift occurs.
The helix is best for the "why": detecting regime transitions and market structure evolution.
3D Lattice (Direct Magnitude)
The lattice maps time directly to the X-axis, probability to the Y-axis, and divergence magnitude to the Z-axis (depth), so the two probability strands literally float toward or away from the viewer as divergence widens or narrows.
- X-axis: Time (left → present)
- Y-axis: Probability (up = high)
- Z-axis: Divergence magnitude (forward = precognition exceeds market)
The lattice is best for measuring how wide the divergence is and for spotting its peaks.
Backtesting
We validate the Precognition signal's predictive power using resolved markets and a sliding-window cutoff:
For each resolved market, we calculate what our signal would have predicted at cutoff_hours before resolution, then compare Brier scores:
```
Brier(Precognition) vs. Brier(Market)
improvement_% = (market_brier - precognition_brier) / market_brier * 100
```
The sweep mode iterates from a cutoff of 1 hour upward until no resolved markets remain, showing where in the prediction window Precognition adds the most edge. The edge bucket analysis splits predictions by confidence tier and shows win rates per tier (the ideal pattern is a win rate that rises monotonically with confidence).
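The per-cutoff comparison reduces to a mean-Brier improvement. A minimal sketch, assuming equal-weighted resolved binary markets and parallel lists of probabilities and 0/1 outcomes:

```python
def brier(prob, outcome):
    """Brier score for one binary market (0 = perfect, 0.25 = coin flip)."""
    return (prob - outcome) ** 2

def improvement_pct(market_probs, precog_probs, outcomes):
    """Mean Brier improvement of Precognition over the market price at one
    cutoff horizon. Sketch only; the real pipeline sweeps this over many
    cutoff hours and buckets results by confidence tier.
    """
    n = len(outcomes)
    mb = sum(brier(p, o) for p, o in zip(market_probs, outcomes)) / n
    pb = sum(brier(p, o) for p, o in zip(precog_probs, outcomes)) / n
    return (mb - pb) / mb * 100
```

For example, if the market priced a resolved-YES event at 0.5 while Precognition said 0.8, the Brier scores are 0.25 vs. 0.04, an 84% improvement for that market.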
Features
- Market Screener: Ranked table of all markets by divergence with confidence, category, and watchlist
- Market Detail: Full signal breakdown of probabilities, confidence, integrity risk, top wallet drivers, flow summary, and divergence explainer
- Probability Helix: PCA-based 3D double helix showing regime evolution
- 3D Lattice: Direct magnitude visualization of divergence over time
- Backtesting: Brier score sweep across cutoff horizons with edge bucket analysis
- Alerts: Configurable triggers for regime shifts, signal crossovers, and integrity risk spikes
- AI Explanations: Gemini + Backboard.io for plain-language divergence breakdowns
- News Sentiment: Live Google News headlines analyzed via Snowflake Cortex
Challenges We Ran Into
- API pagination and stale closed-market sampling: Polymarket's API returns markets in various states; needed careful filtering to avoid polluting the training data
- Sparse/noisy wallet histories: Most wallets have very few trades; shrinkage blending prevents overfitting to small samples
- Signal quality on low-liquidity markets: Few wallets means high Herfindahl concentration; integrity risk gating prevents false confidence
- Frontend/backend schema mismatches and chart readability at small % moves
- Computing reliable historical time series: Only one snapshot per pipeline run initially; needed backfill logic to reconstruct 50+ historical points per market from trade timestamps
What We Learned
- Forecasting improves when trader quality is modeled conditionally (not globally)
- Shrinkage and confidence gating matter more than raw ROI
- Product clarity (explanations + confidence + integrity signals) is as important as raw accuracy
Next Steps
- Improve visualization and use methods like Markov chain modelling to visualize and zero in on leading wallets
- Agent optimizations for diversifying investments
- Adding vector retrieval and semantic caching for sentiment analysis to make agents more accurate and robust
- Creating a better data pipeline to ingest data more efficiently
Built With
- backboard.io
- fastapi
- gemini
- pca
- polymarket
- python
- rechart.js
- snowflake
- sqlite
- tailwind
- tanstack-query
- three.js
- typescript