Gallery
- Landing
- Dashboard Page
- Screeners Page for screened opportunities
- Screened Market Details Page
- Helix PCA (50)
- Helix PCA (10)
- Helix PCA Interpretability Help Modal
- 3D Lattice (10)
- 3D Lattice (50)
- Top Drivers & Flow Summary Stats
- Wallet Profile
- Backtest Page (1)
- Backtest Page (2)
- Cutoff Hour Sweep Bar Graphs
- Backtest: Running a single cutoff hour
- Alerts Page
Inspiration
Prediction market prices are useful signals, but they don't tell you who is driving a move. A 62% probability could be the consensus of thousands of well-calibrated expert traders, or it could be a handful of noise-traders pushing prices around. We wanted a second signal that asks: "Are the historically accurate wallets buying or selling here?"
The core insight is that not all traders are equal. Polymarket has millions of trades from thousands of wallets, but a small subset of wallets with persistent conviction and low churn have shown that they can dominate the market. If you could identify those wallets and weight their beliefs proportionally, you'd have a signal that anticipates market moves rather than just reflecting them.
What It Does
Precognition is a wallet-weighted probability engine for prediction markets. It ingests trade-level data from Polymarket, profiles every wallet's historical accuracy, infers what each wallet currently believes based on their trading sequence, and synthesizes a single SmartCrowd Probability: a manipulation-aware aggregate that represents what calibrated traders actually think will happen.
Key outputs for every market:
- Precognition Probability vs. Market Probability: The deviation between smart-money consensus and the market price
- Confidence score: How many trusted wallets are driving this signal
- Disagreement: How split the smart-money cohort is
- Integrity Risk: Manipulation detection from concentration (Herfindahl index) and trade churn patterns
- Cohort Attribution: Which wallet archetypes (timing specialists, informed accumulators, whale conviction) are behind the divergence
- Explanation: A natural language breakdown of what's happening, what would flip the signal, plus live sentiment analysis from news and online sources
How We Built It
- Ingested Polymarket market/trade data (pagination, deduplication by external ID hash)
- Built wallet profiling: Brier score, calibration, churn, persistence, timing edge, and specialization, all sliced by category and horizon
- Inferred wallet beliefs from trade sequences with recency decay and persistence weighting
- Computed shrinkage-blended trust weights with multi-factor style adjustments
- Built manipulation-aware Precognition snapshots with divergence, disagreement, integrity risk, and cohort attribution
- Implemented backtest sweep pipeline and edge bucket analysis
- Designed two Three.js WebGL 3D visualizations (Helix PCA + 3D Lattice)
- Built screener, market detail, backtest, and alert surfaces in Next.js
- Integrated Gemini and Backboard.io for streaming explanation workflows
- Added live sentiment analysis via Google News + Snowflake Cortex
Polymarket Data Ingestion Pipeline
```
Polymarket API / CSV
        ↓
Ingestion (markets, trades, outcomes)
        ↓
compute_wallet_metrics()
 ├── Brier score, log loss, calibration error
 ├── Churn, persistence, timing edge, ROI
 └── Sliced by (wallet × category × horizon)
        ↓
compute_wallet_weights()
 └── Shrinkage-blended trust scores
        ↓
build_snapshots_for_all_markets()
 └── Weighted belief aggregation → precognition_prob
        ↓
API Endpoints → Frontend
```
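The ingestion step's pagination and deduplication can be sketched as follows. This is a minimal illustration, not the actual pipeline code; the `external_id` field name is an assumption about the trade payload.

```python
import hashlib

def trade_key(trade: dict) -> str:
    """Stable hash of a trade's external ID for deduplication.

    `external_id` is an assumed field name; Polymarket's actual
    trade payload may label this differently.
    """
    return hashlib.sha256(str(trade["external_id"]).encode()).hexdigest()

def ingest_pages(pages):
    """Flatten paginated trade responses, skipping duplicates by ID hash."""
    seen, out = set(), []
    for page in pages:
        for trade in page:
            key = trade_key(key := None) if False else trade_key(trade)
            if key not in seen:
                seen.add(key)
                out.append(trade)
    return out
```

Hashing the ID rather than storing it raw keeps the seen-set uniform regardless of how the upstream API formats identifiers across pages.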
Wallet Belief Inference
For each wallet in each market, we infer a belief from their trade sequence using a recency-decayed, persistence-boosted weighted average:
```
recency = exp(-ln(2) * age_hours / half_life)   # default half_life = 48h
direction = +1 if (YES BUY) or (NO SELL) else -1
yes_px = trade_price if side == YES else (1 - trade_price)
vote = (yes_px + 1) / 2 if direction > 0 else yes_px / 2
size_weight = sqrt(trade_size)
persistence_boost = 1.0 + 0.12 * min(consecutive_same_direction - 1, 4)
weight = size_weight * recency * persistence_boost
belief = sum(weight * vote) / sum(weight)
```
Consecutive trades in the same direction get up to a 48% boost, which rewards conviction. The vote mapping converts a market price into an implied belief that the wallet holds about the YES outcome, and the confidence is:
```
signal_mass = total_weight / (total_weight + 6.0)
sample_support = 0.3 + 0.7 * min(1, trade_count / 6)
persistence = 1 - churn
confidence = signal_mass * sample_support * (0.5 + 0.5 * persistence)
```
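The belief-inference formulas above can be collected into one runnable sketch. Field names and the trade-tuple layout are illustrative, not the project's actual schema:

```python
import math

def infer_belief(trades, now_hours, half_life=48.0):
    """Recency-decayed, persistence-boosted belief for one wallet in one market.

    trades: list of (t_hours, side, action, price, size), oldest first.
    A sketch of the writeup's formulas; the tuple layout is an assumption.
    """
    total_w, weighted_vote, streak, prev_dir = 0.0, 0.0, 0, 0
    for t_hours, side, action, price, size in trades:
        age = now_hours - t_hours
        recency = math.exp(-math.log(2) * age / half_life)
        # YES BUY and NO SELL both express belief in the YES outcome
        direction = 1 if (side, action) in {("YES", "BUY"), ("NO", "SELL")} else -1
        yes_px = price if side == "YES" else 1 - price
        vote = (yes_px + 1) / 2 if direction > 0 else yes_px / 2
        streak = streak + 1 if direction == prev_dir else 1
        prev_dir = direction
        boost = 1.0 + 0.12 * min(streak - 1, 4)   # up to +48% for conviction
        w = math.sqrt(size) * recency * boost
        total_w += w
        weighted_vote += w * vote
    return weighted_vote / total_w if total_w else 0.5
```

Note how a lone fresh YES BUY at price 0.6 maps to a belief of 0.8: the vote formula pushes directional trades away from the raw price toward the direction the wallet is expressing.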
Wallet Trust Weights with Shrinkage
We calculated per-wallet trust weights using a hierarchical Bayesian shrinkage approach:
```
local_edge = 0.25 - brier_score                    # baseline random = 0.25
shrinkage = support / (support + prior_strength)   # prior_strength: 22 global, 12 local
blended_edge = shrinkage * local_edge + (1 - shrinkage) * global_edge
base_weight = clamp(1.0 + blended_edge / 0.25, 0.20, 3.00)
```
We then apply penalties and boosts:
```
final_weight = clamp(
    base_weight
    * max(0.45, 1.0 - 0.60 * churn)        # penalize flip-floppers
    * (0.85 + 0.30 * persistence)          # reward conviction
    * max(0.50, 1.0 - calibration_error)   # penalize overconfidence
    * (0.90 + 0.20 * specialization),      # reward domain focus
    0.10, 4.00
)
```
Wallets with poor track records or erratic behavior are therefore pushed down toward the 0.10 floor.
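The shrinkage and adjustment formulas translate directly into a small function. This is a sketch of the formulas above; the parameter names are assumptions:

```python
def trust_weight(brier, support, global_edge, churn, persistence,
                 calibration_error, specialization, prior_strength=22.0):
    """Shrinkage-blended trust weight with multi-factor style adjustments.

    Sketch of the writeup's formulas; argument names are illustrative.
    """
    clamp = lambda x, lo, hi: max(lo, min(hi, x))
    local_edge = 0.25 - brier                         # 0.25 = Brier of a coin flip
    shrinkage = support / (support + prior_strength)  # low support -> lean on prior
    blended = shrinkage * local_edge + (1 - shrinkage) * global_edge
    base = clamp(1.0 + blended / 0.25, 0.20, 3.00)
    return clamp(
        base
        * max(0.45, 1.0 - 0.60 * churn)               # penalize flip-floppers
        * (0.85 + 0.30 * persistence)                 # reward conviction
        * max(0.50, 1.0 - calibration_error)          # penalize overconfidence
        * (0.90 + 0.20 * specialization),             # reward domain focus
        0.10, 4.00,
    )
```

A wallet with no track record (zero support, coin-flip Brier) lands at a neutral weight near 1.0, while a high-churn, miscalibrated wallet bottoms out at the 0.10 floor regardless of its raw edge.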
Precognition Aggregation
The final signal is a confidence-weighted average of trusted beliefs:
$$P_{\text{precognition}}(t) = \frac{\sum_i w_i \cdot c_i \cdot b_i(t)}{\sum_i w_i \cdot c_i}$$
where $b_i$ is the inferred wallet belief, $c_i$ is that wallet's signal confidence, and $w_i$ is their trust weight (adjusted for uncertainty and anti-noise factors).
```
disagreement = sqrt(sum(shares_i * (belief_i - precognition)**2))  # RMS deviation
herfindahl = sum(shares_i**2)                                      # concentration
integrity_risk = 0.55 * herfindahl + 0.45 * avg_churn              # manipulation proxy
confidence = signal_support * agreement * wallet_breadth * (1 - 0.70 * integrity_risk)
```
This ensures that a market dominated by one large wallet with noisy trading gets a low-confidence, high-integrity-risk signal, even if the probability estimate itself looks clean.
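The aggregation and its diagnostics fit in a few lines. This sketch covers the weighted average, disagreement, Herfindahl concentration, and integrity risk; the `signal_support` and `wallet_breadth` factors from the confidence formula are omitted for brevity:

```python
import math

def aggregate(beliefs, confidences, weights, churns):
    """Confidence-weighted belief aggregation with manipulation diagnostics.

    Sketch of the writeup's aggregation step; inputs are parallel lists,
    one entry per trusted wallet.
    """
    mass = [w * c for w, c in zip(weights, confidences)]
    total = sum(mass)
    shares = [m / total for m in mass]                 # each wallet's signal share
    precog = sum(s * b for s, b in zip(shares, beliefs))
    disagreement = math.sqrt(sum(s * (b - precog) ** 2
                                 for s, b in zip(shares, beliefs)))
    herfindahl = sum(s * s for s in shares)            # 1.0 = one wallet dominates
    avg_churn = sum(churns) / len(churns)
    integrity_risk = 0.55 * herfindahl + 0.45 * avg_churn
    return precog, disagreement, herfindahl, integrity_risk
```

With two equally trusted wallets at beliefs 0.6 and 0.8, the aggregate is 0.7, but the Herfindahl index of 0.5 already flags meaningful concentration, which is exactly the case the integrity gate is built for.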
Cohort Attribution
Wallets are classified:
| Cohort | Criteria |
|---|---|
| Timing Specialist | timing_edge > 0.22, churn < 0.45, ≥5 markets |
| Informed Accumulator | persistence > 0.72, specialization > 0.45 |
| Whale Conviction | avg_trade_size > $200, churn < 0.5 |
| Category Specialist | brier < 0.20, specialization > 0.40 |
| Maker/Arb | churn < 0.35, … |
| Noise Churner | churn > 0.65 |
For each cohort, we compare their weight share and net contribution to the divergence. This powers the human-readable explanation: "Top sports cohort is reducing YES exposure while the market hasn't repriced yet."
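The table maps naturally to a first-match-wins classifier. The thresholds below come from the table; the rule ordering, the fallback label, and the Maker/Arb condition (truncated in the table) are assumptions:

```python
def classify_cohort(m):
    """Map a wallet-metrics dict to a cohort label (sketch).

    Thresholds mirror the table above; ordering and the Maker/Arb
    rule are illustrative assumptions.
    """
    if m["timing_edge"] > 0.22 and m["churn"] < 0.45 and m["markets"] >= 5:
        return "Timing Specialist"
    if m["persistence"] > 0.72 and m["specialization"] > 0.45:
        return "Informed Accumulator"
    if m["avg_trade_size"] > 200 and m["churn"] < 0.5:
        return "Whale Conviction"
    if m["brier"] < 0.20 and m["specialization"] > 0.40:
        return "Category Specialist"
    if m["churn"] > 0.65:
        return "Noise Churner"
    if m["churn"] < 0.35:
        return "Maker/Arb"
    return "Unclassified"
```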
3D Visualizations
Probability Helix (PCA-based)
The helix encodes 14 market features (price levels, momentum, divergence velocity, confidence signals, net wallet flow, and more) into a PCA embedding that becomes the spine of a 3D double helix.
- Red strand: Market consensus price over time
- Blue strand: Precognition probability over time
- Green "supports": Divergence (brightness = confidence)
- Spine shape: Straight spine = stable regime; curved spine = regime shift (momentum change, divergence acceleration, or confidence spike)
Sharp changes in the direction of the spine let us directly visualize when a regime shift occurs.
The helix is best for the "why": detecting regime transitions and market structure evolution.
3D Lattice (Direct Magnitude)
The lattice maps time directly to the X-axis, probability to the Y-axis, and divergence magnitude to the Z-axis (depth), so the two probability strands literally float toward or away from the viewer as divergence widens or narrows.
- X-axis: Time (left → present)
- Y-axis: Probability (up = high)
- Z-axis: Divergence magnitude (forward = precognition exceeds market)
The lattice is best for measuring how wide the divergence is and for spotting its peaks.
Backtesting
We validate the Precognition signal's predictive power using resolved markets and a sliding-window cutoff:
For each resolved market, we calculate what our signal would have predicted at cutoff_hours before resolution, then compare Brier scores:
```
Brier(Precognition) vs. Brier(Market)
improvement_% = (market_brier - precognition_brier) / market_brier * 100
```
The sweep mode iterates from a cutoff of 1 hour upward until no resolved markets remain, showing where in the prediction window Precognition adds the most edge. The edge bucket analysis splits predictions by confidence tier and shows win rates per tier (the ideal pattern is a win rate that rises monotonically with confidence).
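The per-cutoff comparison reduces to a mean-Brier improvement. A minimal sketch, assuming equal-weighted resolved binary markets and parallel lists of probabilities and 0/1 outcomes:

```python
def brier(prob, outcome):
    """Brier score for one binary market (0 = perfect, 0.25 = coin flip)."""
    return (prob - outcome) ** 2

def improvement_pct(market_probs, precog_probs, outcomes):
    """Mean Brier improvement of Precognition over the market price at one
    cutoff horizon. Sketch only; the real pipeline sweeps this over many
    cutoff hours and buckets results by confidence tier.
    """
    n = len(outcomes)
    mb = sum(brier(p, o) for p, o in zip(market_probs, outcomes)) / n
    pb = sum(brier(p, o) for p, o in zip(precog_probs, outcomes)) / n
    return (mb - pb) / mb * 100
```

For example, if the market priced a resolved-YES event at 0.5 while Precognition said 0.8, the Brier scores are 0.25 vs. 0.04, an 84% improvement for that market.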
Features
- Market Screener: Ranked table of all markets by divergence with confidence, category, and watchlist
- Market Detail: Full signal breakdown of probabilities, confidence, integrity risk, top wallet drivers, flow summary, and divergence explainer
- Probability Helix: PCA-based 3D double helix showing regime evolution
- 3D Lattice: Direct magnitude visualization of divergence over time
- Backtesting: Brier score sweep across cutoff horizons with edge bucket analysis
- Alerts: Configurable triggers for regime shifts, signal crossovers, and integrity risk spikes
- AI Explanations: Gemini + Backboard.io for plain-language divergence breakdowns
- News Sentiment: Live Google News headlines analyzed via Snowflake Cortex
Challenges We Ran Into
- API pagination and stale closed-market sampling: Polymarket's API returns markets in various states; needed careful filtering to avoid polluting the training data
- Sparse/noisy wallet histories: Most wallets have very few trades; shrinkage blending prevents overfitting to small samples
- Signal quality on low-liquidity markets: Few wallets means high Herfindahl concentration; integrity risk gating prevents false confidence
- Frontend/backend schema mismatches and chart readability at small % moves
- Computing reliable historical time series: Only one snapshot per pipeline run initially; needed backfill logic to reconstruct 50+ historical points per market from trade timestamps
What We Learned
- Forecasting improves when trader quality is modeled conditionally (not globally)
- Shrinkage and confidence gating matter more than raw ROI
- Product clarity (explanations + confidence + integrity signals) is as important as raw accuracy
Next Steps
- Improve visualization and use methods like Markov chain modelling to visualize and zero in on leading wallets
- Agent optimizations for diversifying investments
- Adding vector retrieval and semantic caching for sentiment analysis to make agents more accurate and robust
- Creating a better data pipeline to ingest data more efficiently
Built With
- backboard.io
- fastapi
- gemini
- pca
- polymarket
- python
- rechart.js
- snowflake
- sqlite
- tailwind
- tanstack-query
- three.js
- typescript