Periodically fetches all active (unexpired) events and markets from the Polymarket Gamma API, stores them in a SQLite database, and computes implied probabilities for every tradable market.

- Paginated fetching of all active, non-closed events via `GET /events?active=true&closed=false`
- Upserts into SQLite using `INSERT ... ON CONFLICT`: no duplicate rows, existing records are updated in place
- Stores key fields as typed columns for fast queries, plus a `raw_data` TEXT column preserving the full API payload as JSON
- Computes implied probability of "Yes" for every tradable market from the outcome prices returned by the Gamma API
- Builds a time-series `market_probabilities` table, with one snapshot per market per scan cycle
- Backfills any new markets that appear between scan cycles before computing probabilities
- Configurable scan interval via environment variable
- Single-scan mode with the `--once` flag
- Graceful shutdown on `SIGINT`/`SIGTERM`
- Auto-reconnects to the database on connection failures

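The paginated fetch can be pictured as a loop that requests fixed-size pages until a short page comes back. The following is a minimal sketch, not the scanner's actual `fetcher.py`: the `limit` and `offset` query parameters, the page size, and the injected `fetch_page` callable are all assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, List

Page = List[dict]


def iter_active_events(
    fetch_page: Callable[[Dict[str, str]], Page],
    page_size: int = 100,
) -> Iterator[dict]:
    """Yield every event by walking pages until one comes back short.

    `fetch_page` is a hypothetical callable that performs one
    GET /events request with the given query parameters and returns
    the decoded JSON list.
    """
    offset = 0
    while True:
        params = {
            "active": "true",
            "closed": "false",
            "limit": str(page_size),  # assumed pagination parameter
            "offset": str(offset),    # assumed pagination parameter
        }
        page = fetch_page(params)
        yield from page
        if len(page) < page_size:
            break  # a short (or empty) page means nothing is left
        offset += page_size
```

Passing the HTTP call in as `fetch_page` keeps the paging logic testable offline. In a real run it would wrap a request against the Gamma API.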

    pip install -r event_scanner/requirements.txt

No setup needed: the SQLite database file (`polymarket.db`) is created automatically on first run.

Copy the example file and adjust the values as needed:

    cp event_scanner/.env.example event_scanner/.env

| Variable | Default | Description |
|---|---|---|
| `DB_PATH` | `polymarket.db` | Path to SQLite database file |
| `SCAN_INTERVAL_SECONDS` | `300` | Time between scans (seconds) |

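As an illustration of how these two variables might be read, here is a minimal sketch. The `ScannerConfig` and `load_config` names are hypothetical and are not taken from `config.py`. How the `.env` file is loaded into the process environment is not covered here.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ScannerConfig:
    """Hypothetical container for the two documented settings."""
    db_path: str
    scan_interval_seconds: int


def load_config(env: Optional[Mapping[str, str]] = None) -> ScannerConfig:
    """Read settings from the environment, using the documented defaults."""
    env = os.environ if env is None else env
    return ScannerConfig(
        db_path=env.get("DB_PATH", "polymarket.db"),
        scan_interval_seconds=int(env.get("SCAN_INTERVAL_SECONDS", "300")),
    )
```

Accepting an optional mapping makes it easy to exercise the defaults without touching the real environment.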

    cd polymarketer

    # Periodic scanning (default: every 5 minutes)
    python -m event_scanner

    # Single scan and exit
    python -m event_scanner --once

Each cycle performs three steps:

- **Fetch & upsert**: pulls all active, non-closed events and their markets from the Gamma API and upserts them into the `events` and `markets` tables.
- **Backfill**: checks for any new markets not yet in the database and inserts them.
- **Compute probabilities**: for every tradable market (`accepting_orders = true`), extracts the Yes token price from `outcome_prices` as the implied probability and appends a snapshot to `market_probabilities`.

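The upsert in the first step can be sketched with the standard `sqlite3` module. This is an illustrative reduction rather than the code in `db.py`: the `upsert_event` and `demo` helpers are hypothetical, and the table is cut down to four columns. `ON CONFLICT ... DO UPDATE` requires SQLite 3.24 or newer.

```python
import json
import sqlite3
from datetime import datetime, timezone


def upsert_event(conn: sqlite3.Connection, event: dict) -> None:
    """Insert an event, or update the existing row with the same id."""
    conn.execute(
        """
        INSERT INTO events (id, title, raw_data, synced_at)
        VALUES (:id, :title, :raw_data, :synced_at)
        ON CONFLICT(id) DO UPDATE SET
            title = excluded.title,
            raw_data = excluded.raw_data,
            synced_at = excluded.synced_at
        """,
        {
            "id": str(event["id"]),
            "title": event.get("title"),
            "raw_data": json.dumps(event),  # keep the full payload as JSON
            "synced_at": datetime.now(timezone.utc).isoformat(),
        },
    )


def demo() -> tuple:
    """Upsert the same id twice against a reduced, in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events ("
        "id TEXT PRIMARY KEY, title TEXT, raw_data TEXT, synced_at TEXT)"
    )
    upsert_event(conn, {"id": "1", "title": "Old title"})
    upsert_event(conn, {"id": "1", "title": "New title"})
    return conn.execute("SELECT COUNT(*), MAX(title) FROM events").fetchone()
```

Calling `demo()` leaves a single row whose title reflects the second call, which is the "no duplicate rows, updated in place" behaviour described above.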
The `events` table:

| Column | Type | Description |
|---|---|---|
| `id` | TEXT PK | Polymarket event ID |
| `slug` | TEXT | URL slug |
| `title` | TEXT | Event title |
| `description` | TEXT | Full description |
| `start_date` | TEXT | Event start (ISO 8601) |
| `end_date` | TEXT | Event end / expiry (ISO 8601) |
| `active` | INTEGER | Currently active (0/1) |
| `closed` | INTEGER | Resolved / closed (0/1) |
| `volume` | REAL | Total volume traded |
| `liquidity` | REAL | Current liquidity |
| `tags` | TEXT | Category tags (JSON) |
| `raw_data` | TEXT | Complete API response (JSON) |
| `synced_at` | TEXT | Last sync timestamp (ISO 8601) |

The `markets` table:

| Column | Type | Description |
|---|---|---|
| `id` | TEXT PK | Polymarket market ID |
| `event_id` | TEXT FK | Parent event |
| `question` | TEXT | Market question |
| `outcomes` | TEXT | e.g. `["Yes", "No"]` (JSON) |
| `outcome_prices` | TEXT | Current prices (JSON) |
| `volume` | REAL | Volume traded |
| `best_bid` | REAL | Best bid price |
| `best_ask` | REAL | Best ask price |
| `last_trade_price` | REAL | Last trade price |
| `clob_token_ids` | TEXT | CLOB token identifiers (JSON) |
| `raw_data` | TEXT | Complete API response (JSON) |
| `synced_at` | TEXT | Last sync timestamp (ISO 8601) |

(Some columns omitted for brevity; see `db.py` for the full schema.)

The `market_probabilities` table is a time series: one row is appended per market per scan cycle.

| Column | Type | Description |
|---|---|---|
| `id` | INTEGER PK | Auto-increment |
| `market_id` | TEXT FK | References `markets(id)` |
| `question` | TEXT | Market question |
| `implied_probability` | REAL | Yes token price (0.0 to 1.0) |
| `best_bid` | REAL | Best bid at snapshot time |
| `best_ask` | REAL | Best ask at snapshot time |
| `last_trade_price` | REAL | Last trade at snapshot time |
| `spread` | REAL | Bid-ask spread |
| `volume` | REAL | Total volume traded |
| `accepting_orders` | INTEGER | Currently tradable (0/1) |
| `recorded_at` | TEXT | Snapshot timestamp (ISO 8601) |

Indexed on `market_id` and `recorded_at` for efficient time-series queries.

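A typical time-series query filters by `market_id` and a `recorded_at` lower bound, which is the access pattern the index is meant to serve. The sketch below uses a reduced schema and a composite index. Both are assumptions for illustration, as are the `DEMO_SCHEMA` and `probability_history` names. The real definitions are in `db.py`.

```python
import sqlite3
from typing import List, Tuple

# Reduced, illustrative schema: only the columns this query touches.
# The composite index is an assumption about how the indexing is done.
DEMO_SCHEMA = """
CREATE TABLE market_probabilities (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    market_id TEXT,
    implied_probability REAL,
    recorded_at TEXT
);
CREATE INDEX idx_mp_market_time
    ON market_probabilities (market_id, recorded_at);
"""


def probability_history(
    conn: sqlite3.Connection, market_id: str, since: str
) -> List[Tuple[str, float]]:
    """Return (recorded_at, implied_probability) rows, oldest first.

    Comparing ISO 8601 strings works here only because all timestamps
    are assumed to share one format and one UTC offset.
    """
    cursor = conn.execute(
        """
        SELECT recorded_at, implied_probability
        FROM market_probabilities
        WHERE market_id = ? AND recorded_at >= ?
        ORDER BY recorded_at
        """,
        (market_id, since),
    )
    return cursor.fetchall()
```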
Every Polymarket market is a binary question (e.g. "Will X happen?") with two outcome tokens: Yes and No. These tokens trade between $0.00 and $1.00. When the market resolves, the winning token pays out $1.00 and the losing token pays $0.00.
Because of this payout structure, the token price directly reflects the market's collective belief about the probability of the outcome.
The implied probability equals the Yes token price:

    P(Yes) = outcome_prices[0]

A Yes token trading at $0.65 means the market implies a 65% probability the event will occur. The No token would trade near $0.35, since the two must approximately sum to $1.00:

    P(Yes) + P(No) ≈ 1.0

In practice the sum may slightly exceed 1.0 due to the bid-ask spread (the overround). The Yes token price is stored in the `implied_probability` column of the `market_probabilities` table as a value between 0.0 and 1.0.

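Extracting the Yes price from the stored `outcome_prices` value might look like the sketch below. It assumes the column holds a JSON array of price strings with Yes at index 0, as in the formula above. The `implied_probability` function name and the choice to return `None` for unusable input are illustrative, not taken from `probability.py`.

```python
import json
from typing import Optional


def implied_probability(outcome_prices: Optional[str]) -> Optional[float]:
    """Return the Yes price from a JSON array such as '["0.72", "0.28"]'.

    Returns None when the value is missing, malformed, or outside [0, 1].
    """
    try:
        prices = json.loads(outcome_prices)
        if not isinstance(prices, list) or not prices:
            return None
        yes_price = float(prices[0])  # index 0 is assumed to be "Yes"
    except (TypeError, ValueError):
        return None
    if not 0.0 <= yes_price <= 1.0:
        return None
    return yes_price
```

For example, `implied_probability('["0.72", "0.28"]')` returns `0.72`, while an empty array or a missing value returns `None`.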
The scanner also records several price fields from the order book. These are useful for assessing market liquidity and execution quality.

**Best Bid**: the highest price a buyer is currently willing to pay.

    best_bid = max price on the bid side of the order book

**Best Ask**: the lowest price a seller is currently willing to accept.

    best_ask = min price on the ask side of the order book

**Midpoint**: the theoretical fair value, halfway between bid and ask.

    midpoint = (best_bid + best_ask) / 2

**Spread**: the gap between best ask and best bid, representing the cost of immediacy.

    spread = best_ask - best_bid

A tight spread (e.g. 0.01) indicates a liquid market. A wide spread (e.g. 0.10) means fewer participants and higher trading cost.

**Last Trade Price**: the price at which the most recent trade executed. It may differ from the midpoint if the order book has moved since.

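The midpoint and spread formulas translate directly into code. This small sketch uses a hypothetical helper name and assumes that a missing bid or ask should yield no result, since the source does not say how one-sided books are handled.

```python
from typing import Optional, Tuple


def midpoint_and_spread(
    best_bid: Optional[float], best_ask: Optional[float]
) -> Optional[Tuple[float, float]]:
    """Return (midpoint, spread), or None if either side is missing."""
    if best_bid is None or best_ask is None:
        return None
    midpoint = (best_bid + best_ask) / 2
    spread = best_ask - best_bid
    return midpoint, spread
```

With a bid of 0.71 and an ask of 0.73, this gives a midpoint of about 0.72 and a spread of about 0.02, up to floating-point rounding.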
Consider a market: "Will the Fed cut rates in March?"

| Field | Value | Meaning |
|---|---|---|
| `outcome_prices` | `["0.72", "0.28"]` | Yes=$0.72, No=$0.28 |
| `implied_probability` | `0.72` | Market implies 72% chance of a rate cut |
| `best_bid` | `0.71` | Best buyer willing to pay $0.71 |
| `best_ask` | `0.73` | Best seller asking $0.73 |
| `spread` | `0.02` | Tight spread = liquid market |
| `last_trade_price` | `0.72` | Last trade at $0.72 |
| `volume` | `5000000.0` | $5M total traded |

The `market_probabilities` table stores one snapshot per market per scan cycle, enabling time-series analysis:

- Rising `implied_probability`: the market is becoming more confident the event will occur.
- Falling `implied_probability`: confidence is declining.
- Widening `spread`: liquidity is drying up; traders are less certain or less active.
- Spike in `volume`: a surge of trading activity, often triggered by new information.

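One simple way to turn a series of snapshots into the rising and falling signals above is to compare the first and last implied probabilities. This is only a sketch: the `classify_trend` helper and its 0.01 threshold are arbitrary assumptions and are not part of the scanner.

```python
from typing import List, Tuple

Snapshot = Tuple[str, float]  # (recorded_at, implied_probability)


def classify_trend(history: List[Snapshot], threshold: float = 0.01) -> str:
    """Label a time-ordered history as 'rising', 'falling', or 'flat'.

    The label is based on the net change between the first and last
    snapshot. `threshold` is an illustrative cutoff, not a tuned value.
    """
    if len(history) < 2:
        return "flat"
    change = history[-1][1] - history[0][1]
    if change > threshold:
        return "rising"
    if change < -threshold:
        return "falling"
    return "flat"
```

A net-change rule ignores what happens between the endpoints. A rolling window or a fitted slope would be more robust, at the cost of more code.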

    event_scanner/
        __init__.py
        __main__.py        # python -m entry point
        config.py          # Environment-based configuration
        db.py              # Schema creation and upsert logic
        fetcher.py         # Gamma API client with pagination
        probability.py     # Implied probability computation from outcome prices
        run.py             # Main loop with periodic scheduling
        requirements.txt
        .env.example