xAI Hackathon 2025
CityVoice discovers and clusters citizen suggestions from social media, helping city officials understand what residents want and identify actionable improvements.
- AI-Powered Connection Discovery: Finds related ideas using Claude/GPT
- Interactive Graph Visualization: Force-directed graph with D3.js
- Automatic Clustering: Groups similar suggestions by topic and idea
- Actionable Summary: Prioritized list of citizen demands with consensus levels
- Export Reports: Generate markdown reports for stakeholders
- Grok Insights: The consolidated report appends Grok-generated insights that summarize emerging themes and tensions
# Start the visualization server
python3 server.py
# Opens browser to http://localhost:7847
The visualization includes:
- Left Panel: Interactive force-directed graph with zoom/pan
- Right Panel: Clusters, actionable items, and node details
- Legend: Topic-based color coding with counts
- Search: Filter nodes by keyword
- Export: Download consolidated markdown report
# Install dependencies
pip install anthropic
# Set API key
export ANTHROPIC_API_KEY="your-key-here"
# Run the analysis script
python find_related_items.py data/geodatanyc.csv
# Enhance clusters with AI-generated summaries (optional)
python enhance_clusters.py
The script generates two output files in the same directory as the input:
- connections.csv: Simple edge list with columns:
  - source_id: ID of the source item
  - target_id: ID of the related item
  - reason: AI-generated explanation of the relationship
- connections.json: Full graph data with:
  - nodes: Array of all items with metadata
  - edges: Array of all connections with reasons
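As an illustration of how the edge list might be consumed downstream, the sketch below reads rows in the connections.csv layout and counts how many connections touch each item. Only the column names (source_id, target_id, reason) come from the description above; the sample rows and the `degree_counts` helper are hypothetical.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the connections.csv layout (IDs and reasons are made up).
SAMPLE_CSV = """source_id,target_id,reason
1,2,Both discuss bus lane enforcement
1,3,Complementary transit proposals
2,3,Same neighborhood focus
1,4,Conflicting views on parking
"""

def degree_counts(csv_file):
    """Count how many edges touch each item ID in an edge-list CSV."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row["source_id"]] += 1
        counts[row["target_id"]] += 1
    return counts

counts = degree_counts(io.StringIO(SAMPLE_CSV))
for item_id, degree in counts.most_common():
    print(item_id, degree)
```

To try this on real output, pass an open file handle instead of the in-memory sample, for example `open("data/connections.csv", newline="")` if the input CSV lived in `data/`.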
# Use Anthropic (default)
python find_related_items.py data/geodatanyc.csv
# Use OpenAI instead
pip install openai
export OPENAI_API_KEY="your-key-here"
python find_related_items.py data/geodatanyc.csv --provider openai
# Lightweight keyword matching fallback (no API keys required)
python find_related_items.py data/geodatanyc.csv --provider keyword
- Loads CSV with columns: Date, Username, Summary/Quote, Link
- For each item, asks AI to find the top 3-5 most related items
- AI considers: topic overlap, complementary/conflicting solutions, geographic focus
- Outputs connection data for graph visualization
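The steps above can be sketched roughly as a prompt builder plus a reply parser. This is not the code in find_related_items.py: the prompt wording, the `number | reason` reply format, and the function names `build_prompt` and `parse_reply` are all assumptions made for illustration, and the actual model call is left out.

```python
def build_prompt(items, target_index, max_related=5):
    """Build a prompt asking a model for the items most related to one target."""
    lines = [f"[{i}] {text}" for i, text in enumerate(items)]
    return (
        "Below is a numbered list of citizen suggestions.\n"
        + "\n".join(lines)
        + f"\n\nFor item [{target_index}], list the 3 to {max_related} most related items. "
        "Consider topic overlap, complementary or conflicting solutions, "
        "and geographic focus. "
        "Answer with one line per item in the form: <item number> | <one-sentence reason>."
    )

def parse_reply(reply, target_index, n_items):
    """Turn 'number | reason' lines into (source, target, reason) edges.

    Lines that do not match the assumed format, point at the target itself,
    or reference an unknown item number are skipped.
    """
    edges = []
    for line in reply.splitlines():
        number, sep, reason = line.partition("|")
        number = number.strip().strip("[]")
        if not sep or not number.isdigit():
            continue
        idx = int(number)
        if idx != target_index and idx < n_items:
            edges.append((target_index, idx, reason.strip()))
    return edges

# Hypothetical items and a hand-written reply standing in for a model response.
items = ["Add bus lanes on 5th Ave", "More protected bike lanes", "Legalize ADUs"]
prompt = build_prompt(items, 0)
edges = parse_reply("1 | Both reallocate street space\nnot a valid line\n0 | self", 0, len(items))
print(edges)
```

Parsing defensively matters here because model replies are free text; dropping malformed lines keeps a single bad reply from corrupting the edge list.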
- ≤100 items: Uses full context window (current approach)
- >100 items: Should use embeddings for efficiency (see PLAN_EMBEDDINGS.md)
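To make the embeddings idea concrete, here is a minimal sketch of nearest-neighbor lookup by cosine similarity. It does not reflect the contents of PLAN_EMBEDDINGS.md; it assumes each item already has an embedding vector from some model, and the toy vectors and the `top_k_neighbors` helper are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k_neighbors(vectors, k=5):
    """For each vector, return the indices of the k most similar other vectors."""
    neighbors = {}
    for i, v in enumerate(vectors):
        scored = [(cosine(v, w), j) for j, w in enumerate(vectors) if j != i]
        # Highest similarity first; ties broken by lower index for determinism.
        scored.sort(key=lambda s: (-s[0], s[1]))
        neighbors[i] = [j for _, j in scored[:k]]
    return neighbors

# Toy 2-D vectors standing in for real embeddings.
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
neighbors = top_k_neighbors(toy_vectors, k=1)
print(neighbors)
```

One possible design is to use such a shortlist to narrow the candidates per item, so that only a handful of items rather than the full dataset would need to be sent to the model for reasoning.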
Input CSV should have these columns:
Date,Username,Summary/Quote,Link
2025-12-05,@NYCPlanning,ADUs are a proven way to create housing...,https://x.com/...
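A quick way to check a file against this layout before running the analysis is sketched below. The `load_items` helper and the sample row are hypothetical; only the four column names come from the format above.

```python
import csv
import io

REQUIRED_COLUMNS = ["Date", "Username", "Summary/Quote", "Link"]

def load_items(csv_file):
    """Read rows from an open CSV file, failing early if a required column is missing."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Input CSV is missing columns: {missing}")
    return list(reader)

# Placeholder row; quoting lets the Summary/Quote field contain commas.
sample = io.StringIO(
    "Date,Username,Summary/Quote,Link\n"
    '2025-12-05,@example_user,"Placeholder suggestion, with a comma",https://example.com/post\n'
)
items = load_items(sample)
print(items[0]["Summary/Quote"])
```

Summaries that contain commas need to be wrapped in double quotes, as in the sample row, or the columns will shift.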
If neither Anthropic nor OpenAI credentials are available, the relation finder automatically falls back to a keyword-overlap heuristic so clustering can still run (albeit with simpler reasoning).
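A keyword-overlap heuristic of this kind could look like the sketch below, which scores pairs of items by the Jaccard similarity of their keyword sets. The stopword list, the scoring formula, and the function names are assumptions; the actual fallback in find_related_items.py may work differently.

```python
import re

# Small illustrative stopword list; a real one would be longer.
STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "on",
             "for", "is", "are", "we", "more"}

def keywords(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def keyword_overlap(a, b):
    """Jaccard similarity of two texts' keyword sets, between 0.0 and 1.0."""
    ka, kb = keywords(a), keywords(b)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0

def related_by_keywords(items, index, top_n=3):
    """Indices of up to top_n other items sharing at least one keyword, best first."""
    scored = [(keyword_overlap(items[index], other), j)
              for j, other in enumerate(items) if j != index]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scored[:top_n]]

# Hypothetical items for illustration.
items = [
    "We need more protected bike lanes on Broadway",
    "Protected bike lanes save lives",
    "Legalize ADUs to create housing",
]
print(related_by_keywords(items, 0))
```

Unlike the AI providers, a heuristic like this cannot explain why two items relate beyond the shared words, which is the "simpler reasoning" trade-off noted above.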
Use grok_x_search.py (requires pip install "xai-sdk>=1.3.1") to have Grok search X with the official x_search tool and dump detailed CSV rows that match the Date,Username,Summary/Quote,Link format.
# Set your xAI key (or store it in .env)
export XAI_API_KEY=sk-...
python grok_x_search.py \
--location "San Francisco transit" \
--count 12 \
--from-date 2024-06-01 \
--csv-out data/grok_sf_transit.csv \
--json-out data/grok_sf_transit.json
- --csv-out writes exactly what Grok streams (header + rows) so you can drop it into the visualization pipeline.
- --json-out stores the assistant response, citations, and token usage for auditing.
- Extra flags allow handle whitelists (--allowed-handles), exclusions, or date filters to match civic research needs.