## Inspiration

One of the most sophisticated sanctions-evasion operations in history is currently underway at sea. Hundreds of aging tankers, flying flags of convenience, switch off their AIS transponders and conduct ship-to-ship transfers in international waters to move sanctioned oil around the world. This "shadow fleet" is estimated to carry roughly 20% of global seaborne oil trade, yet it operates largely invisibly to regulators and journalists. We wanted to change that. The question that drove Pelagos was simple: can a small team, in a hackathon weekend, build a system that makes these invisible ships visible in real time?
## What We Built

Pelagos is a maritime intelligence platform that combines live AIS tracking, satellite detection data, and a custom ML model to identify vessels likely operating as shadow fleet tankers, then automatically generates a classified-style intelligence report for each one. The platform has three layers:

1. **Live map** — real-time vessel positions streamed via WebSocket from the AISStream API, overlaid on Mapbox GL with heat zones in areas where shadow fleet activity is known to concentrate.
2. **ML risk scoring** — a custom PU-Learning model that scores every vessel's probability of dark-fleet membership based on its AIS turn-off and turn-on patterns, speed, geographic footprint, flag state, and vessel class.
3. **Multi-agent investigation pipeline** — when an analyst clicks "Investigate", a set of Claude-powered agents runs in parallel: one searches open-source intelligence, another checks sanctions registries, and a third analyses the vessel's historical dark periods, before a Reporter agent synthesises everything into a structured PDF intelligence report.
## How We Built It

### The ML Model: PU-Learning on AIS Data

The hardest problem was the label problem. There is no ground-truth database of confirmed shadow fleet ships. We have a small set of confirmed vessels as positive examples, but the vast majority of the fleet is unlabeled: not necessarily clean, just unconfirmed. This is a classic Positive-Unlabeled (PU) Learning problem. The standard supervised approach fails because treating unlabeled examples as negatives corrupts the signal.

We implemented a PU Bagging classifier: for each of B bags, a Random Forest is trained on the full positive set combined with a random subsample of unlabeled vessels treated as pseudo-negatives, and the final score is the ensemble mean across all bags.

We engineered 16 features per vessel, aggregated across all of its AIS gap events, including: the fraction of turn-offs inside monitored bounding boxes, proximity to known ship-to-ship transfer zones, the longest continuous AIS blackout, the fraction of events with no corresponding satellite detection (a spoofing signal), whether the vessel flies a flag of convenience, and how far the vessel was from AIS shore stations when it went dark. Because no real labelled training set existed, we generated approximately 9,300 synthetic vessels with feature distributions calibrated to published research on shadow fleet behaviour.

### The Backend: FastAPI + Multi-Agent Pipeline

The backend is a single FastAPI process that maintains a live WebSocket connection to AISStream, detects when vessels enter monitored hot zones, runs the PU-Learning model against local SQLite databases, orchestrates a parallel Claude agent pipeline for each investigation, and streams every status update back to the frontend in real time. The agent architecture uses asyncio.gather to run the intelligence, sanctions, and historical-analysis agents simultaneously, then feeds their findings to a Reporter agent constrained by a structured JSON schema.
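The PU Bagging scheme described in the ML section can be sketched roughly as follows. This is an illustrative sketch, not the project's actual code: function and parameter names are assumptions, and the out-of-bag scoring detail (only scoring unlabeled vessels left out of each bag) follows the standard PU bagging recipe rather than anything stated above.

```python
# Minimal PU Bagging sketch: each bag trains a Random Forest on all positives
# plus a random subsample of unlabeled vessels treated as pseudo-negatives;
# the final score is the ensemble mean over the bags that held each vessel out.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pu_bagging_scores(X_pos, X_unlabeled, n_bags=100, seed=0):
    rng = np.random.default_rng(seed)
    n_pos = len(X_pos)
    scores = np.zeros(len(X_unlabeled))
    counts = np.zeros(len(X_unlabeled))  # how often each vessel was held out
    for _ in range(n_bags):
        # Draw a pseudo-negative subsample the same size as the positive set.
        idx = rng.choice(len(X_unlabeled), size=n_pos, replace=False)
        X_train = np.vstack([X_pos, X_unlabeled[idx]])
        y_train = np.concatenate([np.ones(n_pos), np.zeros(n_pos)])
        clf = RandomForestClassifier(n_estimators=50, random_state=0)
        clf.fit(X_train, y_train)
        # Score only the out-of-bag unlabeled vessels, so a vessel is never
        # scored by a model that saw it labeled as a pseudo-negative.
        mask = np.ones(len(X_unlabeled), dtype=bool)
        mask[idx] = False
        scores[mask] += clf.predict_proba(X_unlabeled[mask])[:, 1]
        counts[mask] += 1
    return scores / np.maximum(counts, 1)  # ensemble mean across bags
```

The subsample trick is what makes the asymmetric labels workable: each bag sees a different slice of the unlabeled pool as "negatives", so hidden positives mislabeled in one bag are scored fairly by the bags that held them out.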
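The asyncio.gather fan-out in the backend can be sketched with stubbed agents standing in for the real Claude calls; all names here are illustrative, not the project's actual code.

```python
# Sketch of the parallel agent fan-out: three research agents run concurrently,
# then a Reporter agent synthesises their findings. The real agents call
# Claude; these stubs just simulate async work.
import asyncio

async def intel_agent(vessel):
    await asyncio.sleep(0.01)  # stand-in for an OSINT search call
    return {"intel": f"open-source findings for {vessel}"}

async def sanctions_agent(vessel):
    await asyncio.sleep(0.01)  # stand-in for a sanctions-registry lookup
    return {"sanctions": "no direct listing found"}

async def history_agent(vessel):
    await asyncio.sleep(0.01)  # stand-in for dark-period analysis
    return {"history": "3 AIS gaps near known STS zones"}

async def reporter_agent(findings):
    # The real Reporter agent emits a structured JSON report; here we merge.
    merged = {}
    for f in findings:
        merged.update(f)
    return merged

async def investigate(vessel):
    # The three research agents run simultaneously, not sequentially.
    findings = await asyncio.gather(
        intel_agent(vessel), sanctions_agent(vessel), history_agent(vessel)
    )
    return await reporter_agent(findings)

report = asyncio.run(investigate("TEST-VESSEL"))
```

Because the three agents are I/O-bound LLM calls, gathering them cuts investigation latency to roughly the slowest single agent rather than the sum of all three.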
### The Frontend: React + Mapbox GL

The map renders three layers simultaneously: live vessel positions from the WebSocket, a historical dark-event heat map based on SAR detections with no AIS match, and a per-vessel historical track from the GFW one-year path API. All API calls are environment-aware, switching between localhost in development and same-origin in production, so the same build works locally and on Railway.
## Challenges

**PU-Learning without real labels.** Every ML tutorial assumes clean labels, but ours were fundamentally incomplete. Getting the PU Bagging implementation right, and calibrating the synthetic data distributions so that precision at 100 was meaningfully above random, took most of Day 1.

**The placeholder trap.** Midway through the hackathon we discovered that our ml_model.py had quietly fallen back to a deterministic placeholder, a hash function on MMSI digits, that returned 65 to 70% for every single vessel. The real trained model existed, but the import path had silently broken. This is why end-to-end integration tests matter, even in a hackathon.

**Data scarcity.** The GFW API is the gold standard for vessel event data, but its historical coverage requires per-vessel API calls with rate limits. We pre-downloaded what we could into local SQLite databases and built a fallback chain: local DB first, then the GFW live API, then a conservative default.

**WebSocket and async agent streaming.** Keeping the map live while an investigation runs in the background required careful asyncio task management: an investigation that crashes must not kill the WebSocket broadcast loop, and a mode switch must not leave orphaned tasks running.

**Production deployment.** Vite bakes environment variables in at build time, so the Mapbox token has to be injected as a Docker build ARG before the build step, not as a runtime environment variable. Debugging this via Railway build logs while the healthcheck clock ticks is a particular kind of stress.
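The Vite build-time issue comes down to a few Dockerfile lines; this is a hedged sketch, and the variable name `VITE_MAPBOX_TOKEN` is an assumption (Vite only exposes variables prefixed with `VITE_` to the client bundle).

```dockerfile
# Sketch: the token must exist during `npm run build`, not at container runtime.
FROM node:20 AS frontend
WORKDIR /app
# Passed with: docker build --build-arg VITE_MAPBOX_TOKEN=...
ARG VITE_MAPBOX_TOKEN
ENV VITE_MAPBOX_TOKEN=$VITE_MAPBOX_TOKEN
COPY frontend/ .
RUN npm ci && npm run build   # the token is baked into the bundle here
```

Setting the same value as a runtime environment variable has no effect, because the client bundle was already compiled without it.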
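The data-scarcity fallback chain can be sketched roughly like this; the `events` table schema and the injected `live_fetch` callable are assumptions for illustration, not the project's real code.

```python
# Fallback chain sketch: local SQLite cache -> live GFW API -> conservative
# default. The live API is injected as a callable so it can be rate-limited
# or mocked independently of the lookup logic.
import sqlite3

CONSERVATIVE_DEFAULT = {"events": [], "source": "default"}

def fetch_vessel_events(mmsi, db_path="gfw_cache.db", live_fetch=None):
    # 1) Pre-downloaded local cache.
    try:
        con = sqlite3.connect(db_path)
        rows = con.execute(
            "SELECT event_type, start_ts, end_ts FROM events WHERE mmsi = ?",
            (mmsi,),
        ).fetchall()
        con.close()
        if rows:
            return {"events": rows, "source": "local_db"}
    except sqlite3.Error:
        pass  # missing table or unreadable DB: fall through to the live API
    # 2) Live GFW API (rate-limited), only if a fetcher was provided.
    if live_fetch is not None:
        try:
            events = live_fetch(mmsi)
            if events:
                return {"events": events, "source": "gfw_api"}
        except Exception:
            pass  # network/rate-limit failure: fall through to the default
    # 3) Conservative default: no events, explicitly flagged as such.
    return CONSERVATIVE_DEFAULT
```

Tagging each result with its `source` keeps the risk model honest downstream: a "default" answer can be treated as low-confidence rather than as evidence of a clean vessel.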
## What We Learned

PU-Learning is underused in open-source tooling but genuinely powerful when you have asymmetric label coverage. Multi-agent LLM pipelines feel like magic until latency hits, at which point you learn to parallelise everything. Real maritime data is messier than any dataset description suggests, with MMSI collisions, null flags, and impossible coordinates being the norm. And a live map that streams real vessel positions changes how people feel about a data problem, making the abstract concrete in a way that a table of probabilities never will.