An end-to-end AI telemetry analysis demo: spec-driven scenario generation → deterministic telemetry synthesis → vectorization → evidence-based summarization → readonly dashboard.
Intent: This is not a production system. It is a runnable design exercise and interview-style prototype focused on AI reasoning, deterministic validation around LLM output, and modular architecture with swappable providers/storage boundaries.
- Overview
- Key Features
- Design Tradeoffs (Intentional)
- Non-goals
- Quickstart
- Local Endpoints
- Useful Commands
- Technologies
- Architecture
- Configuration
- Docs
- Contributing / Next Steps
## Key Features

- LLM scenario planning: reads `docs/spec.md`, requests JSON from an LLM, and retries with validation feedback when the response fails deterministic rules.
- Deterministic normalization/validation: strict contract checks enforce timeline, tractor assignment, sensor directives, and scenario policy before synthesis.
- Synthetic telemetry + events: generates 5-second telemetry aggregates and discrete events with reproducible inputs (`scenario_start_utc` + `seed`).
- Operational storage in TimescaleDB: persists `scenario_runs`, `telemetry_5s`, `events`, and `summaries` for auditability and downstream processing.
- Vectorization to Qdrant: observation-only chunks are built from telemetry/events (no scenario titles/descriptions) and upserted with deterministic IDs.
- Evidence-based summarization: retrieves chunks from Qdrant, prompts an LLM with `docs/summary_spec.md`, validates strict JSON output, and writes summary rows.
- Readonly web UI: displays scenario runs, tractor summaries, evidence metadata, and original scenario JSON for review.
- Separate executables with clear boundaries: `FleetTelemetry.Generator`, `FleetTelemetry.Vectorizer`, `FleetTelemetry.Summarizer`, `FleetTelemetry.Web`.
- Provider abstraction (`FleetTelemetry.AI`) so OpenAI can be swapped for another provider later (for example a local model service).
- Run lifecycle auditing (`started`/`succeeded`/`failed`) with stage + error recording.
- Deterministic replay support via persisted `scenario_start_utc`, `seed`, and `attempt_count`.
- Raw SQL + Npgsql persistence (simple, explicit, testable).
- Idempotent vectorizer point upserts and summarizer summary upserts for reruns.
- Modular tests split by project area (`Domain`, `Generator`, `Vectorizer`, `Summarizer`) plus CI.
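The idempotent upserts rely on deterministic point IDs derived from a chunk's identity. A minimal sketch of the idea, in Python for brevity (the real code is C#, and the key format below is hypothetical, not the project's actual scheme):

```python
import uuid

def chunk_point_id(scenario_run_id: str, tractor_id: str, window_start_utc: str) -> str:
    """Derive a stable UUID from the chunk's identity so a rerun of the
    vectorizer upserts the same Qdrant point instead of creating duplicates.
    The "run|tractor|window" key is illustrative only."""
    key = f"{scenario_run_id}|{tractor_id}|{window_start_utc}"
    return str(uuid.uuid5(uuid.NAMESPACE_URL, key))
```

Because the ID is a pure function of the chunk's identity, rerunning the pipeline overwrites existing points rather than accumulating copies.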
## Design Tradeoffs (Intentional)

- LLM output is treated as untrusted input and must pass deterministic validation before any data generation occurs.
- A wide-table telemetry schema is used first for simplicity and readability during prototyping.
- Vectorization and summarization are separate processes from generation to keep concerns and operational boundaries clean.
- No DI framework is used; composition is explicit in each `Program.cs` for clarity during review/interview walkthroughs.
- The system is optimized for reasoning demo value, not maximum throughput or production-scale operability.
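The untrusted-output tradeoff amounts to a validate-then-retry loop around the model call. A language-agnostic sketch in Python (the actual generator is C#; `llm` and `validate` here are hypothetical callables, not the project's API):

```python
import json

def plan_scenarios(llm, prompt: str, validate, max_attempts: int = 3):
    """Request scenario JSON from a model, treating every reply as untrusted
    until it parses and passes deterministic validation. Validation errors
    are fed back into the next attempt's prompt."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        raw = llm(prompt + feedback)
        try:
            scenarios = json.loads(raw)
        except json.JSONDecodeError as err:
            feedback = f"\nPrevious reply was not valid JSON: {err}"
            continue
        errors = validate(scenarios)  # e.g. timeline, tractor assignment, policy
        if not errors:
            return scenarios, attempt
        feedback = "\nFix these validation errors:\n" + "\n".join(errors)
    raise RuntimeError(f"LLM output failed validation after {max_attempts} attempts")
```

Only after this loop succeeds does deterministic synthesis run; nothing downstream ever sees an unvalidated reply.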
## Non-goals

- Production authentication/authorization or multi-tenant isolation.
- Full tractor / J1939 fidelity or OEM-grade telemetry semantics.
- Robust migration/versioning workflow for long-lived production databases.
- Summary fidelity/reconstruction scoring against original scenarios (planned future feature).
## Quickstart

- Prereqs: Docker/Compose, Make, .NET 10 SDK.
- Configure AI access before running the pipeline:
  - `AI_PROVIDER=openai`
  - `AI_API_KEY=<your key>`
  - optional: `AI_TEXT_MODEL`, `AI_EMBEDDING_MODEL`, `AI_BASE_URL`
  - if you are using the Docker targets, export these env vars in your shell so `make` can pass them into containers
- Choose one pipeline path:
  - local .NET path: `make run-full-cycle`
  - Docker-only path: `make containerized-run-full-cycle`
- Start infra + web: `make infra-up`
- Build + test: `make build`, `make test`
- Run the full pipeline locally with the .NET SDK (generate → vectorize → summarize): `make run-full-cycle`
- Run the full pipeline in containers with Docker only: `make containerized-run-full-cycle`
- Refresh the dashboard (see Local Endpoints below).
## Local Endpoints

- Readonly dashboard (ASP.NET Core): http://localhost:5000
- Qdrant API: http://localhost:6333
- TimescaleDB/Postgres: `localhost:5432` (use `psql` / `make psql`)
## Useful Commands

- Infra / logs: `make infra-up`, `make infra-down`, `make infra-logs`
- Database: `make psql`, `make list-scenario-runs`
- Qdrant checks: `make qdrant-ui`, `make qdrant-collections`, `make qdrant-info`, `make qdrant-count`
- App runs (local): `make run-generator`, `make run-vectorizer SCENARIO_RUN_ID=<run-id>`, `make run-summarizer SCENARIO_RUN_ID=<run-id>`, `make run-web`, `make run-full-cycle`
- App runs (containerized tools): `make run-generator-docker`, `make run-vectorizer-docker`, `make run-summarizer-docker`, `make containerized-run-full-cycle`
- Tests: `make test`, `make test-integration`
## Technologies

- C# / .NET 10
- ASP.NET Core (Razor Pages UI)
- OpenAI .NET SDK (behind provider abstraction)
- Npgsql
- Postgres / TimescaleDB
- Qdrant
- xUnit + FluentAssertions
- Docker Compose
## Architecture

The architecture is intentionally split into small executables so each stage can be run, debugged, and replaced independently. The AI-related pieces are constrained by deterministic validators and explicit data contracts rather than implicit trust in model output.
```mermaid
flowchart LR
    spec["docs/spec.md"] --> gen["FleetTelemetry.Generator"]
    gen --> llm["AI text client"]
    llm --> retry["Retry w/ validation feedback"]
    retry --> normalize["Normalize + Validate"]
    normalize --> synth["Deterministic telemetry/event synthesis"]
    synth --> db[("TimescaleDB")]
```

```mermaid
flowchart LR
    db[("TimescaleDB telemetry/events")] --> vec["FleetTelemetry.Vectorizer"]
    vec --> chunk["Observation-only chunk builder"]
    chunk --> emb["AI embedding client"]
    emb --> qdrant[("Qdrant")]
```

```mermaid
flowchart LR
    qdrant[("Qdrant evidence")] --> sum["FleetTelemetry.Summarizer"]
    sum --> sllm["AI text client"]
    sllm --> parse["Strict JSON parse/validate + retry"]
    parse --> summaries[("TimescaleDB summaries")]
    summaries --> web["FleetTelemetry.Web"]
```
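The "observation-only chunk builder" in the vectorizer stage deliberately excludes scenario titles and descriptions, so the summarizer can only reason from measurements and events. A hypothetical sketch of what such a chunk's text might look like (Python for brevity; field names are illustrative, not the project's schema):

```python
def build_chunk_text(window_start_utc: str, tractor_id: str, rows, events) -> str:
    """Render one chunk of raw observations as plain text for embedding.
    Nothing about the scenario's plan (title, description) is included,
    so retrieval evidence stays observation-only."""
    lines = [f"tractor={tractor_id} window_start={window_start_utc}"]
    for r in rows:  # 5-second telemetry aggregates
        lines.append(f"t={r['t']} engine_temp_c={r['engine_temp_c']} speed_kph={r['speed_kph']}")
    for e in events:  # discrete events in the same window
        lines.append(f"event t={e['t']} type={e['type']}")
    return "\n".join(lines)
```

Keeping the plan out of the evidence is what makes the summarizer's output a genuine reconstruction rather than a paraphrase of the scenario JSON.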
## Configuration

Shared / storage:

- `POSTGRES_CONNECTION_STRING`
- `QDRANT_URL` (default `http://localhost:6333`)
- `QDRANT_COLLECTION` (default `telemetry_chunks`)

AI provider:

- `AI_PROVIDER` (default `openai`)
- `AI_API_KEY`
- `AI_BASE_URL` (provider-specific; OpenAI defaults to `https://api.openai.com/v1/`)
- `AI_TEXT_MODEL` (default `gpt-5.2`)
- `AI_EMBEDDING_MODEL` (default `text-embedding-3-small`)

Legacy aliases are still accepted for compatibility (`OPENAI_*`, `LLM_*`, `EMBEDDING_MODEL`).

Scenario validation:

- `ScenarioValidation:AllowedTractorIds` (CSV)
- `ScenarioValidation:RequireCompleteAllowedTractorSet` (`true|false`)
- `ScenarioValidation:RequireUniqueTractorsAcrossScenarios` (`true|false`)
- `ScenarioValidation:RequireCatastrophicOutlier` (`true|false`)

Vectorizer:

- `SCENARIO_RUN_ID` (optional filter)
- `CHUNK_WINDOW_SECONDS` (default `300`)
- `LOOKBACK_HOURS` (optional)

Summarizer:

- `SCENARIO_RUN_ID` (required)
- `TRACTOR_ID` (optional)
- `SUMMARY_WINDOW_SECONDS` (default `3600`)
- `TOP_K` (default `20`)
- `NOW_UTC` (optional deterministic override)
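As a sanity check on the defaults above, here is how the vectorizer's settings could be read from the environment (a Python sketch only; the actual binaries are C# and may parse these differently):

```python
import os

def load_vectorizer_config(env=os.environ):
    """Read the vectorizer settings documented above, falling back to the
    listed defaults when a variable is unset."""
    return {
        "qdrant_url": env.get("QDRANT_URL", "http://localhost:6333"),
        "collection": env.get("QDRANT_COLLECTION", "telemetry_chunks"),
        "chunk_window_seconds": int(env.get("CHUNK_WINDOW_SECONDS", "300")),
        "scenario_run_id": env.get("SCENARIO_RUN_ID"),  # optional filter, may be None
    }
```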
## Docs

- `docs/spec.md` – scenario-generation prompt contract and hard constraints
- `docs/summary_spec.md` – summarizer JSON output contract
- `docs/sensor-bounds.md` – sensor ranges used by deterministic synthesis
- `docs/adr/` – architecture decision records (optional/in-progress)
## Contributing / Next Steps

- Keep changes focused and document tradeoffs in PR notes/commit messages.
- Add tests for behavior changes, especially contract validation and retry behavior.
- Future ideas:
- local model provider implementation (for example Ollama)
- bounded-cost Qdrant retrieval strategy for larger collections
- post-summary reconstruction scoring (separate evaluator step)
- richer dashboard filtering and comparison views across scenario runs