RoboFinSystems/robosystems


RoboSystems

RoboSystems is a financial intelligence platform that connects disparate data sources, builds domain ontologies as knowledge graphs, and provides AI-powered tools for accounting, financial reporting, investment management, and analysis. It powers RoboLedger and RoboInvestor.

  • LadybugDB Graph Database: Embedded columnar graph database with native DuckDB staging, LanceDB vector search, and tiered infrastructure
  • Extensions: Domain schemas that drive OLTP tables, API routes, data pipelines, and dedicated frontend apps. Extensions share a single database with schema-per-tenant isolation and materialize to the graph
  • Document Search: Full-text and semantic search across SEC filings, uploaded documents, and connected sources via OpenSearch
  • AI-Native Architecture: Context graphs with embeddings, semantic enrichment, and confidence scoring for LLM-powered analytics
  • Model Context Protocol (MCP): Standardized server and client for LLM integration with schema-aware tools
  • Multi-Source Data Integration: SEC XBRL filings, QuickBooks accounting data via dbt pipelines, and custom financial datasets
  • Enterprise-Ready Infrastructure: Multi-tenant architecture with tiered scaling and production-grade query management
  • Developer-First API: RESTful API designed for integration with financial applications

Platform

The platform provides the core infrastructure that all extensions build on:

  • Dedicated Infrastructure: Tiered graph infrastructure with dedicated instances and configurable memory allocation
  • AI Agent System: Autonomous financial operations — graph queries, taxonomy mapping, report generation — with automatic credit tracking and SSE progress streaming
  • Shared Repositories: SEC XBRL filings knowledge graph for context mining and benchmarking
  • Document Management: Upload, index, and search documents with full-text and semantic search via OpenSearch
  • DuckDB Staging System: High-performance data validation and bulk ingestion pipeline
  • Dagster Orchestration: Data pipeline orchestration for SEC filings, QuickBooks sync, backups, billing, and scheduled jobs
  • Credit-Based Billing: Flexible credits for AI operations based on token usage
  • Subgraphs (Workspaces): AI memory graphs and isolated environments for development and team collaboration
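The SSE progress streaming mentioned above uses the standard Server-Sent Events wire format. The following is a minimal sketch of parsing such a stream, assuming JSON payloads. The event names and payload fields in the sample are hypothetical and are not taken from the RoboSystems API.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per Server-Sent Event from an iterable of text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"event": event, "data": json.loads("\n".join(data))}
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment or keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The format allows one optional space after the colon.
            data.append(value[1:] if value.startswith(" ") else value)


# Hypothetical stream contents, for illustration only.
sample = [
    ": keep-alive\n",
    "event: progress\n",
    'data: {"percent": 50}\n',
    "\n",
    "event: completed\n",
    'data: {"percent": 100}\n',
    "\n",
]
events = list(parse_sse(sample))
```

In practice the lines would come from a streaming HTTP response rather than a list; the parser only needs an iterable of decoded text lines.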

Extensions

Extensions are domain-specific subsystems that bring their own schema, OLTP tables, API routes, data pipelines, and dedicated frontend apps. They share a single PostgreSQL database with schema-per-tenant isolation and materialize to the graph for analytical queries. See Schema Extensions for the authoring contract.
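One way to picture schema-per-tenant isolation: each graph maps to its own PostgreSQL schema, and a session is scoped to that schema before any query runs. The sketch below is illustrative only. The `tenant_` prefix, the allowed graph id characters, and both function names are assumptions, not the project's actual convention.

```python
import re

# Hypothetical rule for what a safe graph id looks like.
_GRAPH_ID = re.compile(r"[a-z0-9_]{1,48}")


def tenant_schema(graph_id: str, prefix: str = "tenant_") -> str:
    """Map a graph id to a PostgreSQL schema name (assumed naming scheme)."""
    if not _GRAPH_ID.fullmatch(graph_id):
        raise ValueError(f"unsafe graph id: {graph_id!r}")
    return f"{prefix}{graph_id}"


def search_path_sql(graph_id: str) -> str:
    """Build the statement that scopes a session to one tenant's schema."""
    # The id is validated above, so quoting the identifier here is safe.
    return f'SET search_path TO "{tenant_schema(graph_id)}"'


print(search_path_sql("kg1a2b3c"))
```

Validating the id before interpolating it matters because identifiers cannot be passed as bound query parameters.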

The extensions API surface is graph-scoped at the URL level (graph_id is always a path parameter, never a query argument) and splits reads from writes by transport:

  • Typed reads: POST /extensions/{graph_id}/graphql. Strawberry GraphQL endpoint with GraphiQL playground in dev. The schema is composed dynamically from enabled domains, so a ledger-only deployment exposes only ledger fields (no surprise runtime errors from disabled domains).
  • Command writes: POST /extensions/{roboledger|roboinvestor}/{graph_id}/operations/{operation_name}. Named REST commands. Every command returns an OperationEnvelope with an op_<ULID> operation id, supports Idempotency-Key for safe retries, and is audit-logged. Long-running commands return status: "pending" and stream progress through /v1/operations/{operation_id}/stream.
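The two transports above can be sketched with the standard library alone. This example only builds the requests and does not send them. The base URL is the local Main API address from the Quick Start. The `close-period` operation name, the `operation_id` envelope key, and the absence of any auth header are assumptions for illustration; the paths are used exactly as written above, without guessing a version prefix.

```python
import json
import re
import uuid
from typing import Optional
from urllib.request import Request

BASE_URL = "http://localhost:8000"  # local Main API (assumed target)

# "op_" followed by a 26-character Crockford base32 ULID.
OPERATION_ID = re.compile(r"op_[0-9A-HJKMNP-TV-Z]{26}")


def graphql_read(graph_id: str, query: str) -> Request:
    """Build a typed read: the graph id lives in the path, not the query."""
    return Request(
        f"{BASE_URL}/extensions/{graph_id}/graphql",
        data=json.dumps({"query": query}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def command_write(domain: str, graph_id: str, operation_name: str,
                  payload: dict, idempotency_key: Optional[str] = None) -> Request:
    """Build a named command; reuse the same key when retrying."""
    key = idempotency_key or str(uuid.uuid4())
    return Request(
        f"{BASE_URL}/extensions/{domain}/{graph_id}/operations/{operation_name}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json", "Idempotency-Key": key},
        method="POST",
    )


def stream_path(envelope: dict) -> Optional[str]:
    """Return the progress stream path for a pending operation, else None."""
    op_id = envelope["operation_id"]  # hypothetical envelope key
    if not OPERATION_ID.fullmatch(op_id):
        raise ValueError(f"unexpected operation id: {op_id!r}")
    if envelope.get("status") == "pending":
        return f"/v1/operations/{op_id}/stream"
    return None


read = graphql_read("kg123", "{ __typename }")
write = command_write("roboledger", "kg123", "close-period", {}, "retry-1")
```

Passing an explicit `idempotency_key` lets a caller retry after a timeout without the command being applied twice, which is the point of the header.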

Behind the API is a CQRS-style operations kernel (reads/ + commands/ per domain) that's the single source of truth for business logic. GraphQL resolvers, REST operation routes, and MCP tools all delegate to the same functions, so wire shapes stay byte-identical across consumers. Per-domain feature flags (ROBOLEDGER_ENABLED, ROBOINVESTOR_ENABLED) gate both the routers and the GraphQL schema composition.
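A rough sketch of that delegation pattern: one kernel read function holds the logic, and three thin adapters call it and serialize the result the same way. Every name here except the two flag variables is hypothetical, the data is stubbed, and the set of values treated as "enabled" is an assumption.

```python
import json
import os
from dataclasses import asdict, dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class TrialBalanceRow:
    account: str
    debit: str
    credit: str


def read_trial_balance(graph_id: str) -> List[TrialBalanceRow]:
    """Kernel read: the only place the business logic lives (stubbed)."""
    return [TrialBalanceRow("1000 Cash", "250.00", "0.00")]


def _wire(rows: List[TrialBalanceRow]) -> str:
    """Single serializer, so every consumer emits the same bytes."""
    return json.dumps([asdict(r) for r in rows], sort_keys=True)


# Thin adapters: no logic of their own, only delegation.
def graphql_resolver(graph_id: str) -> str:
    return _wire(read_trial_balance(graph_id))


def rest_route(graph_id: str) -> str:
    return _wire(read_trial_balance(graph_id))


def mcp_tool(graph_id: str) -> str:
    return _wire(read_trial_balance(graph_id))


FLAGS = {"roboledger": "ROBOLEDGER_ENABLED", "roboinvestor": "ROBOINVESTOR_ENABLED"}


def enabled_domains(env: Mapping[str, str] = os.environ) -> List[str]:
    """Domains whose flag is set; these would drive schema composition."""
    return [d for d, flag in FLAGS.items()
            if env.get(flag, "").lower() in {"1", "true"}]
```

Because the adapters share both the kernel function and the serializer, a change to the business logic cannot make one surface drift from the others.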

See GraphQL Extensions for the read-path implementation details, the Strawberry-Pydantic auto-derivation pattern, and the walkthrough for adding a new read field.

RoboLedger

Accounting and financial reporting extension with a dedicated frontend app:

  • OLTP general ledger in schema-per-tenant PostgreSQL (accounts, transactions, journal entries, line items, dimensions)
  • 29 GraphQL read fields covering entities, accounts, trial balance, fiscal calendar, schedules, taxonomies, mappings, reports, and publish lists
  • 23 named command operations for closing periods, creating schedules and closing entries, managing CoA→GAAP mapping associations, and authoring multi-period reports
  • Analytical view operations over the materialized graph
  • QuickBooks ELT pipeline via dbt/Dagster
  • SEC XBRL financial reporting
  • AI-powered CoA→GAAP mapping via the MappingAgent

RoboInvestor

Portfolio management and investment tracking extension with a dedicated frontend app:

  • OLTP database with portfolios, securities, and positions in schema-per-tenant PostgreSQL
  • 7 GraphQL read fields (portfolios, securities, positions, holdings)
  • 9 named command operations for portfolio CRUD and position management
  • Securities can link to entities for cross-graph research between investor portfolios and SEC public-company data via the shared repository

Quick Start

Docker Development Environment

# Install uv and just
brew install uv just

# Start the robosystems backend API
just start

# Start frontend apps - robosystems-app, roboledger-app, roboinvestor-app
just start apps

This initializes the .env file and starts the complete RoboSystems stack with:

  • Graph API with LadybugDB and DuckDB backends
  • Dagster for data pipeline orchestration
  • PostgreSQL for graph metadata, IAM, and Dagster
  • Valkey for caching, SSE messaging, and rate limiting
  • OpenSearch for full-text and semantic document search
  • LocalStack for S3 and DynamoDB emulation

Service URLs:

Service       URL
Main API      http://localhost:8000
Graph API     http://localhost:8001
Dagster UI    http://localhost:8002

With just start apps (frontend apps):

App                 URL
RoboSystems App     http://localhost:3000
RoboLedger App      http://localhost:3001
RoboInvestor App    http://localhost:3002

Local Development

# Setup Python environment (uv automatically handles Python versions)
just init

Examples

See RoboSystems in action with runnable demos that create graphs, load data, and execute queries with the robosystems-client:

just demo-sec               # Loads NVIDIA's SEC XBRL data via Dagster pipeline
just demo-close             # Entity accounting month close demo
just demo-custom-graph      # Builds custom graph schema with relationship networks

Each demo has a corresponding Wiki article with detailed guides.

Development Commands

Testing

just test-all               # Tests with code quality
just test                   # Default test suite
just test adapters          # Test specific module
just test-cov               # Tests with coverage

Log Monitoring

just logs api                 # View API logs (last 100 lines)
just logs graph-api           # View Graph API logs (last 100 lines)
just logs dagster-webserver   # View Dagster Webserver logs
just logs dagster-daemon      # View Dagster Daemon logs

See justfile for 50+ development commands including database migrations, CloudFormation linting, graph operations, administration, and more.

Prerequisites

System Requirements

  • Docker & Docker Compose
  • 8GB RAM minimum
  • 20GB free disk space

Required Tools

  • uv for Python package and version management
  • just for project command runner

Deployment Requirements

  • Fork this repo
  • AWS account with IAM Identity Center (SSO)
  • Run just bootstrap to configure OIDC and GitHub variables

See the Bootstrap Guide for complete instructions.

Architecture

RoboSystems is built from the following components:

Application Layer:

  • FastAPI REST API with versioned endpoints
  • Extension API routes feature-flagged per module
  • MCP Server for AI-powered graph database access with schema-aware tools
  • AI Agent System for autonomous financial operations with automatic credit tracking
  • Dagster for data pipeline orchestration and background jobs

LadybugDB Graph Database: (configuration)

  • Embedded columnar graph database purpose-built for financial analytics
  • Base + extension schema architecture — extensions define domain models
  • Native DuckDB integration for high-performance staging and ingestion
  • LanceDB vector search for semantic element resolution (IVF-PQ indexes, 384-dim embeddings)
  • Tiered infrastructure with configurable memory, rate limits, and subgraph allocations
  • Shared tier hosts public repositories with read replicas

Data Layer:

  • PostgreSQL for IAM, graph metadata, Dagster, and extension OLTP databases (schema-per-tenant)
  • OpenSearch for full-text and semantic document search (BM25 + KNN)
  • Valkey for caching, SSE messaging, and rate limiting
  • AWS S3 for data lake storage and static assets
  • DynamoDB for instance/graph/volume registry

Infrastructure:

  • ECS Fargate for API and Dagster
  • EC2 ASG for LadybugDB writer clusters
  • EC2 ALB + ASG for LadybugDB shared replica clusters
  • RDS PostgreSQL + ElastiCache Valkey
  • OpenSearch for full-text and semantic document search
  • CloudFormation infrastructure deployed via GitHub Actions with OIDC

For detailed architecture documentation, see the Architecture Overview in the Wiki.

SEC Shared Repository

A curated knowledge graph of US public company financial data from SEC EDGAR XBRL filings. Runs on the shared LadybugDB tier, accessible via MCP tools, Cypher queries, and the AI agent.

  • Pipeline: EDGAR → Download → Process (Parquet) → Stage (DuckDB) → Enrich (fastembed) → Materialize (LadybugDB) → Index + Embed (OpenSearch)
  • Graph: 14 node types and 24 relationship types modeling the full XBRL reporting hierarchy
  • Search: Hybrid BM25 + KNN vector search across XBRL text blocks, narrative sections, and iXBRL disclosures
  • Enrichment: Semantic element mapping, statement classification, and disclosure tagging via the Seattle Method taxonomy

just sec-load NVDA 2025  # Load NVIDIA filings for 2025
just sec-health          # Check SEC database health
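The hybrid BM25 + KNN search listed above has to merge two ranked result lists. This README does not say how the scores are combined, so the sketch below swaps in reciprocal rank fusion, one common technique, purely as an illustration. The document ids are made up.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge ranked lists: each hit contributes 1 / (k + rank) to its doc."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword (BM25) and a vector (KNN) query.
bm25_hits = ["10-K:item7", "10-K:item1a", "10-Q:note3"]
knn_hits = ["10-K:item1a", "8-K:ex99", "10-K:item7"]

fused = reciprocal_rank_fusion([bm25_hits, knn_hits])
print(fused)
```

Rank-based fusion avoids having to normalize BM25 scores and vector distances onto a common scale, which is why it is a frequent default for hybrid search.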

See SEC Adapter and SEC Pipeline for detailed documentation.

AI

Model Context Protocol (MCP)

  • Financial Analysis: Natural language queries across enterprise data and public benchmark data
  • Cross-Database Queries: Compare user graph data against SEC shared repository data
  • Tools: Rich toolkit for graph queries, schema introspection, fact discovery, financial analysis, document search, and AI memory operations
  • Handler Pool: Managed MCP handler instances with resource limits

Agent System

  • Unified architecture: stateless agents with protocol-based service injection
  • Dual execution: API (sync/SSE) and background worker (Valkey queue + SSE progress)
  • Automatic credit tracking per AI call — agents cannot forget billing
  • Extensible: new agents implement run(ctx) and register with a decorator
  • See Agent README for details
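The registration and billing points above can be sketched as follows. This is not the project's agent framework: `AgentContext`, `call_model`, `register`, and `SummaryAgent` are hypothetical names, and the model call is a stub. The idea shown is that credits are recorded inside the context's model call, so an agent that only implements run(ctx) cannot skip billing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


@dataclass
class AgentContext:
    """Injected into every agent; owns model access and credit tracking."""
    graph_id: str
    credits_used: float = 0.0

    def call_model(self, prompt: str, cost: float = 1.0) -> str:
        # Billing happens here, not in agent code (stubbed model call).
        self.credits_used += cost
        return f"echo: {prompt}"


class Agent(Protocol):
    def run(self, ctx: AgentContext) -> str: ...


AGENTS: Dict[str, Callable[[], Agent]] = {}


def register(name: str):
    """Class decorator that adds an agent to the registry under `name`."""
    def decorator(cls):
        AGENTS[name] = cls
        return cls
    return decorator


@register("summary")
class SummaryAgent:
    # Stateless: everything the agent needs arrives through ctx.
    def run(self, ctx: AgentContext) -> str:
        return ctx.call_model(f"summarize {ctx.graph_id}")


ctx = AgentContext(graph_id="demo")
result = AGENTS["summary"]().run(ctx)
```

Looking agents up by name in a registry is also what would let an API handler and a background worker run the same agent class.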

Credit System

  • AI Operations Only: Credits are consumed exclusively by AI agent calls (Anthropic Claude via AWS Bedrock)
  • Token-Based Billing: Credits based on actual token usage and model cost
  • MCP Tool Access: No credits consumed for MCP calls or database operations
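As a worked example of token-based billing, the sketch below converts token counts into credits. The model name and per-1,000-token rates are invented for illustration; actual rates and rounding rules are not stated in this README.

```python
from decimal import ROUND_HALF_UP, Decimal

# Hypothetical prices, in credits per 1,000 tokens.
RATES = {
    "example-model": {"input": Decimal("0.30"), "output": Decimal("1.50")},
}


def credits_for_call(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Credits for one AI call, from actual token usage and model rates."""
    rate = RATES[model]
    raw = (
        Decimal(input_tokens) * rate["input"]
        + Decimal(output_tokens) * rate["output"]
    ) / Decimal(1000)
    # Round to two decimal places (assumed billing precision).
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# 2,000 input tokens and 500 output tokens:
# (2000 * 0.30 + 500 * 1.50) / 1000 = 1.35 credits
print(credits_for_call("example-model", 2000, 500))
```

Using Decimal rather than float keeps billing arithmetic exact at the chosen precision.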

Client Libraries

RoboSystems provides comprehensive client libraries for building applications:

MCP (Model Context Protocol) Client

AI integration client for connecting Claude and other LLMs to RoboSystems.

npx -y @robosystems/mcp

  • Features: Claude Desktop integration, natural language queries, graph traversal, financial analysis
  • Use Cases: AI agents, chatbots, intelligent assistants, automated research
  • Documentation: npm | GitHub

TypeScript/JavaScript Client

Full-featured SDK for web and Node.js applications with TypeScript support.

npm install @robosystems/client

  • Features: Type-safe API calls, automatic retry logic, connection pooling, streaming support
  • Use Cases: Web applications, Node.js backends, React/Vue/Angular frontends
  • Documentation: npm | GitHub

Python Client

Native Python SDK for backend services and data science workflows.

pip install robosystems-client

  • Features: Async/await support, pandas integration, Jupyter compatibility, batch operations
  • Use Cases: Data pipelines, ML workflows, backend services, analytics
  • Documentation: PyPI | GitHub

Documentation

User Guides (Wiki)

Developer Documentation (Codebase)

Core Services:

  • Adapters - External service integrations
  • Operations - Business workflow orchestration, CQRS reads/commands kernels for extensions
  • Schemas - Graph schema definitions
  • Extensions GraphQL - Strawberry GraphQL read surface, Pydantic auto-derivation, resolver patterns
  • Configuration - Configuration management
  • Dagster - Data pipeline and task orchestration

Database Models:

  • Platform Models - SQLAlchemy models for the platform database (users, orgs, graphs, billing, connections, documents)
  • Extensions OLTP Models - SQLAlchemy models for the extensions database (roboledger ledger, roboinvestor portfolios) with schema-per-graph tenancy
  • API Models - Pydantic request/response models for core platform and extensions surfaces

Graph Database System:

Middleware Components:

Infrastructure:

Development Resources:

  • Examples - Runnable demos and integration examples
  • Tests - Testing strategy and organization
  • Admin Tools - Administrative utilities and CLI

Security & Compliance:

  • SECURITY.md - Security features and compliance configuration

API Reference

Support

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

Apache-2.0 © 2026 RFS LLC

About

RoboSystems is a financial intelligence platform that unifies structured data, document search, and AI memory to transform complex financial data into actionable intelligence. Fork-ready with full GitHub Actions CI/CD for deploying CloudFormation infrastructure to your AWS account.
