🧠 NodeBench AI — Multi-Agent Document Intelligence System

Overview
NodeBench AI is a Notion-style editor backed by specialized AI agents. The Fast Agent Panel stitches together document editing, research, spreadsheet templating, media search, and SEC analysis without forcing users to bounce between apps.


🚀 Inspiration

We kept watching knowledge workers hop between docs, spreadsheets, search tabs, and chatbots. The bet behind NodeBench AI is that those workflows should live in one AI-native workspace where agents can understand context and take action immediately.


🧩 How We Built It

Frontend

  • React 19 + Vite + TypeScript, styled with Tailwind CSS utilities and tailwind-merge.
  • BlockNote/TipTap + EditorJS JSON for the rich-text editor, react-spreadsheet for tables, and React Flow for agent timelines and thinking graphs.
  • Storybook + Vitest + Testing Library to keep the UI stable.

Backend & Orchestration (Convex)

  • Convex hosts the real-time data model, document APIs, and the full agent runtime.
  • The Fast Agent Panel streams through convex/fastAgentPanelStreaming.ts using @convex-dev/agent and @convex-dev/persistent-text-streaming for low-latency updates.
  • convex/agents/specializedAgents.ts defines a Coordinator Agent that fans out to Document, Media, SEC, Web, and EntityResearch agents. Each tool call is Zod-validated and runs inside Convex actions/mutations.
  • Tooling lives under convex/tools (document operations, Linkup web search, SEC filings, media analyzers, spreadsheet parsers).
  • Entity research is cached in convex/entityContexts.ts so follow-up questions hit warm data instead of re-calling Linkup every time.
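
The caching idea in the last bullet can be sketched as a small read-through cache with a time-to-live. This is a minimal illustration, not the contents of convex/entityContexts.ts: an in-memory Map stands in for the Convex table, and every name here (EntityContextCache, researchEntity, the TTL value) is hypothetical.

```typescript
// Hypothetical sketch: an in-memory stand-in for a cached entity-context table.
// None of these names are taken from convex/entityContexts.ts.

interface EntityContext {
  entityName: string;
  summary: string;
  fetchedAt: number; // epoch milliseconds
}

class EntityContextCache {
  private readonly entries = new Map<string, EntityContext>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Normalize so "Acme Corp" and " acme corp " share one entry.
  private static keyFor(entityName: string): string {
    return entityName.trim().toLowerCase();
  }

  // Returns a fresh entry, or undefined on a miss or an expired entry.
  get(entityName: string): EntityContext | undefined {
    const key = EntityContextCache.keyFor(entityName);
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() - entry.fetchedAt > this.ttlMs) {
      this.entries.delete(key);
      return undefined;
    }
    return entry;
  }

  put(entityName: string, summary: string): EntityContext {
    const entry: EntityContext = { entityName, summary, fetchedAt: this.now() };
    this.entries.set(EntityContextCache.keyFor(entityName), entry);
    return entry;
  }
}

// Read-through helper: only call the slow research function on a miss.
async function researchEntity(
  cache: EntityContextCache,
  entityName: string,
  fetchFresh: (name: string) => Promise<string>,
): Promise<{ context: EntityContext; cacheHit: boolean }> {
  const cached = cache.get(entityName);
  if (cached !== undefined) return { context: cached, cacheHit: true };
  const summary = await fetchFresh(entityName);
  return { context: cache.put(entityName, summary), cacheHit: false };
}
```

In this sketch a follow-up question about the same entity returns from cache.get without touching the search provider until the TTL lapses.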

AI Stack

  • OpenAI GPT-5 (nano/mini variants) for coordination, planning, and reasoning.
  • Google Gemini 2.0 Flash via convex/genai.ts for structured data extraction, vision analysis, and doc rewriting when OpenAI isn’t ideal.
  • Vercel AI SDK wrappers plus the Model Context Protocol (MCP) bridge (convex/mcp.ts, convex/aiAgents.ts) to call third-party tools like Tavily search.
  • RAG indexing through @convex-dev/rag so document answers stay grounded.
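
The split between the two model families above could be expressed as a routing table. The sketch below is illustrative only: the task kinds, the choice of nano versus mini per task, the model ID strings, and the cross-provider fallback are all assumptions, not the project's actual routing code.

```typescript
// Hypothetical routing table; task kinds and model IDs are illustrative.
type TaskKind =
  | "coordination"
  | "planning"
  | "reasoning"
  | "structured_extraction"
  | "vision"
  | "doc_rewrite";

type Provider = "openai" | "google";

interface ModelChoice {
  provider: Provider;
  model: string;
}

// Assumed mapping: GPT-5 variants plan and reason, Gemini handles
// structured extraction, vision, and document rewriting.
const PRIMARY_ROUTES: Record<TaskKind, ModelChoice> = {
  coordination: { provider: "openai", model: "gpt-5-nano" },
  planning: { provider: "openai", model: "gpt-5-mini" },
  reasoning: { provider: "openai", model: "gpt-5-mini" },
  structured_extraction: { provider: "google", model: "gemini-2.0-flash" },
  vision: { provider: "google", model: "gemini-2.0-flash" },
  doc_rewrite: { provider: "google", model: "gemini-2.0-flash" },
};

// Assumed fallback: if one provider is unavailable, try the other.
const FALLBACK_ROUTES: Record<Provider, ModelChoice> = {
  openai: { provider: "google", model: "gemini-2.0-flash" },
  google: { provider: "openai", model: "gpt-5-mini" },
};

// Returns undefined when neither the primary nor the fallback is usable.
function pickModel(
  kind: TaskKind,
  unavailable: ReadonlySet<Provider> = new Set<Provider>(),
): ModelChoice | undefined {
  const primary = PRIMARY_ROUTES[kind];
  if (!unavailable.has(primary.provider)) return primary;
  const fallback = FALLBACK_ROUTES[primary.provider];
  return unavailable.has(fallback.provider) ? undefined : fallback;
}
```

Keeping the table as data rather than branching logic makes it easy to audit which task goes where and to adjust it when costs change.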

Evaluation & Reliability

  • convex/tools/evaluation provides scripted scenarios, LLM-as-a-judge scoring, and quick regression runs (npm run eval:quick).
  • Streaming transcripts are logged as agent runs (agentRuns + agentRunEvents) for replay and debugging.
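
A scripted-scenario harness with a judge can be sketched in a few functions. This is an assumption-laden outline rather than the code in convex/tools/evaluation: the Scenario and Verdict shapes, the injected Judge and AgentUnderTest functions, and the regression rule (a scenario that passed in the baseline and fails now) are all hypothetical.

```typescript
// Hypothetical evaluation sketch; none of these types come from the project.
interface Scenario {
  id: string;
  prompt: string;
  criteria: string[]; // what the judge should look for in the transcript
}

interface Verdict {
  scenarioId: string;
  passed: boolean;
  reason: string;
}

// Both functions are injected so the harness itself never calls a model.
type AgentUnderTest = (prompt: string) => Promise<string>;
type Judge = (scenario: Scenario, transcript: string) => Promise<Verdict>;

// Runs scenarios one at a time and collects the judge's verdicts.
async function runScenarios(
  scenarios: Scenario[],
  agent: AgentUnderTest,
  judge: Judge,
): Promise<Verdict[]> {
  const verdicts: Verdict[] = [];
  for (const scenario of scenarios) {
    const transcript = await agent(scenario.prompt);
    verdicts.push(await judge(scenario, transcript));
  }
  return verdicts;
}

// Fraction of scenarios that passed; 0 for an empty run.
function passRate(verdicts: Verdict[]): number {
  if (verdicts.length === 0) return 0;
  return verdicts.filter((v) => v.passed).length / verdicts.length;
}

// IDs of scenarios that passed in the baseline but fail in the current run.
function findRegressions(baseline: Verdict[], current: Verdict[]): string[] {
  const passedBefore = new Set(
    baseline.filter((v) => v.passed).map((v) => v.scenarioId),
  );
  return current
    .filter((v) => !v.passed && passedBefore.has(v.scenarioId))
    .map((v) => v.scenarioId);
}
```

Comparing against a stored baseline, instead of only reporting a pass rate, is what lets a quick run flag a behavior change in one specific scenario.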

🧠 What We Learned

  • Structured tools tame hallucinations. Zod schemas and typed tool registries made argument drift obvious and recoverable.
  • Evaluation is a first-class feature. The LLM judge caught behavior regressions that unit tests and manual QA missed.
  • Explainability matters. Streaming thinking steps and React Flow visualizations help users (and us) trust multi-agent decisions.
  • Caching pays off. Entity research caching turned repeated knowledge requests from 10-second waits into instant answers.
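
To make the first lesson concrete, here is a minimal sketch of a typed tool registry that rejects drifted arguments before a tool runs. It deliberately uses a hand-written validator as a dependency-free stand-in for a Zod schema, and the tool name (searchFilings), its argument shape, and the registry helpers are hypothetical.

```typescript
// Hypothetical sketch: hand-written validation standing in for a Zod schema.
type ValidationResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

interface RegisteredTool {
  name: string;
  call: (raw: unknown) => ValidationResult<unknown>;
}

// Pairs a parser with an implementation so the tool only sees valid args.
function defineTool<Args, Result>(
  name: string,
  parse: (raw: unknown) => ValidationResult<Args>,
  run: (args: Args) => Result,
): RegisteredTool {
  return {
    name,
    call: (raw: unknown): ValidationResult<unknown> => {
      const parsed = parse(raw);
      if (!parsed.ok) return parsed;
      return { ok: true, value: run(parsed.value) };
    },
  };
}

interface SearchFilingsArgs {
  ticker: string;
  formType: "10-K" | "10-Q" | "8-K";
  limit: number;
}

function parseSearchFilingsArgs(raw: unknown): ValidationResult<SearchFilingsArgs> {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, errors: ["arguments must be an object"] };
  }
  const fields = raw as Record<string, unknown>;
  const ticker = fields["ticker"];
  const formType = fields["formType"];
  const limit = fields["limit"] ?? 5; // default when the model omits it
  const errors: string[] = [];
  if (typeof ticker !== "string" || ticker.length === 0) {
    errors.push("ticker: expected a non-empty string");
  }
  if (formType !== "10-K" && formType !== "10-Q" && formType !== "8-K") {
    errors.push("formType: expected 10-K, 10-Q, or 8-K");
  }
  if (typeof limit !== "number" || !Number.isInteger(limit) || limit < 1 || limit > 20) {
    errors.push("limit: expected an integer from 1 to 20");
  }
  if (errors.length > 0) return { ok: false, errors };
  return {
    ok: true,
    value: {
      ticker: ticker as string,
      formType: formType as SearchFilingsArgs["formType"],
      limit: limit as number,
    },
  };
}

const toolRegistry = new Map<string, RegisteredTool>();

function registerTool(tool: RegisteredTool): void {
  toolRegistry.set(tool.name, tool);
}

// Errors are returned as data so they can be sent back to the model to retry.
function callTool(name: string, raw: unknown): ValidationResult<unknown> {
  const tool = toolRegistry.get(name);
  if (tool === undefined) return { ok: false, errors: [`unknown tool: ${name}`] };
  return tool.call(raw);
}

registerTool(
  defineTool("searchFilings", parseSearchFilingsArgs, (args) =>
    `search ${args.formType} filings for ${args.ticker} (limit ${args.limit})`),
);
```

Because validation failures come back as a list of field-level messages rather than exceptions, a coordinator could hand them to the model as a correction prompt, which is one way to read "obvious and recoverable".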

⚙️ Challenges

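  1. Coordinating delegation. Getting the coordinator to parallelize document, media, SEC, and research agents required tight tool contracts and aggressive “no-clarification” policies, meaning agents act on their best interpretation of a request instead of stopping to ask follow-up questions.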
  2. Keeping streams consistent. We had to reconcile optimistic UI updates with server responses; PersistentTextStreaming plus run-event logging solved race conditions.
  3. Tool reliability. MCP integrations (Tavily, Linkup) can be flaky, so we built retries, fallback messaging, and evaluation coverage around them.
  4. Cost/latency balance. The Gemini + GPT dual path means every call needs routing heuristics and usage tracking (insertApiUsage) to stay inside budget.
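
The retry-plus-fallback approach from challenge 3 might look roughly like the following. It is a sketch under stated assumptions: exponential backoff with a cap is one common policy and may not be the one the project uses, and callWithRetry, RetryOptions, and the fallback message wording are hypothetical.

```typescript
// Hypothetical retry sketch; the backoff policy and names are assumptions.
interface RetryOptions {
  maxAttempts: number; // should be at least 1
  baseDelayMs: number;
  maxDelayMs: number;
}

// Delay before the next try; attempt is 1-based and the delay doubles each time.
function backoffDelayMs(attempt: number, opts: RetryOptions): number {
  return Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (attempt - 1));
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

type ToolOutcome<T> =
  | { status: "ok"; value: T; attempts: number }
  | { status: "fallback"; message: string; attempts: number };

// Retries a flaky call, then degrades to a user-facing message instead of throwing.
async function callWithRetry<T>(
  toolName: string,
  call: () => Promise<T>,
  opts: RetryOptions,
  wait: (ms: number) => Promise<void> = sleep, // injectable for tests
): Promise<ToolOutcome<T>> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= opts.maxAttempts; attempt++) {
    try {
      return { status: "ok", value: await call(), attempts: attempt };
    } catch (err) {
      lastError = err;
      if (attempt < opts.maxAttempts) {
        await wait(backoffDelayMs(attempt, opts));
      }
    }
  }
  const detail = lastError instanceof Error ? lastError.message : String(lastError);
  return {
    status: "fallback",
    message: `${toolName} is unavailable right now (${detail}). Continuing without it.`,
    attempts: opts.maxAttempts,
  };
}
```

Returning a fallback outcome rather than throwing lets the calling agent keep streaming a partial answer and tell the user which source was skipped.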

🏁 Outcome

NodeBench AI now delivers a unified workspace where agents:

  • Fetch, summarize, and edit documents with grounded citations.
  • Spin up spreadsheet analyses through Convex actions and react-spreadsheet.
  • Run SEC filing searches and download flows end-to-end.
  • Execute cached company/person research with Linkup.
  • Stream reasoning steps directly into the Fast Agent Panel for a transparent UX.

On top of that, the evaluation harness and MCP bridge turned the project into a proving ground for future agent orchestration experiments.

Built With

  • ag-grid
  • blocknote
  • convex
  • convex-agent
  • google-gemini
  • linkup-sdk
  • openai-api
  • react
  • react-data-grid
  • tailwindcss
  • tiptap
  • typescript
  • vercel-ai-sdk
  • vite