
🩻 AI Radiology Assistant

A production-grade, AI-powered radiology chatbot built on Retrieval-Augmented Generation (RAG). It lets radiology professionals, medical researchers, and students ask questions about radiology and medical imaging, and returns accurate, evidence-based answers with verified scientific citations from trusted research sources.


Table of Contents

  1. Project Overview
  2. How the System Works
  3. How to Use the System
  4. Required Inputs
  5. Document Ingestion
  6. Data Sources
  7. Example Workflow
  8. System Requirements
  9. Quick Start
  10. API Reference
  11. Configuration
  12. Project Structure
  13. How to Improve the System
  14. Future Enhancements
  15. Performance Targets
  16. Tech Stack
  17. License

1 — Project Overview

What is AI Radiology Assistant?

AI Radiology Assistant is an intelligent question-answering system designed specifically for the radiology and medical imaging domain. It combines the power of Large Language Models (LLMs) with a Retrieval-Augmented Generation (RAG) pipeline to deliver accurate, context-rich answers backed by verified scientific literature.

Unlike standard chatbots that may fabricate references, this system employs an "LLM-first, verify-second" approach — every citation extracted from the AI response is cross-checked against PubMed and Semantic Scholar before being presented to the user.

Why RAG?

General-purpose LLMs are trained on broad datasets with a knowledge cut-off date. For medical professionals, this creates two critical problems:

| Problem | How RAG Solves It |
| --- | --- |
| Knowledge staleness | RAG retrieves the latest research documents at query time, supplementing the LLM's parametric knowledge |
| Hallucinated citations | RAG grounds the LLM's response in actual retrieved documents, and the citation validator verifies every reference |
| Lack of domain depth | A curated radiology vector database provides specialized, high-quality knowledge that general LLMs may lack |

Use Cases

  • 🏥 Radiology Question Answering — Ask about imaging modalities, diagnostic criteria, and radiological signs
  • 📖 Medical Research Assistance — Find relevant literature with verified citations (DOI, PMID)
  • 🎓 Medical Education — Students can explore radiology concepts with evidence-based explanations
  • 📋 Clinical Knowledge Lookup — Quick reference for differential diagnoses, imaging protocols, and clinical guidelines

Problems This System Solves

  • ❌ LLM hallucinations with fabricated medical references
  • ❌ Difficulty finding relevant radiology literature quickly
  • ❌ Lack of citation verification in AI-generated medical content
  • ❌ Information scattered across multiple databases and journals

2 — How the System Works

The AI Radiology Assistant follows a multi-step pipeline from user query to verified response:

┌─────────────────────────────────────────────────────────────────────┐
│                        SYSTEM WORKFLOW                              │
│                                                                     │
│   User Query                                                        │
│       │                                                             │
│       ▼                                                             │
│   LangChain Agent ──────► Tool Selection                            │
│       │                    (PubMed / Semantic Scholar /              │
│       │                     Tavily / Exa / Vector DB)               │
│       ▼                                                             │
│   Document Retrieval ───► Vector Search (Qdrant)                    │
│       │                                                             │
│       ▼                                                             │
│   Context Assembly ─────► Merge tool results + vector DB context    │
│       │                                                             │
│       ▼                                                             │
│   LLM Response Generation (Claude / Gemini / Groq)                  │
│       │                                                             │
│       ▼                                                             │
│   Citation Verification ► PubMed + Semantic Scholar cross-check     │
│       │                                                             │
│       ▼                                                             │
│   Final Response ────────► Streamed to user with verified citations │
└─────────────────────────────────────────────────────────────────────┘

Step-by-Step Breakdown

Step 1: User Query

The user submits a radiology-related question through the Next.js chat interface. The query is sent to the FastAPI backend via a POST /api/chat request.

Step 2: LangChain Agent

The RadiologyAgent receives the query and initializes with a system prompt that enforces evidence-based, citation-rich responses. The agent is equipped with multiple tools and decides which ones to invoke.

Step 3: Tool Selection

The agent intelligently selects the most appropriate tools based on the query type:

  • PubMed — for clinical evidence and peer-reviewed studies
  • Semantic Scholar — for academic papers and citation data
  • Tavily — for recent medical guidelines and protocols
  • Exa.ai — for deep research queries

Step 4: Document Retrieval (Vector Search)

The RadiologyRetriever queries the Qdrant vector database using cosine similarity to find the most relevant documents from the ingested knowledge base. Documents are embedded using Cohere or OpenAI embedding models.

Step 5: Context Assembly

Retrieved documents from the vector database and tool results are merged into a unified context. Each source is tagged with its origin (vector DB, PubMed, Semantic Scholar, etc.).

Step 6: LLM Response Generation

The assembled context and user query are sent to the configured LLM (Claude, Gemini, or Groq). The response is streamed back in real-time using Server-Sent Events (SSE).

Step 7: Citation Verification

The CitationValidator extracts all references from the LLM's response using regex patterns (DOI, PMID, bracketed references), then batch-verifies them:

  1. PubMed verification — checks PMID against the PubMed E-Utilities API
  2. Semantic Scholar DOI verification — validates DOI against the S2 Academic Graph API
  3. Semantic Scholar title verification — fuzzy-matches paper titles

All verifications run concurrently with a 14-second timeout budget.
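
The extraction step can be sketched with simple regular expressions. The patterns below are illustrative stand-ins, not necessarily the ones `CitationValidator` actually uses:

```python
import re

# Illustrative patterns -- the real CitationValidator may use different ones.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")
PMID_RE = re.compile(r"\bPMID:?\s*(\d{4,9})\b")
BRACKET_RE = re.compile(r"\[(\d{1,3})\]")  # bracketed markers such as [1]

def extract_citations(text: str) -> dict:
    """Pull candidate DOIs, PMIDs, and bracketed reference numbers from a response."""
    return {
        "dois": DOI_RE.findall(text),
        "pmids": PMID_RE.findall(text),
        "refs": BRACKET_RE.findall(text),
    }

sample = 'See [1] Smith et al. DOI: 10.1148/radiol.2023 and [2] PMID: 35912847.'
print(extract_citations(sample)["pmids"])  # -> ['35912847']
```

Each extracted identifier then feeds the batch verification calls described above.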

Step 8: Final Response

The verified response with citations, sources, and metadata is streamed to the frontend, where it is rendered with Markdown formatting and clickable citation links.


3 — How to Use the System

Interacting with the AI Assistant

  1. Open the chat interface — Navigate to http://localhost:3000 in your browser
  2. Ask a radiology question — Type your question in the input field, or click one of the suggestion chips
  3. The system retrieves knowledge — The agent searches multiple sources and the vector database
  4. The LLM generates an answer — A streaming response appears in real-time
  5. Verified citations are shown — Each reference is displayed with title, authors, journal, year, DOI/PMID, and a verification badge

Example Questions

| Category | Example Question |
| --- | --- |
| Diagnostic Imaging | "What are the CT findings of pulmonary embolism?" |
| MRI Interpretation | "Explain MRI features of multiple sclerosis" |
| Differential Diagnosis | "What are common causes of ground-glass opacities?" |
| Comparative Analysis | "What are the key differences between CT and MRI for brain imaging?" |
| Classification Systems | "Explain the BI-RADS classification system in mammography" |
| Staging & Oncology | "What is the role of PET-CT in oncology staging?" |
| Plain Radiography | "Describe common chest X-ray findings in pneumonia" |

4 — Required Inputs

User Inputs

The system accepts natural language queries related to:

  • Radiology questions — imaging findings, modalities, diagnostic criteria
  • Medical imaging queries — CT, MRI, X-ray, ultrasound, PET, SPECT
  • Research questions — latest guidelines, evidence comparisons, clinical protocols

Queries must be between 1 and 2,000 characters in length.
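
A minimal sketch of that bound check (illustrative; the project enforces it through its Pydantic request schemas):

```python
def validate_query(query: str) -> str:
    """Reject queries outside the documented 1-2,000 character bounds."""
    q = query.strip()
    if not 1 <= len(q) <= 2000:
        raise ValueError("query must be between 1 and 2,000 characters")
    return q
```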

System Inputs

For the RAG pipeline to function effectively, the system requires:

| Input Type | Description | Examples |
| --- | --- | --- |
| Knowledge Base | Radiology documents ingested into Qdrant | Textbook chapters, research papers, guidelines |
| API Keys | Authentication for LLMs and search tools | Groq, Cohere, Tavily, Exa.ai API keys |
| Configuration | System behavior settings | LLM provider, embedding model, retrieval parameters |

How Inputs Are Processed

  1. User query is embedded using the configured embedding model (Cohere / OpenAI)
  2. Cosine similarity search finds the top-k most relevant documents (default: 5, threshold: 0.45)
  3. Tool results are fetched from external APIs based on agent decisions
  4. All inputs are assembled into a structured context prompt for the LLM
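
The retrieval in steps 1-2 boils down to a thresholded top-k cosine search. A dependency-free sketch (the real retriever delegates this to Qdrant):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query_vec: list[float], docs: list[tuple[str, list[float]]],
          k: int = 5, threshold: float = 0.45) -> list[tuple[float, str]]:
    """Return up to k documents scoring at or above the threshold, best first
    (k=5 and threshold=0.45 mirror the documented defaults)."""
    scored = [(cosine(query_vec, vec), doc) for doc, vec in docs]
    scored = [pair for pair in scored if pair[0] >= threshold]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```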

5 — Document Ingestion

Supported Document Types

| Format | Use Case |
| --- | --- |
| PDF | Research papers, radiology textbooks, clinical reports |
| Plain Text | Extracted content, notes, guidelines |
| HTML | Medical articles, web-based resources |
| Markdown | Documentation, structured medical content |

Ingestion Pipeline

Upload Document → Extract Text → Chunk Content → Generate Embeddings → Store in Qdrant

Step 1: Upload Document

Documents can be ingested via:

  • The CLI ingestion script (scripts/ingest_data.py)
  • The REST API (POST /api/ingest)

Step 2: Extract Text

Raw text is extracted from the document. For PDFs, text extraction happens during pre-processing.

Step 3: Chunk the Content

Text is split into overlapping chunks for optimal retrieval:

  • Max chunk size: 800 characters
  • Overlap: 100 characters
  • Splitting strategy: Paragraph-based chunking with character-count limits
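
A character-window chunker with those limits might look like the following. It is a simplified sketch: the project's splitter is paragraph-aware, while this one only enforces the size and overlap budgets:

```python
def chunk_text(text: str, max_size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into chunks of at most max_size characters,
    carrying `overlap` trailing characters into the next chunk."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # re-use the tail of the previous chunk
    return chunks
```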

Step 4: Generate Embeddings

Each chunk is converted into a dense vector using the configured embedding model:

  • Cohere embed-english-v3.0 (1024 dimensions) — default
  • OpenAI embedding models — alternative

An in-memory LRU cache (up to 2,048 entries) avoids re-embedding identical text.
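
That cache behaves much like `functools.lru_cache`; a sketch with a stand-in embedding function (the real service calls Cohere or OpenAI and keys the cache on the chunk text):

```python
from functools import lru_cache

@lru_cache(maxsize=2048)  # mirrors the documented 2,048-entry budget
def embed(text: str) -> tuple[float, ...]:
    """Stand-in for a real embedding call: a deterministic fake 4-d vector."""
    return tuple((hash((text, i)) % 1000) / 1000 for i in range(4))

embed("CT chest")           # miss -> computed
embed("CT chest")           # hit  -> served from the cache
print(embed.cache_info())   # hits=1, misses=1
```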

Step 5: Store in Qdrant

Vectors are upserted into the Qdrant collection with rich metadata:

{
  "text": "chunk content...",
  "source": "Radiology Journal 2024",
  "publication": "Paper Title",
  "metadata": { ... }
}
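
Assembling such a point before upsert can be sketched with a small helper (`make_point` is a hypothetical name; the payload fields follow the example above):

```python
import uuid

def make_point(chunk: str, vector: list[float], source: str,
               publication: str, **metadata) -> dict:
    """Build a Qdrant-style point: an id, the embedding, and the payload shown above."""
    return {
        "id": str(uuid.uuid4()),
        "vector": vector,
        "payload": {
            "text": chunk,
            "source": source,
            "publication": publication,
            "metadata": metadata,
        },
    }
```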

CLI Usage

# Ingest a single file
python scripts/ingest_data.py --file path/to/paper.pdf --source "Radiology Journal"

# Ingest a directory of documents
python scripts/ingest_data.py --dir path/to/docs/

# Ingest inline text
python scripts/ingest_data.py --text "CT imaging shows..." --source "Manual Entry"

API Usage

curl -X POST http://localhost:8000/api/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your medical document text...",
    "source": "Radiology Textbook",
    "publication": "Fundamentals of Radiology",
    "metadata": {"chapter": "3", "topic": "MRI"}
  }'

6 — Data Sources

The system combines multiple high-quality data sources to ensure comprehensive and reliable answers:

External Research APIs

| Source | Purpose | API | Key Required |
| --- | --- | --- | --- |
| 🏥 PubMed | Peer-reviewed medical literature | NCBI E-Utilities | No (≤3 req/s) |
| 📚 Semantic Scholar | Academic papers + citation data | S2 Academic Graph | No (≤100 req/5 min) |
| 🌐 Tavily | Recent guidelines & protocols | Tavily Search API | Yes |
| 🔬 Exa.ai | Deep research-grade search | Exa API | Yes |

Internal Knowledge Base

| Source | Storage | Description |
| --- | --- | --- |
| 📄 Qdrant Vector DB | radiology_docs collection | Ingested radiology documents, research papers, textbooks, and clinical guidelines |

Why These Sources Matter

  • PubMed is the gold standard for biomedical literature — it provides peer-reviewed, MEDLINE-indexed articles with DOI and PMID identifiers
  • Semantic Scholar adds citation context, helping assess a paper's impact and reliability
  • Tavily and Exa fill gaps with recent clinical guidelines and up-to-date protocols that may not yet be indexed in PubMed
  • The internal vector database provides instant, low-latency retrieval of curated domain-specific knowledge

7 — Example Workflow

Scenario

User asks: "What are the radiological signs of pneumothorax?"

System Processing

1. Vector Database Search

The query is embedded and compared against the radiology_docs collection in Qdrant. The retriever returns the top-5 most similar document chunks (cosine similarity ≥ 0.45):

[Vector DB Result 1] "Pneumothorax appears as a visceral pleural line..."
[Vector DB Result 2] "On upright chest X-ray, pneumothorax is seen as..."

2. Agent Tool Calls

The LangChain agent decides to invoke PubMed and Semantic Scholar for additional evidence:

  • PubMed search: "radiological signs pneumothorax imaging" → returns 5 peer-reviewed articles
  • Semantic Scholar search: "pneumothorax radiology" → returns 5 academic papers with citation data

3. Context Assembly

All results are merged into a structured context:

  • 5 vector DB chunks
  • 5 PubMed articles (title, abstract, DOI, PMID)
  • 5 Semantic Scholar papers (title, authors, citation count)

4. LLM Response Generation

The LLM generates a comprehensive answer using the assembled context:

"Pneumothorax can be identified on imaging through several key radiological signs:

1. Visceral pleural line: A thin white line visible on chest X-ray...
2. Absent lung markings: Beyond the visceral pleural line...
3. Deep sulcus sign: On supine radiographs...

References:
[1] Smith et al. "Imaging of Pneumothorax." Radiology (2023). DOI: 10.1148/...
[2] Johnson et al. "Emergency Chest Imaging." AJR (2022). PMID: 35912847"

5. Citation Verification

The CitationValidator extracts references [1] and [2]:

  • [1] DOI 10.1148/... → verified via Semantic Scholar ✅
  • [2] PMID 35912847 → verified via PubMed ✅

The final response is streamed to the user with verified badges on each citation.


8 — System Requirements

Backend

| Component | Requirement |
| --- | --- |
| Language | Python 3.11+ |
| Framework | FastAPI |
| AI Framework | LangChain |
| Validation | Pydantic v2 / pydantic-settings |
| HTTP Client | httpx (async) |
| Logging | structlog (JSON in production, colored console in development) |

Frontend

| Component | Requirement |
| --- | --- |
| Framework | Next.js 16 |
| UI Library | React 19 |
| Language | TypeScript |
| Styling | Tailwind CSS v4 |
| Markdown | react-markdown |
| Theme | next-themes (dark/light mode) |

Vector Database

| Component | Requirement |
| --- | --- |
| Database | Qdrant |
| Protocol | gRPC / HTTP (port 6333) |
| Distance Metric | Cosine Similarity |
| Embedding Dimension | 1024 (configurable) |

LLM Providers (choose one)

| Provider | Model | Key Required |
| --- | --- | --- |
| Anthropic | Claude Sonnet | Yes |
| Google | Gemini 2.0 Flash | Yes |
| Groq | Llama 3.3 70B Versatile | Yes (default) |

Embedding Providers (choose one)

| Provider | Model | Dimension |
| --- | --- | --- |
| Cohere (default) | embed-english-v3.0 | 1024 |
| OpenAI | Configurable | Configurable |

Infrastructure

| Component | Requirement |
| --- | --- |
| Container Runtime | Docker & Docker Compose |
| Node.js | 18+ (20 recommended) |
| RAM | 4 GB minimum |

9 — Quick Start

Prerequisites

  • Python 3.11+
  • Node.js 18+
  • Docker & Docker Compose

1. Clone & Configure

git clone https://github.com/pravin-python/AI-Radiology-Assistant.git
cd AI-Radiology-Assistant

# Windows
copy .env.example .env

# Linux / macOS
cp .env.example .env

# Edit .env and add your API keys

2. Run with Docker (Recommended)

docker compose -f docker/docker-compose.yml up --build

This starts:

  • Qdrant on http://localhost:6333
  • Backend on http://localhost:8000 (API docs at /docs)
  • Frontend on http://localhost:3000

3. Run Locally (Development)

Backend:

cd backend
pip install -r ../requirements.txt
uvicorn app.main:app --reload --port 8000

Frontend:

cd frontend
npm install
npm run dev

Qdrant (via Docker):

docker run -p 6333:6333 qdrant/qdrant

10 — API Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/chat | Send a query and receive an AI response (SSE stream or JSON) |
| POST | /api/ingest | Upload documents for embedding into the vector store |
| GET | /api/health | Service health check (Qdrant status, LLM provider) |
| GET | /api/sources | Return collection metadata and availability |

POST /api/chat

Request Body:

{
  "query": "What are the CT findings of pulmonary embolism?",
  "conversation_id": "optional-uuid",
  "stream": true,
  "history": []
}

Response (SSE stream):

data: {"event": "token", "data": "Pulmonary"}
data: {"event": "token", "data": " embolism"}
data: {"event": "sources", "data": "[...]"}
data: {"event": "citations", "data": "[...]"}
data: {"event": "done", "data": ""}
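
A client can fold that stream back into a complete answer by parsing each `data:` line and dispatching on the `event` field. A minimal Python sketch, assuming exactly the event names shown above:

```python
import json

def parse_sse(raw: str) -> dict:
    """Collect token events into answer text; keep sources/citations separately."""
    answer, extras = [], {}
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue
        event = json.loads(line[len("data:"):].strip())
        if event["event"] == "token":
            answer.append(event["data"])
        elif event["event"] in ("sources", "citations"):
            extras[event["event"]] = event["data"]
    return {"answer": "".join(answer), **extras}

stream = "\n".join([
    'data: {"event": "token", "data": "Pulmonary"}',
    'data: {"event": "token", "data": " embolism"}',
    'data: {"event": "done", "data": ""}',
])
print(parse_sse(stream)["answer"])  # -> Pulmonary embolism
```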

POST /api/ingest

Request Body:

{
  "text": "Document content...",
  "source": "Radiology Journal",
  "publication": "Paper Title",
  "metadata": {}
}

Response:

{
  "document_id": "abc123...",
  "chunks_stored": 12,
  "collection": "radiology_docs"
}

11 — Configuration

All configuration is managed via environment variables. See .env.example for the full list:

| Variable | Description | Default |
| --- | --- | --- |
| DEFAULT_LLM_PROVIDER | LLM backend: anthropic, google, groq | groq |
| ANTHROPIC_API_KEY | Anthropic (Claude) API key | |
| GOOGLE_API_KEY | Google (Gemini) API key | |
| GROQ_API_KEY | Groq API key | |
| EMBEDDING_PROVIDER | Embedding model: cohere, openai | cohere |
| COHERE_API_KEY | Cohere embedding API key | |
| EMBEDDING_MODEL | Embedding model name | embed-english-v3.0 |
| EMBEDDING_DIMENSION | Vector dimension | 1024 |
| QDRANT_HOST | Qdrant server host | localhost |
| QDRANT_PORT | Qdrant server port | 6333 |
| QDRANT_COLLECTION | Qdrant collection name | radiology_docs |
| TAVILY_API_KEY | Tavily search API key | |
| EXA_API_KEY | Exa.ai search API key | |
| APP_ENV | Environment: development, staging, production | development |
| LOG_LEVEL | Logging level | INFO |
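
Reading these variables with their documented defaults can be sketched with the standard library alone (the project itself uses pydantic-settings, so the actual field names may differ):

```python
import os
from dataclasses import dataclass, field

def _env(name: str, default: str) -> str:
    return os.getenv(name, default)

@dataclass
class Settings:
    """A hypothetical mirror of the table above, with the same defaults."""
    llm_provider: str = field(default_factory=lambda: _env("DEFAULT_LLM_PROVIDER", "groq"))
    embedding_provider: str = field(default_factory=lambda: _env("EMBEDDING_PROVIDER", "cohere"))
    embedding_dimension: int = field(default_factory=lambda: int(_env("EMBEDDING_DIMENSION", "1024")))
    qdrant_host: str = field(default_factory=lambda: _env("QDRANT_HOST", "localhost"))
    qdrant_port: int = field(default_factory=lambda: int(_env("QDRANT_PORT", "6333")))
    collection: str = field(default_factory=lambda: _env("QDRANT_COLLECTION", "radiology_docs"))
```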

12 — Project Structure

AI-Radiology-Assistant/
├── backend/
│   └── app/
│       ├── main.py                     # FastAPI entry point
│       ├── config/
│       │   └── settings.py             # Pydantic settings (env vars)
│       ├── api/
│       │   └── routes.py               # REST endpoints (/chat, /ingest, /health, /sources)
│       ├── agents/
│       │   └── radiology_agent.py      # LangChain agent with tool calling
│       ├── rag/
│       │   ├── embeddings.py           # Cohere/OpenAI embedding service with cache
│       │   ├── retriever.py            # LangChain-compatible async retriever
│       │   ├── vector_store.py         # Qdrant async wrapper (upsert, search, health)
│       │   └── pipeline.py             # Full RAG pipeline orchestrator
│       ├── tools/
│       │   ├── pubmed_tool.py          # PubMed NCBI E-Utilities search
│       │   ├── semantic_scholar_tool.py # Semantic Scholar Academic Graph search
│       │   ├── tavily_tool.py          # Tavily web search for medical context
│       │   └── exa_tool.py             # Exa.ai research-grade search
│       ├── services/
│       │   └── citation_validator.py   # LLM-first, verify-second citation checker
│       ├── models/
│       │   └── schemas.py              # Pydantic request/response models
│       └── utils/
│           └── logger.py               # structlog configuration
├── frontend/
│   └── src/
│       ├── app/
│       │   ├── layout.tsx              # Root layout with theme support
│       │   └── page.tsx                # Main page
│       ├── components/
│       │   ├── ChatUI.tsx              # Chat interface with streaming
│       │   ├── MessageBubble.tsx        # Message rendering with Markdown + citations
│       │   ├── Header.tsx              # App header with dark mode toggle
│       │   └── Providers.tsx           # Theme provider wrapper
│       └── lib/
│           └── api.ts                  # API client with SSE streaming support
├── docker/
│   ├── Dockerfile.backend              # Multi-stage Python backend image
│   ├── Dockerfile.frontend             # Multi-stage Next.js frontend image
│   └── docker-compose.yml              # Full-stack orchestration
├── scripts/
│   └── ingest_data.py                  # CLI document ingestion tool
├── .github/
│   └── workflows/
│       └── ci.yml                      # CI pipeline (lint, build, Docker)
├── requirements.txt                    # Python dependencies
├── .env.example                        # Environment variable template
├── .editorconfig                       # Editor formatting rules
├── .pre-commit-config.yaml             # Git pre-commit hooks
└── LICENSE                             # MIT License

13 — How to Improve the System

| Improvement | Description | Impact |
| --- | --- | --- |
| 📊 Better Radiology Datasets | Ingest comprehensive radiology textbooks (e.g., Grainger & Allison, Brant & Helms) and curated open-access repositories (RadioGraphics, RSNA Case Collection) | Higher retrieval quality and more accurate answers |
| 🧠 Fine-Tuned Medical LLMs | Use LLMs specifically fine-tuned on medical/radiology data (e.g., Med-PaLM, BioMistral) instead of general-purpose models | More accurate medical terminology and reasoning |
| 🔀 Hybrid Search | Combine vector similarity search with BM25 keyword search for better retrieval recall | Fewer missed relevant documents |
| 📈 Improved Retrieval Ranking | Implement re-ranking using cross-encoder models (e.g., Cohere Rerank) to order retrieved documents by relevance | More contextually relevant responses |
| 🖼️ Image-Based Analysis | Add support for uploading radiology images (X-ray, CT slices) and using vision LLMs (GPT-4V, Gemini Vision) for analysis | Multi-modal radiology assistance |
| 🕸️ Knowledge Graph Integration | Build a medical knowledge graph connecting diseases, imaging findings, modalities, and differential diagnoses | Structured reasoning and relationship discovery |
| Caching & Performance | Add Redis caching for frequent queries, implement semantic caching for similar questions | Faster response times and lower API costs |
| 🔍 Citation Pipeline Improvements | Add CrossRef API for DOI resolution, support ORCID author verification, include impact factor data | Higher citation reliability and richer metadata |
| 🌍 Multi-Language Support | Add medical query translation and multi-language response generation | Broader accessibility for international users |

14 — Future Enhancements

| Feature | Description |
| --- | --- |
| 🏗️ DICOM Image Analysis | Direct analysis of DICOM medical images with AI-powered findings detection |
| 🔗 PACS Integration | Connect to hospital Picture Archiving and Communication Systems for seamless clinical workflow |
| 🎙️ Voice Interface | Voice input and text-to-speech output for hands-free operation during clinical work |
| 🏥 Clinical Decision Support | Integrate with clinical workflows to provide differential diagnoses and suggest imaging protocols |
| 👤 Personalized Assistant | Learn from user preferences, specialization area, and interaction history for tailored responses |
| 🤖 Multi-Agent System | Specialized agents for different radiology subspecialties (neuroradiology, musculoskeletal, cardiothoracic) working collaboratively |
| 📊 Analytics Dashboard | Usage analytics, query patterns, and knowledge gap identification for continuous improvement |
| 🔒 HIPAA Compliance | Enterprise-grade security features for deployment in clinical environments |

15 — Performance Targets

| Metric | Target |
| --- | --- |
| Citation verification | < 15 seconds |
| Vector retrieval latency | < 2 seconds |
| External API calls | Fully async with configurable timeouts |
| Embedding operations | Batch processing with LRU cache (2,048 entries) |
| Agent tool iterations | Maximum 5 per query |

🛠 Tech Stack

Backend: Python · FastAPI · LangChain · Pydantic · AsyncIO · Qdrant · structlog
Frontend: Next.js · React · TypeScript · Tailwind CSS · react-markdown
AI/LLM: Claude · Gemini · Groq
Embeddings: Cohere · OpenAI
Search APIs: PubMed · Semantic Scholar · Tavily · Exa.ai
Infrastructure: Docker · Docker Compose · GitHub Actions CI


📝 License

This project is licensed under the MIT License. See the LICENSE file for details.


Built with ❤️ for the radiology community
