A production-grade, AI-powered radiology chatbot using Retrieval-Augmented Generation (RAG) that helps radiology professionals, medical researchers, and students ask radiology and medical imaging questions — and receive accurate, evidence-based answers with verified scientific citations from trusted research sources.
- Project Overview
- How the System Works
- How to Use the System
- Required Inputs
- Document Ingestion
- Data Sources
- Example Workflow
- System Requirements
- Quick Start
- API Reference
- Configuration
- Project Structure
- How to Improve the System
- Future Enhancements
- License
AI Radiology Assistant is an intelligent question-answering system designed specifically for the radiology and medical imaging domain. It combines the power of Large Language Models (LLMs) with a Retrieval-Augmented Generation (RAG) pipeline to deliver accurate, context-rich answers backed by verified scientific literature.
Unlike standard chatbots that may fabricate references, this system employs an "LLM-first, verify-second" approach — every citation extracted from the AI response is cross-checked against PubMed and Semantic Scholar before being presented to the user.
General-purpose LLMs are trained on broad datasets with a knowledge cut-off date. For medical professionals, this creates three critical problems:
| Problem | How RAG Solves It |
|---|---|
| Knowledge staleness | RAG retrieves the latest research documents at query time, supplementing the LLM's parametric knowledge |
| Hallucinated citations | RAG grounds the LLM's response in actual retrieved documents, and the citation validator verifies every reference |
| Lack of domain depth | A curated radiology vector database provides specialized, high-quality knowledge that general LLMs may lack |
- 🏥 Radiology Question Answering — Ask about imaging modalities, diagnostic criteria, and radiological signs
- 📖 Medical Research Assistance — Find relevant literature with verified citations (DOI, PMID)
- 🎓 Medical Education — Students can explore radiology concepts with evidence-based explanations
- 📋 Clinical Knowledge Lookup — Quick reference for differential diagnoses, imaging protocols, and clinical guidelines
- ❌ LLM hallucinations with fabricated medical references
- ❌ Difficulty finding relevant radiology literature quickly
- ❌ Lack of citation verification in AI-generated medical content
- ❌ Information scattered across multiple databases and journals
The AI Radiology Assistant follows a multi-step pipeline from user query to verified response:
```
                           SYSTEM WORKFLOW

User Query
    │
    ▼
LangChain Agent ──────► Tool Selection
    │                   (PubMed / Semantic Scholar / Tavily / Exa / Vector DB)
    ▼
Document Retrieval ───► Vector Search (Qdrant)
    │
    ▼
Context Assembly ─────► Merge tool results + vector DB context
    │
    ▼
LLM Response Generation (Claude / Gemini / Groq)
    │
    ▼
Citation Verification ► PubMed + Semantic Scholar cross-check
    │
    ▼
Final Response ───────► Streamed to user with verified citations
```
The user submits a radiology-related question through the Next.js chat interface. The query is sent to the FastAPI backend via a `POST /api/chat` request.
The RadiologyAgent receives the query and initializes with a system prompt that enforces evidence-based, citation-rich responses. The agent is equipped with multiple tools and decides which ones to invoke.
The agent intelligently selects the most appropriate tools based on the query type:
- PubMed — for clinical evidence and peer-reviewed studies
- Semantic Scholar — for academic papers and citation data
- Tavily — for recent medical guidelines and protocols
- Exa.ai — for deep research queries
The RadiologyRetriever queries the Qdrant vector database using cosine similarity to find the most relevant documents from the ingested knowledge base. Documents are embedded using Cohere or OpenAI embedding models.
Retrieved documents from the vector database and tool results are merged into a unified context. Each source is tagged with its origin (vector DB, PubMed, Semantic Scholar, etc.).
The assembled context and user query are sent to the configured LLM (Claude, Gemini, or Groq). The response is streamed back in real-time using Server-Sent Events (SSE).
The CitationValidator extracts all references from the LLM's response using regex patterns (DOI, PMID, bracketed references), then batch-verifies them:
- PubMed verification — checks PMID against the PubMed E-Utilities API
- Semantic Scholar DOI verification — validates DOI against the S2 Academic Graph API
- Semantic Scholar title verification — fuzzy-matches paper titles
All verifications run concurrently with a 14-second timeout budget.
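The extraction step described above can be sketched with stdlib regular expressions. This is an illustrative approximation, not the project's actual patterns — the real `CitationValidator` may use stricter rules:

```python
import re

# Hypothetical patterns approximating a DOI / PMID / bracketed-reference
# extractor; the project's actual regexes may differ.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"'<>]+")
PMID_RE = re.compile(r"\bPMID:?\s*(\d{6,9})\b", re.IGNORECASE)
BRACKET_RE = re.compile(r"\[(\d{1,3})\]")

def extract_citations(answer: str) -> dict:
    """Pull candidate identifiers out of an LLM answer for verification."""
    return {
        "dois": DOI_RE.findall(answer),
        "pmids": PMID_RE.findall(answer),
        "brackets": BRACKET_RE.findall(answer),
    }
```

Each extracted identifier would then be handed to the appropriate verifier (PubMed for PMIDs, Semantic Scholar for DOIs and titles).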
The verified response with citations, sources, and metadata is streamed to the frontend, where it is rendered with Markdown formatting and clickable citation links.
- Open the chat interface — Navigate to `http://localhost:3000` in your browser
- Ask a radiology question — Type your question in the input field, or click one of the suggestion chips
- The system retrieves knowledge — The agent searches multiple sources and the vector database
- The LLM generates an answer — A streaming response appears in real-time
- Verified citations are shown — Each reference is displayed with title, authors, journal, year, DOI/PMID, and a verification badge
| Category | Example Question |
|---|---|
| Diagnostic Imaging | "What are the CT findings of pulmonary embolism?" |
| MRI Interpretation | "Explain MRI features of multiple sclerosis" |
| Differential Diagnosis | "What are common causes of ground-glass opacities?" |
| Comparative Analysis | "What are the key differences between CT and MRI for brain imaging?" |
| Classification Systems | "Explain the BI-RADS classification system in mammography" |
| Staging & Oncology | "What is the role of PET-CT in oncology staging?" |
| Plain Radiography | "Describe common chest X-ray findings in pneumonia" |
The system accepts natural language queries related to:
- Radiology questions — imaging findings, modalities, diagnostic criteria
- Medical imaging queries — CT, MRI, X-ray, ultrasound, PET, SPECT
- Research questions — latest guidelines, evidence comparisons, clinical protocols
Queries must be between 1 and 2,000 characters in length.
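Since the backend uses Pydantic v2, the length constraint can be enforced at the schema level. This is a minimal sketch — the field name `query` matches the API examples later in this README, but the class name is an assumption, not the project's actual schema:

```python
from pydantic import BaseModel, Field

class ChatQuery(BaseModel):
    # Enforce the documented 1–2,000 character limit; out-of-range
    # queries raise a ValidationError before reaching the agent.
    query: str = Field(min_length=1, max_length=2000)
```

With this in place, FastAPI rejects empty or oversized queries automatically with a 422 response.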
For the RAG pipeline to function effectively, the system requires:
| Input Type | Description | Examples |
|---|---|---|
| Knowledge Base | Radiology documents ingested into Qdrant | Textbook chapters, research papers, guidelines |
| API Keys | Authentication for LLMs and search tools | Groq, Cohere, Tavily, Exa.ai API keys |
| Configuration | System behavior settings | LLM provider, embedding model, retrieval parameters |
- User query is embedded using the configured embedding model (Cohere / OpenAI)
- Cosine similarity search finds the top-k most relevant documents (default: 5, threshold: 0.45)
- Tool results are fetched from external APIs based on agent decisions
- All inputs are assembled into a structured context prompt for the LLM
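The retrieval step above boils down to a cosine-similarity top-k search with a score threshold. The real system delegates this to Qdrant; the following pure-Python sketch just illustrates the scoring logic with the documented defaults (k=5, threshold 0.45):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, docs, k=5, threshold=0.45):
    """docs: list of (chunk_text, vector) pairs.
    Returns up to k (score, chunk_text) pairs above the threshold,
    highest score first — mirroring Qdrant's search semantics."""
    scored = [(cosine(query_vec, v), t) for t, v in docs]
    scored = [(s, t) for s, t in scored if s >= threshold]
    return sorted(scored, reverse=True)[:k]
```

In production the same parameters are passed to Qdrant's search call, which performs this over the indexed collection instead of an in-memory list.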
| Format | Use Case |
|---|---|
| PDF | Research papers, radiology textbooks, clinical reports |
| Plain Text | Extracted content, notes, guidelines |
| HTML | Medical articles, web-based resources |
| Markdown | Documentation, structured medical content |
Upload Document → Extract Text → Chunk Content → Generate Embeddings → Store in Qdrant
Documents can be ingested via:
- The CLI ingestion script (`scripts/ingest_data.py`)
- The REST API (`POST /api/ingest`)
Raw text is extracted from the document. For PDFs, text extraction happens during pre-processing.
Text is split into overlapping chunks for optimal retrieval:
- Max chunk size: 800 characters
- Overlap: 100 characters
- Splitting strategy: Paragraph-based chunking with character-count limits
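The splitting strategy above can be sketched as follows. This is an illustrative chunker matching the documented limits (800-character chunks, 100-character overlap, paragraph-aware); the project's actual splitter may differ in detail:

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text on paragraph boundaries into overlapping chunks."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            # carry the tail of the previous chunk forward as overlap,
            # so boundary sentences appear in both neighbouring chunks
            current = current[-overlap:]
        current = (current + "\n\n" + para).strip() if current else para
    if current:
        chunks.append(current)
    return chunks
```

The overlap means a fact straddling a chunk boundary is still retrievable from at least one chunk.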
Each chunk is converted into a dense vector using the configured embedding model:
- Cohere `embed-english-v3.0` (1024 dimensions) — default
- OpenAI embedding models — alternative
An in-memory LRU cache (up to 2,048 entries) avoids re-embedding identical text.
Vectors are upserted into the Qdrant collection with rich metadata:
```json
{
  "text": "chunk content...",
  "source": "Radiology Journal 2024",
  "publication": "Paper Title",
  "metadata": { ... }
}
```

```bash
# Ingest a single file
python scripts/ingest_data.py --file path/to/paper.pdf --source "Radiology Journal"

# Ingest a directory of documents
python scripts/ingest_data.py --dir path/to/docs/

# Ingest inline text
python scripts/ingest_data.py --text "CT imaging shows..." --source "Manual Entry"
```

```bash
curl -X POST http://localhost:8000/api/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your medical document text...",
    "source": "Radiology Textbook",
    "publication": "Fundamentals of Radiology",
    "metadata": {"chapter": "3", "topic": "MRI"}
  }'
```

The system combines multiple high-quality data sources to ensure comprehensive and reliable answers:
| Source | Purpose | API | Key Required |
|---|---|---|---|
| 🏥 PubMed | Peer-reviewed medical literature | NCBI E-Utilities | No (≤3 req/s) |
| 📚 Semantic Scholar | Academic papers + citation data | S2 Academic Graph | No (≤100 req/5 min) |
| 🌐 Tavily | Recent guidelines & protocols | Tavily Search API | Yes |
| 🔬 Exa.ai | Deep research-grade search | Exa API | Yes |
| Source | Storage | Description |
|---|---|---|
| 📄 Qdrant Vector DB | `radiology_docs` collection | Ingested radiology documents, research papers, textbooks, and clinical guidelines |
- PubMed is the gold standard for biomedical literature — it provides peer-reviewed, MEDLINE-indexed articles with DOI and PMID identifiers
- Semantic Scholar adds citation context, helping assess a paper's impact and reliability
- Tavily and Exa fill gaps with recent clinical guidelines and up-to-date protocols that may not yet be indexed in PubMed
- The internal vector database provides instant, low-latency retrieval of curated domain-specific knowledge
User asks: "What are the radiological signs of pneumothorax?"
The query is embedded and compared against the `radiology_docs` collection in Qdrant. The retriever returns the top-5 most similar document chunks (cosine similarity ≥ 0.45):

```
[Vector DB Result 1] "Pneumothorax appears as a visceral pleural line..."
[Vector DB Result 2] "On upright chest X-ray, pneumothorax is seen as..."
```
The LangChain agent decides to invoke PubMed and Semantic Scholar for additional evidence:
- PubMed search: `"radiological signs pneumothorax imaging"` → returns 5 peer-reviewed articles
- Semantic Scholar search: `"pneumothorax radiology"` → returns 5 academic papers with citation data
All results are merged into a structured context:
- 5 vector DB chunks
- 5 PubMed articles (title, abstract, DOI, PMID)
- 5 Semantic Scholar papers (title, authors, citation count)
The LLM generates a comprehensive answer using the assembled context:
> "Pneumothorax can be identified on imaging through several key radiological signs:
>
> 1. Visceral pleural line: A thin white line visible on chest X-ray...
> 2. Absent lung markings: Beyond the visceral pleural line...
> 3. Deep sulcus sign: On supine radiographs...
>
> References:
> [1] Smith et al. "Imaging of Pneumothorax." Radiology (2023). DOI: 10.1148/...
> [2] Johnson et al. "Emergency Chest Imaging." AJR (2022). PMID: 35912847"
The CitationValidator extracts references [1] and [2]:
- [1] DOI `10.1148/...` → verified via Semantic Scholar ✅
- [2] PMID `35912847` → verified via PubMed ✅
The final response is streamed to the user with verified badges on each citation.
| Component | Requirement |
|---|---|
| Language | Python 3.11+ |
| Framework | FastAPI |
| AI Framework | LangChain |
| Validation | Pydantic v2 / pydantic-settings |
| HTTP Client | httpx (async) |
| Logging | structlog (JSON in production, colored console in development) |
| Component | Requirement |
|---|---|
| Framework | Next.js 16 |
| UI Library | React 19 |
| Language | TypeScript |
| Styling | Tailwind CSS v4 |
| Markdown | react-markdown |
| Theme | next-themes (dark/light mode) |
| Component | Requirement |
|---|---|
| Database | Qdrant |
| Protocol | gRPC / HTTP (port 6333) |
| Distance Metric | Cosine Similarity |
| Embedding Dimension | 1024 (configurable) |
| Provider | Model | Key Required |
|---|---|---|
| Anthropic | Claude Sonnet | Yes |
| Google | Gemini 2.0 Flash | Yes |
| Groq | Llama 3.3 70B Versatile | Yes (default) |
| Provider | Model | Dimension |
|---|---|---|
| Cohere (default) | embed-english-v3.0 | 1024 |
| OpenAI | Configurable | Configurable |
| Component | Requirement |
|---|---|
| Container Runtime | Docker & Docker Compose |
| Node.js | 18+ (20 recommended) |
| RAM | 4 GB minimum |
- Python 3.11+
- Node.js 18+
- Docker & Docker Compose
```bash
git clone https://github.com/pravin-python/AI-Radiology-Assistant.git
cd AI-Radiology-Assistant

# Windows
copy .env.example .env
# Linux / macOS
cp .env.example .env

# Edit .env and add your API keys
```

```bash
docker compose -f docker/docker-compose.yml up --build
```

This starts:

- Qdrant on `http://localhost:6333`
- Backend on `http://localhost:8000` (API docs at `/docs`)
- Frontend on `http://localhost:3000`
Backend:

```bash
cd backend
pip install -r ../requirements.txt
uvicorn app.main:app --reload --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

Qdrant (via Docker):

```bash
docker run -p 6333:6333 qdrant/qdrant
```

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/chat` | Send a query and receive an AI response (SSE stream or JSON) |
| `POST` | `/api/ingest` | Upload documents for embedding into the vector store |
| `GET` | `/api/health` | Service health check (Qdrant status, LLM provider) |
| `GET` | `/api/sources` | Return collection metadata and availability |
Request Body:
```json
{
  "query": "What are the CT findings of pulmonary embolism?",
  "conversation_id": "optional-uuid",
  "stream": true,
  "history": []
}
```

Response (SSE stream):
```
data: {"event": "token", "data": "Pulmonary"}
data: {"event": "token", "data": " embolism"}
data: {"event": "sources", "data": "[...]"}
data: {"event": "citations", "data": "[...]"}
data: {"event": "done", "data": ""}
```
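A client consuming this stream only needs to parse each `data:` line as JSON. A minimal sketch of that parsing step (illustrative — the frontend's actual client lives in `src/lib/api.ts`):

```python
import json

def parse_sse_line(line: str):
    """Return (event, data) for a `data: {...}` SSE line, else None."""
    if not line.startswith("data: "):
        return None  # blank lines and comments separate SSE events
    payload = json.loads(line[len("data: "):])
    return payload["event"], payload["data"]
```

A client would append `token` payloads to the visible answer, render `sources` and `citations` when they arrive, and close the connection on `done`.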
Request Body:
```json
{
  "text": "Document content...",
  "source": "Radiology Journal",
  "publication": "Paper Title",
  "metadata": {}
}
```

Response:

```json
{
  "document_id": "abc123...",
  "chunks_stored": 12,
  "collection": "radiology_docs"
}
```

All configuration is managed via environment variables. See `.env.example` for the full list:
| Variable | Description | Default |
|---|---|---|
| `DEFAULT_LLM_PROVIDER` | LLM backend: `anthropic`, `google`, `groq` | `groq` |
| `ANTHROPIC_API_KEY` | Anthropic (Claude) API key | — |
| `GOOGLE_API_KEY` | Google (Gemini) API key | — |
| `GROQ_API_KEY` | Groq API key | — |
| `EMBEDDING_PROVIDER` | Embedding model: `cohere`, `openai` | `cohere` |
| `COHERE_API_KEY` | Cohere embedding API key | — |
| `EMBEDDING_MODEL` | Embedding model name | `embed-english-v3.0` |
| `EMBEDDING_DIMENSION` | Vector dimension | `1024` |
| `QDRANT_HOST` | Qdrant server host | `localhost` |
| `QDRANT_PORT` | Qdrant server port | `6333` |
| `QDRANT_COLLECTION` | Qdrant collection name | `radiology_docs` |
| `TAVILY_API_KEY` | Tavily search API key | — |
| `EXA_API_KEY` | Exa.ai search API key | — |
| `APP_ENV` | Environment: `development`, `staging`, `production` | `development` |
| `LOG_LEVEL` | Logging level | `INFO` |
```
AI-Radiology-Assistant/
├── backend/
│   └── app/
│       ├── main.py                      # FastAPI entry point
│       ├── config/
│       │   └── settings.py              # Pydantic settings (env vars)
│       ├── api/
│       │   └── routes.py                # REST endpoints (/chat, /ingest, /health, /sources)
│       ├── agents/
│       │   └── radiology_agent.py       # LangChain agent with tool calling
│       ├── rag/
│       │   ├── embeddings.py            # Cohere/OpenAI embedding service with cache
│       │   ├── retriever.py             # LangChain-compatible async retriever
│       │   ├── vector_store.py          # Qdrant async wrapper (upsert, search, health)
│       │   └── pipeline.py              # Full RAG pipeline orchestrator
│       ├── tools/
│       │   ├── pubmed_tool.py           # PubMed NCBI E-Utilities search
│       │   ├── semantic_scholar_tool.py # Semantic Scholar Academic Graph search
│       │   ├── tavily_tool.py           # Tavily web search for medical context
│       │   └── exa_tool.py              # Exa.ai research-grade search
│       ├── services/
│       │   └── citation_validator.py    # LLM-first, verify-second citation checker
│       ├── models/
│       │   └── schemas.py               # Pydantic request/response models
│       └── utils/
│           └── logger.py                # structlog configuration
├── frontend/
│   └── src/
│       ├── app/
│       │   ├── layout.tsx               # Root layout with theme support
│       │   └── page.tsx                 # Main page
│       ├── components/
│       │   ├── ChatUI.tsx               # Chat interface with streaming
│       │   ├── MessageBubble.tsx        # Message rendering with Markdown + citations
│       │   ├── Header.tsx               # App header with dark mode toggle
│       │   └── Providers.tsx            # Theme provider wrapper
│       └── lib/
│           └── api.ts                   # API client with SSE streaming support
├── docker/
│   ├── Dockerfile.backend               # Multi-stage Python backend image
│   ├── Dockerfile.frontend              # Multi-stage Next.js frontend image
│   └── docker-compose.yml               # Full-stack orchestration
├── scripts/
│   └── ingest_data.py                   # CLI document ingestion tool
├── .github/
│   └── workflows/
│       └── ci.yml                       # CI pipeline (lint, build, Docker)
├── requirements.txt                     # Python dependencies
├── .env.example                         # Environment variable template
├── .editorconfig                        # Editor formatting rules
├── .pre-commit-config.yaml              # Git pre-commit hooks
└── LICENSE                              # MIT License
```
| Improvement | Description | Impact |
|---|---|---|
| 📊 Better Radiology Datasets | Ingest comprehensive radiology textbooks (e.g., Grainger & Allison, Brant & Helms) and curated open-access repositories (RadioGraphics, RSNA Case Collection) | Higher retrieval quality and more accurate answers |
| 🧠 Fine-Tuned Medical LLMs | Use LLMs specifically fine-tuned on medical/radiology data (e.g., Med-PaLM, BioMistral) instead of general-purpose models | More accurate medical terminology and reasoning |
| 🔀 Hybrid Search | Combine vector similarity search with BM25 keyword search for better retrieval recall | Fewer missed relevant documents |
| 📈 Improved Retrieval Ranking | Implement re-ranking using cross-encoder models (e.g., Cohere Rerank) to order retrieved documents by relevance | More contextually relevant responses |
| 🖼️ Image-Based Analysis | Add support for uploading radiology images (X-ray, CT slices) and using vision LLMs (GPT-4V, Gemini Vision) for analysis | Multi-modal radiology assistance |
| 🕸️ Knowledge Graph Integration | Build a medical knowledge graph connecting diseases, imaging findings, modalities, and differential diagnoses | Structured reasoning and relationship discovery |
| ⚡ Caching & Performance | Add Redis caching for frequent queries, implement semantic caching for similar questions | Faster response times and lower API costs |
| 🔍 Citation Pipeline Improvements | Add CrossRef API for DOI resolution, support ORCID author verification, include impact factor data | Higher citation reliability and richer metadata |
| 🌍 Multi-Language Support | Add medical query translation and multi-language response generation | Broader accessibility for international users |
| Feature | Description |
|---|---|
| 🏗️ DICOM Image Analysis | Direct analysis of DICOM medical images with AI-powered findings detection |
| 🔗 PACS Integration | Connect to hospital Picture Archiving and Communication Systems for seamless clinical workflow |
| 🎙️ Voice Interface | Voice input and text-to-speech output for hands-free operation during clinical work |
| 🏥 Clinical Decision Support | Integrate with clinical workflows to provide differential diagnoses and suggest imaging protocols |
| 👤 Personalized Assistant | Learn from user preferences, specialization area, and interaction history for tailored responses |
| 🤖 Multi-Agent System | Specialized agents for different radiology subspecialties (neuroradiology, musculoskeletal, cardiothoracic) working collaboratively |
| 📊 Analytics Dashboard | Usage analytics, query patterns, and knowledge gap identification for continuous improvement |
| 🔒 HIPAA Compliance | Enterprise-grade security features for deployment in clinical environments |
| Metric | Target |
|---|---|
| Citation verification | < 15 seconds |
| Vector retrieval latency | < 2 seconds |
| External API calls | Fully async with configurable timeouts |
| Embedding operations | Batch processing with LRU cache (2,048 entries) |
| Agent tool iterations | Maximum 5 per query |
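The "fully async with timeouts" target can be sketched with `asyncio`, in the spirit of the 14-second citation-verification budget described earlier. Here `check` is a stand-in for a real PubMed / Semantic Scholar lookup, not the project's actual function:

```python
import asyncio

async def check(ref: str) -> bool:
    """Placeholder for a real external verification call."""
    await asyncio.sleep(0.01)  # simulated network latency
    return True

async def verify_all(refs: list[str], budget_s: float = 14.0) -> list[bool]:
    """Run all checks concurrently under a single shared timeout budget."""
    try:
        return await asyncio.wait_for(
            asyncio.gather(*(check(r) for r in refs)), timeout=budget_s
        )
    except asyncio.TimeoutError:
        # Fail closed: unverified citations are treated as unverified,
        # never silently presented as verified.
        return [False] * len(refs)
```

A shared budget (rather than a per-call timeout) bounds worst-case end-to-end latency regardless of how many citations the LLM produced.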
- **Backend:** Python · FastAPI · LangChain · Pydantic · AsyncIO · Qdrant · structlog
- **Frontend:** Next.js · React · TypeScript · Tailwind CSS · react-markdown
- **AI/LLM:** Claude · Gemini · Groq
- **Embeddings:** Cohere · OpenAI
- **Search APIs:** PubMed · Semantic Scholar · Tavily · Exa.ai
- **Infrastructure:** Docker · Docker Compose · GitHub Actions CI
This project is licensed under the MIT License. See the LICENSE file for details.
Built with ❤️ for the radiology community