A production-grade, AI-powered radiology chatbot using Retrieval-Augmented Generation (RAG) that helps radiology professionals, medical researchers, and students ask radiology and medical imaging questions — and receive accurate, evidence-based answers with verified scientific citations from trusted research sources.
- Project Overview
- How the System Works
- How to Use the System
- Required Inputs
- Document Ingestion
- Data Sources
- Example Workflow
- System Requirements
- Quick Start
- API Reference
- Configuration
- Project Structure
- How to Improve the System
- Future Enhancements
- License
AI Radiology Assistant is an intelligent question-answering system designed specifically for the radiology and medical imaging domain. It combines the power of Large Language Models (LLMs) with a Retrieval-Augmented Generation (RAG) pipeline to deliver accurate, context-rich answers backed by verified scientific literature.
Unlike standard chatbots that may fabricate references, this system employs an "LLM-first, verify-second" approach — every citation extracted from the AI response is cross-checked against PubMed and Semantic Scholar before being presented to the user.
General-purpose LLMs are trained on broad datasets with a knowledge cut-off date. For medical professionals, this creates three critical problems:
| Problem | How RAG Solves It |
|---|---|
| Knowledge staleness | RAG retrieves the latest research documents at query time, supplementing the LLM's parametric knowledge |
| Hallucinated citations | RAG grounds the LLM's response in actual retrieved documents, and the citation validator verifies every reference |
| Lack of domain depth | A curated radiology vector database provides specialized, high-quality knowledge that general LLMs may lack |
- 🏥 Radiology Question Answering — Ask about imaging modalities, diagnostic criteria, and radiological signs
- 📖 Medical Research Assistance — Find relevant literature with verified citations (DOI, PMID)
- 🎓 Medical Education — Students can explore radiology concepts with evidence-based explanations
- 📋 Clinical Knowledge Lookup — Quick reference for differential diagnoses, imaging protocols, and clinical guidelines
- ❌ LLM hallucinations with fabricated medical references
- ❌ Difficulty finding relevant radiology literature quickly
- ❌ Lack of citation verification in AI-generated medical content
- ❌ Information scattered across multiple databases and journals
The AI Radiology Assistant follows a multi-step pipeline from user query to verified response:
```
                           SYSTEM WORKFLOW

User Query
    │
    ▼
LangChain Agent ──────► Tool Selection
    │                   (PubMed / Semantic Scholar / Tavily / Exa / Vector DB)
    ▼
Document Retrieval ───► Vector Search (Qdrant)
    │
    ▼
Context Assembly ─────► Merge tool results + vector DB context
    │
    ▼
LLM Response Generation (Claude / Gemini / Groq)
    │
    ▼
Citation Verification ► PubMed + Semantic Scholar cross-check
    │
    ▼
Final Response ───────► Streamed to user with verified citations
```
The user submits a radiology-related question through the Next.js chat interface. The query is sent to the FastAPI backend via a `POST /api/chat` request.
The RadiologyAgent receives the query and initializes with a system prompt that enforces evidence-based, citation-rich responses. The agent is equipped with multiple tools and decides which ones to invoke.
The agent intelligently selects the most appropriate tools based on the query type:
- PubMed — for clinical evidence and peer-reviewed studies
- Semantic Scholar — for academic papers and citation data
- Tavily — for recent medical guidelines and protocols
- Exa.ai — for deep research queries
The RadiologyRetriever queries the Qdrant vector database using cosine similarity to find the most relevant documents from the ingested knowledge base. Documents are embedded using Cohere or OpenAI embedding models.
Retrieved documents from the vector database and tool results are merged into a unified context. Each source is tagged with its origin (vector DB, PubMed, Semantic Scholar, etc.).
The assembled context and user query are sent to the configured LLM (Claude, Gemini, or Groq). The response is streamed back in real-time using Server-Sent Events (SSE).
The CitationValidator extracts all references from the LLM's response using regex patterns (DOI, PMID, bracketed references), then batch-verifies them:
- PubMed verification — checks PMID against the PubMed E-Utilities API
- Semantic Scholar DOI verification — validates DOI against the S2 Academic Graph API
- Semantic Scholar title verification — fuzzy-matches paper titles
All verifications run concurrently with a 14-second timeout budget.
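The extraction step described above can be sketched with stdlib regular expressions. This is an illustrative approximation, not the project's actual patterns — the real `CitationValidator` may use stricter rules:

```python
import re

# Hypothetical patterns approximating a DOI / PMID / bracketed-reference
# extractor; the project's actual regexes may differ.
DOI_RE = re.compile(r"\b10\.\d{4,9}/[^\s\"'<>]+")
PMID_RE = re.compile(r"\bPMID:?\s*(\d{6,9})\b", re.IGNORECASE)
BRACKET_RE = re.compile(r"\[(\d{1,3})\]")

def extract_citations(answer: str) -> dict:
    """Pull candidate identifiers out of an LLM answer for verification."""
    return {
        "dois": DOI_RE.findall(answer),
        "pmids": PMID_RE.findall(answer),
        "brackets": BRACKET_RE.findall(answer),
    }
```

Each extracted identifier would then be handed to the appropriate verifier (PubMed for PMIDs, Semantic Scholar for DOIs and titles).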
The verified response with citations, sources, and metadata is streamed to the frontend, where it is rendered with Markdown formatting and clickable citation links.
- Open the chat interface — Navigate to `http://localhost:3000` in your browser
- Ask a radiology question — Type your question in the input field, or click one of the suggestion chips
- The system retrieves knowledge — The agent searches multiple sources and the vector database
- The LLM generates an answer — A streaming response appears in real-time
- Verified citations are shown — Each reference is displayed with title, authors, journal, year, DOI/PMID, and a verification badge
| Category | Example Question |
|---|---|
| Diagnostic Imaging | "What are the CT findings of pulmonary embolism?" |
| MRI Interpretation | "Explain MRI features of multiple sclerosis" |
| Differential Diagnosis | "What are common causes of ground-glass opacities?" |
| Comparative Analysis | "What are the key differences between CT and MRI for brain imaging?" |
| Classification Systems | "Explain the BI-RADS classification system in mammography" |
| Staging & Oncology | "What is the role of PET-CT in oncology staging?" |
| Plain Radiography | "Describe common chest X-ray findings in pneumonia" |
The system accepts natural language queries related to:
- Radiology questions — imaging findings, modalities, diagnostic criteria
- Medical imaging queries — CT, MRI, X-ray, ultrasound, PET, SPECT
- Research questions — latest guidelines, evidence comparisons, clinical protocols
Queries must be between 1 and 2,000 characters in length.
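Since the backend uses Pydantic v2, the length constraint can be enforced at the schema level. This is a minimal sketch — the field name `query` matches the API examples later in this README, but the class name is an assumption, not the project's actual schema:

```python
from pydantic import BaseModel, Field

class ChatQuery(BaseModel):
    # Enforce the documented 1–2,000 character limit; out-of-range
    # queries raise a ValidationError before reaching the agent.
    query: str = Field(min_length=1, max_length=2000)
```

With this in place, FastAPI rejects empty or oversized queries automatically with a 422 response.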
For the RAG pipeline to function effectively, the system requires:
| Input Type | Description | Examples |
|---|---|---|
| Knowledge Base | Radiology documents ingested into Qdrant | Textbook chapters, research papers, guidelines |
| API Keys | Authentication for LLMs and search tools | Groq, Cohere, Tavily, Exa.ai API keys |
| Configuration | System behavior settings | LLM provider, embedding model, retrieval parameters |
- User query is embedded using the configured embedding model (Cohere / OpenAI)
- Cosine similarity search finds the top-k most relevant documents (default: 5, threshold: 0.45)
- Tool results are fetched from external APIs based on agent decisions
- All inputs are assembled into a structured context prompt for the LLM
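The retrieval step above boils down to a cosine-similarity top-k search with a score threshold. The real system delegates this to Qdrant; the following pure-Python sketch just illustrates the scoring logic with the documented defaults (k=5, threshold 0.45):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, docs, k=5, threshold=0.45):
    """docs: list of (chunk_text, vector) pairs.
    Returns up to k (score, chunk_text) pairs above the threshold,
    highest score first — mirroring Qdrant's search semantics."""
    scored = [(cosine(query_vec, v), t) for t, v in docs]
    scored = [(s, t) for s, t in scored if s >= threshold]
    return sorted(scored, reverse=True)[:k]
```

In production the same parameters are passed to Qdrant's search call, which performs this over the indexed collection instead of an in-memory list.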
| Format | Use Case |
|---|---|
| PDF | Research papers, radiology textbooks, clinical reports |
| Plain Text | Extracted content, notes, guidelines |
| HTML | Medical articles, web-based resources |
| Markdown | Documentation, structured medical content |
Upload Document → Extract Text → Chunk Content → Generate Embeddings → Store in Qdrant
Documents can be ingested via:
- The CLI ingestion script (`scripts/ingest_data.py`)
- The REST API (`POST /api/ingest`)
Raw text is extracted from the document. For PDFs, text extraction happens during pre-processing.
Text is split into overlapping chunks for optimal retrieval:
- Max chunk size: 800 characters
- Overlap: 100 characters
- Splitting strategy: Paragraph-based chunking with character-count limits
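The splitting strategy above can be sketched as follows. This is an illustrative chunker matching the documented limits (800-character chunks, 100-character overlap, paragraph-aware); the project's actual splitter may differ in detail:

```python
def chunk_text(text: str, max_chars: int = 800, overlap: int = 100) -> list[str]:
    """Split text on paragraph boundaries into overlapping chunks."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            # carry the tail of the previous chunk forward as overlap,
            # so boundary sentences appear in both neighbouring chunks
            current = current[-overlap:]
        current = (current + "\n\n" + para).strip() if current else para
    if current:
        chunks.append(current)
    return chunks
```

The overlap means a fact straddling a chunk boundary is still retrievable from at least one chunk.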
Each chunk is converted into a dense vector using the configured embedding model:
- Cohere `embed-english-v3.0` (1024 dimensions) — default
- OpenAI embedding models — alternative
An in-memory LRU cache (up to 2,048 entries) avoids re-embedding identical text.
Vectors are upserted into the Qdrant collection with rich metadata:
```json
{
  "text": "chunk content...",
  "source": "Radiology Journal 2024",
  "publication": "Paper Title",
  "metadata": { ... }
}
```

```bash
# Ingest a single file
python scripts/ingest_data.py --file path/to/paper.pdf --source "Radiology Journal"

# Ingest a directory of documents
python scripts/ingest_data.py --dir path/to/docs/

# Ingest inline text
python scripts/ingest_data.py --text "CT imaging shows..." --source "Manual Entry"
```

```bash
curl -X POST http://localhost:8000/api/ingest \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Your medical document text...",
    "source": "Radiology Textbook",
    "publication": "Fundamentals of Radiology",
    "metadata": {"chapter": "3", "topic": "MRI"}
  }'
```

The system combines multiple high-quality data sources to ensure comprehensive and reliable answers:
| Source | Purpose | API | Key Required |
|---|---|---|---|
| 🏥 PubMed | Peer-reviewed medical literature | NCBI E-Utilities | No (≤3 req/s) |
| 📚 Semantic Scholar | Academic papers + citation data | S2 Academic Graph | No (≤100 req/5 min) |
| 🌐 Tavily | Recent guidelines & protocols | Tavily Search API | Yes |
| 🔬 Exa.ai | Deep research-grade search | Exa API | Yes |
| Source | Storage | Description |
|---|---|---|
| 📄 Qdrant Vector DB | `radiology_docs` collection | Ingested radiology documents, research papers, textbooks, and clinical guidelines |
- PubMed is the gold standard for biomedical literature — it provides peer-reviewed, MEDLINE-indexed articles with DOI and PMID identifiers
- Semantic Scholar adds citation context, helping assess a paper's impact and reliability
- Tavily and Exa fill gaps with recent clinical guidelines and up-to-date protocols that may not yet be indexed in PubMed
- The internal vector database provides instant, low-latency retrieval of curated domain-specific knowledge
User asks: "What are the radiological signs of pneumothorax?"
The query is embedded and compared against the `radiology_docs` collection in Qdrant. The retriever returns the top-5 most similar document chunks (cosine similarity ≥ 0.45):

```
[Vector DB Result 1] "Pneumothorax appears as a visceral pleural line..."
[Vector DB Result 2] "On upright chest X-ray, pneumothorax is seen as..."
```
The LangChain agent decides to invoke PubMed and Semantic Scholar for additional evidence:
- PubMed search: `"radiological signs pneumothorax imaging"` → returns 5 peer-reviewed articles
- Semantic Scholar search: `"pneumothorax radiology"` → returns 5 academic papers with citation data
All results are merged into a structured context:
- 5 vector DB chunks
- 5 PubMed articles (title, abstract, DOI, PMID)
- 5 Semantic Scholar papers (title, authors, citation count)
The LLM generates a comprehensive answer using the assembled context:
> "Pneumothorax can be identified on imaging through several key radiological signs:
>
> 1. Visceral pleural line: A thin white line visible on chest X-ray...
> 2. Absent lung markings: Beyond the visceral pleural line...
> 3. Deep sulcus sign: On supine radiographs...
>
> References:
> [1] Smith et al. "Imaging of Pneumothorax." Radiology (2023). DOI: 10.1148/...
> [2] Johnson et al. "Emergency Chest Imaging." AJR (2022). PMID: 35912847"
The CitationValidator extracts references [1] and [2]:
- [1] DOI `10.1148/...` → verified via Semantic Scholar ✅
- [2] PMID `35912847` → verified via PubMed ✅
The final response is streamed to the user with verified badges on each citation.
| Component | Requirement |
|---|---|
| Language | Python 3.11+ |
| Framework | FastAPI |
| AI Framework | LangChain |
| Validation | Pydantic v2 / pydantic-settings |
| HTTP Client | httpx (async) |
| Logging | structlog (JSON in production, colored console in development) |
| Component | Requirement |
|---|---|
| Framework | Next.js 16 |
| UI Library | React 19 |
| Language | TypeScript |
| Styling | Tailwind CSS v4 |
| Markdown | react-markdown |
| Theme | next-themes (dark/light mode) |
| Component | Requirement |
|---|---|
| Database | Qdrant |
| Protocol | gRPC / HTTP (port 6333) |
| Distance Metric | Cosine Similarity |
| Embedding Dimension | 1024 (configurable) |
| Provider | Model | Key Required |
|---|---|---|
| Anthropic | Claude Sonnet | Yes |
| Google | Gemini 2.0 Flash | Yes |
| Groq | Llama 3.3 70B Versatile | Yes (default) |
| Provider | Model | Dimension |
|---|---|---|
| Cohere (default) | embed-english-v3.0 | 1024 |
| OpenAI | Configurable | Configurable |
| Component | Requirement |
|---|---|
| Container Runtime | Docker & Docker Compose |
| Node.js | 18+ (20 recommended) |
| RAM | 4 GB minimum |
- Python 3.11+
- Node.js 18+
- Docker & Docker Compose
```bash
git clone https://github.com/pravin-python/AI-Radiology-Assistant.git
cd AI-Radiology-Assistant

# Windows
copy .env.example .env
# Linux / macOS
cp .env.example .env

# Edit .env and add your API keys
```

```bash
docker compose -f docker/docker-compose.yml up --build
```

This starts:

- Qdrant on `http://localhost:6333`
- Backend on `http://localhost:8000` (API docs at `/docs`)
- Frontend on `http://localhost:3000`
Backend:

```bash
cd backend
pip install -r ../requirements.txt
uvicorn app.main:app --reload --port 8000
```

Frontend:

```bash
cd frontend
npm install
npm run dev
```

Qdrant (via Docker):

```bash
docker run -p 6333:6333 qdrant/qdrant
```

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/api/chat` | Send a query and receive an AI response (SSE stream or JSON) |
| `POST` | `/api/ingest` | Upload documents for embedding into the vector store |
| `GET` | `/api/health` | Service health check (Qdrant status, LLM provider) |
| `GET` | `/api/sources` | Return collection metadata and availability |
Request Body:
```json
{
  "query": "What are the CT findings of pulmonary embolism?",
  "conversation_id": "optional-uuid",
  "stream": true,
  "history": []
}
```

Response (SSE stream):
```
data: {"event": "token", "data": "Pulmonary"}
data: {"event": "token", "data": " embolism"}
data: {"event": "sources", "data": "[...]"}
data: {"event": "citations", "data": "[...]"}
data: {"event": "done", "data": ""}
```
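A client consuming this stream only needs to parse each `data:` line as JSON. A minimal sketch of that parsing step (illustrative — the frontend's actual client lives in `src/lib/api.ts`):

```python
import json

def parse_sse_line(line: str):
    """Return (event, data) for a `data: {...}` SSE line, else None."""
    if not line.startswith("data: "):
        return None  # blank lines and comments separate SSE events
    payload = json.loads(line[len("data: "):])
    return payload["event"], payload["data"]
```

A client would append `token` payloads to the visible answer, render `sources` and `citations` when they arrive, and close the connection on `done`.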
Request Body:
```json
{
  "text": "Document content...",
  "source": "Radiology Journal",
  "publication": "Paper Title",
  "metadata": {}
}
```

Response:

```json
{
  "document_id": "abc123...",
  "chunks_stored": 12,
  "collection": "radiology_docs"
}
```

All configuration is managed via environment variables. See `.env.example` for the full list:
| Variable | Description | Default |
|---|---|---|
| `DEFAULT_LLM_PROVIDER` | LLM backend: `anthropic`, `google`, `groq` | `groq` |
| `ANTHROPIC_API_KEY` | Anthropic (Claude) API key | — |
| `GOOGLE_API_KEY` | Google (Gemini) API key | — |
| `GROQ_API_KEY` | Groq API key | — |
| `EMBEDDING_PROVIDER` | Embedding model: `cohere`, `openai` | `cohere` |
| `COHERE_API_KEY` | Cohere embedding API key | — |
| `EMBEDDING_MODEL` | Embedding model name | `embed-english-v3.0` |
| `EMBEDDING_DIMENSION` | Vector dimension | `1024` |
| `QDRANT_HOST` | Qdrant server host | `localhost` |
| `QDRANT_PORT` | Qdrant server port | `6333` |
| `QDRANT_COLLECTION` | Qdrant collection name | `radiology_docs` |
| `TAVILY_API_KEY` | Tavily search API key | — |
| `EXA_API_KEY` | Exa.ai search API key | — |
| `APP_ENV` | Environment: `development`, `staging`, `production` | `development` |
| `LOG_LEVEL` | Logging level | `INFO` |
```
AI-Radiology-Assistant/
├── backend/
│   └── app/
│       ├── main.py                      # FastAPI entry point
│       ├── config/
│       │   └── settings.py              # Pydantic settings (env vars)
│       ├── api/
│       │   └── routes.py                # REST endpoints (/chat, /ingest, /health, /sources)
│       ├── agents/
│       │   └── radiology_agent.py       # LangChain agent with tool calling
│       ├── rag/
│       │   ├── embeddings.py            # Cohere/OpenAI embedding service with cache
│       │   ├── retriever.py             # LangChain-compatible async retriever
│       │   ├── vector_store.py          # Qdrant async wrapper (upsert, search, health)
│       │   └── pipeline.py              # Full RAG pipeline orchestrator
│       ├── tools/
│       │   ├── pubmed_tool.py           # PubMed NCBI E-Utilities search
│       │   ├── semantic_scholar_tool.py # Semantic Scholar Academic Graph search
│       │   ├── tavily_tool.py           # Tavily web search for medical context
│       │   └── exa_tool.py              # Exa.ai research-grade search
│       ├── services/
│       │   └── citation_validator.py    # LLM-first, verify-second citation checker
│       ├── models/
│       │   └── schemas.py               # Pydantic request/response models
│       └── utils/
│           └── logger.py                # structlog configuration
├── frontend/
│   └── src/
│       ├── app/
│       │   ├── layout.tsx               # Root layout with theme support
│       │   └── page.tsx                 # Main page
│       ├── components/
│       │   ├── ChatUI.tsx               # Chat interface with streaming
│       │   ├── MessageBubble.tsx        # Message rendering with Markdown + citations
│       │   ├── Header.tsx               # App header with dark mode toggle
│       │   └── Providers.tsx            # Theme provider wrapper
│       └── lib/
│           └── api.ts                   # API client with SSE streaming support
├── docker/
│   ├── Dockerfile.backend               # Multi-stage Python backend image
│   ├── Dockerfile.frontend              # Multi-stage Next.js frontend image
│   └── docker-compose.yml               # Full-stack orchestration
├── scripts/
│   └── ingest_data.py                   # CLI document ingestion tool
├── .github/
│   └── workflows/
│       └── ci.yml                       # CI pipeline (lint, build, Docker)
├── requirements.txt                     # Python dependencies
├── .env.example                         # Environment variable template
├── .editorconfig                        # Editor formatting rules
├── .pre-commit-config.yaml              # Git pre-commit hooks
└── LICENSE                              # MIT License
```
| Improvement | Description | Impact |
|---|---|---|
| 📊 Better Radiology Datasets | Ingest comprehensive radiology textbooks (e.g., Grainger & Allison, Brant & Helms) and curated open-access repositories (RadioGraphics, RSNA Case Collection) | Higher retrieval quality and more accurate answers |
| 🧠 Fine-Tuned Medical LLMs | Use LLMs specifically fine-tuned on medical/radiology data (e.g., Med-PaLM, BioMistral) instead of general-purpose models | More accurate medical terminology and reasoning |
| 🔀 Hybrid Search | Combine vector similarity search with BM25 keyword search for better retrieval recall | Fewer missed relevant documents |
| 📈 Improved Retrieval Ranking | Implement re-ranking using cross-encoder models (e.g., Cohere Rerank) to order retrieved documents by relevance | More contextually relevant responses |
| 🖼️ Image-Based Analysis | Add support for uploading radiology images (X-ray, CT slices) and using vision LLMs (GPT-4V, Gemini Vision) for analysis | Multi-modal radiology assistance |
| 🕸️ Knowledge Graph Integration | Build a medical knowledge graph connecting diseases, imaging findings, modalities, and differential diagnoses | Structured reasoning and relationship discovery |
| ⚡ Caching & Performance | Add Redis caching for frequent queries, implement semantic caching for similar questions | Faster response times and lower API costs |
| 🔍 Citation Pipeline Improvements | Add CrossRef API for DOI resolution, support ORCID author verification, include impact factor data | Higher citation reliability and richer metadata |
| 🌍 Multi-Language Support | Add medical query translation and multi-language response generation | Broader accessibility for international users |
| Feature | Description |
|---|---|
| 🏗️ DICOM Image Analysis | Direct analysis of DICOM medical images with AI-powered findings detection |
| 🔗 PACS Integration | Connect to hospital Picture Archiving and Communication Systems for seamless clinical workflow |
| 🎙️ Voice Interface | Voice input and text-to-speech output for hands-free operation during clinical work |
| 🏥 Clinical Decision Support | Integrate with clinical workflows to provide differential diagnoses and suggest imaging protocols |
| 👤 Personalized Assistant | Learn from user preferences, specialization area, and interaction history for tailored responses |
| 🤖 Multi-Agent System | Specialized agents for different radiology subspecialties (neuroradiology, musculoskeletal, cardiothoracic) working collaboratively |
| 📊 Analytics Dashboard | Usage analytics, query patterns, and knowledge gap identification for continuous improvement |
| 🔒 HIPAA Compliance | Enterprise-grade security features for deployment in clinical environments |
| Metric | Target |
|---|---|
| Citation verification | < 15 seconds |
| Vector retrieval latency | < 2 seconds |
| External API calls | Fully async with configurable timeouts |
| Embedding operations | Batch processing with LRU cache (2,048 entries) |
| Agent tool iterations | Maximum 5 per query |
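The "fully async with timeouts" target can be sketched with `asyncio`, in the spirit of the 14-second citation-verification budget described earlier. Here `check` is a stand-in for a real PubMed / Semantic Scholar lookup, not the project's actual function:

```python
import asyncio

async def check(ref: str) -> bool:
    """Placeholder for a real external verification call."""
    await asyncio.sleep(0.01)  # simulated network latency
    return True

async def verify_all(refs: list[str], budget_s: float = 14.0) -> list[bool]:
    """Run all checks concurrently under a single shared timeout budget."""
    try:
        return await asyncio.wait_for(
            asyncio.gather(*(check(r) for r in refs)), timeout=budget_s
        )
    except asyncio.TimeoutError:
        # Fail closed: unverified citations are treated as unverified,
        # never silently presented as verified.
        return [False] * len(refs)
```

A shared budget (rather than a per-call timeout) bounds worst-case end-to-end latency regardless of how many citations the LLM produced.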
- **Backend:** Python · FastAPI · LangChain · Pydantic · AsyncIO · Qdrant · structlog
- **Frontend:** Next.js · React · TypeScript · Tailwind CSS · react-markdown
- **AI/LLM:** Claude · Gemini · Groq
- **Embeddings:** Cohere · OpenAI
- **Search APIs:** PubMed · Semantic Scholar · Tavily · Exa.ai
- **Infrastructure:** Docker · Docker Compose · GitHub Actions CI
This project is licensed under the MIT License. See the LICENSE file for details.
Built with ❤️ for the radiology community