Important
Author: [Carnegie Johnson/IAYF Consulting]
License: MIT (Attribution Required)
This repository demonstrates the Backend Architecture for a secure Clinical Decision Support Agent. Designed for high-compliance healthcare environments, it integrates Microsoft Fabric (OneLake) for data storage with Azure AI for secure inference.
Unlike standard chatbots, this architecture prioritizes:
- Data Lineage: Traceable ETL pipelines from public sources (cBioPortal) to Silver Delta Tables.
- Clinical Validity: Statistical checks for "Artificial Capping" and outliers before inference.
- Safety First: A dedicated middleware layer that sanitizes PII (Personally Identifiable Information) and PHI (Protected Health Information) before either reaches the LLM.
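As a rough illustration of the Clinical Validity checks, the sketch below assumes that "Artificial Capping" means values piling up at a ceiling (for example, ages top-coded at a fixed maximum). The function names, the sample data, and the use of Tukey's IQR rule are assumptions for illustration, not the repository's actual implementation.

```python
from statistics import quantiles


def capping_share(values):
    """Fraction of observations that sit exactly at the maximum value."""
    top = max(values)
    return sum(1 for v in values if v == top) / len(values)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical ages where everything above 89 was recorded as 89.
ages = [34, 45, 51, 58, 62, 67, 71, 89, 89, 89]
print(f"share of rows at the maximum: {capping_share(ages):.0%}")
print("IQR outliers:", iqr_outliers(ages))
```

A large share of rows at the maximum is a hint that the column was capped upstream; what threshold counts as "large" is a judgment call for the dataset at hand.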
In Data Engineering, best practice is the Medallion Architecture:
- 🥉 Bronze Layer (Raw): Raw `.tar.gz` files sitting in a folder. They are hard to query and "messy."
- 🥈 Silver Layer (Clean): Clean the headers, and organize them into Delta Tables (high-performance SQL tables).
- 🥇 Gold Layer (Curated): Aggregated data ready for dashboards and AI agents.
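The header cleaning in the Silver step could look something like the sketch below. It is a minimal example in plain Python; the `clean_header` helper and the sample column names are hypothetical, and the actual notebooks may normalize headers differently.

```python
import re


def clean_header(name: str) -> str:
    """Normalize a raw column header into a snake_case identifier."""
    name = name.strip().lower()
    # Collapse every run of non-alphanumeric characters into one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name)
    return name.strip("_")


# Hypothetical headers as they might appear in a raw clinical file.
raw_headers = ["Patient ID", " Overall Survival (Months)", "TMB (nonsynonymous)"]
print([clean_header(h) for h in raw_headers])
# → ['patient_id', 'overall_survival_months', 'tmb_nonsynonymous']
```

Removing spaces and parentheses matters here because Delta tables reject those characters in column names by default.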
- Ingestion Layer (Phase 1): Python scripts fetch raw `.tar.gz` archives from cBioPortal and stream them into Microsoft OneLake.
- Processing Layer (Fabric): PySpark notebooks transform raw files into queryable Delta Tables (Silver Layer).
- Analysis Layer (Phase 2): Local Python (VS Code) connects via ODBC/SQL to validate data distributions.
- Safety Layer (Phase 3): A Hybrid Guardrails system uses Regex + Azure Content Safety to block toxic or PII-laden prompts.
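The regex half of the Safety Layer can be sketched as follows. The patterns, the `redact` helper, and the sample prompt are illustrative assumptions; a production guardrail would need a reviewed and much broader pattern set, and the Azure Content Safety call is not shown here.

```python
import re

# Hypothetical patterns for illustration only.
PII_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(prompt):
    """Replace each match with a placeholder and report which types were found."""
    found = []
    for label, pattern in PII_PATTERNS.items():
        prompt, count = pattern.subn(f"[{label.upper()}]", prompt)
        if count:
            found.append(label)
    return prompt, found


text = "Patient reachable at 555-867-5309 or jane.doe@example.org, SSN 123-45-6789."
clean, hits = redact(text)
print(clean)  # → Patient reachable at [PHONE] or [EMAIL], SSN [SSN].
print(hits)   # → ['ssn', 'email', 'phone']
```

Running the cheap local regex pass first means obviously sensitive prompts can be redacted or rejected before any text leaves the machine for a remote moderation service.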
Prerequisites:
- Azure account
- Microsoft Fabric workspace
- Python 3.10+
- Visual Studio Code (Jupyter notebooks)
- Azure CLI: Run `az login` to authenticate.
- ODBC Driver: You MUST install the ODBC Driver 18 for SQL Server for Phase 2 to work.
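To show where the ODBC driver comes into play in Phase 2, here is a minimal sketch that assembles a connection string for it. The server and database values are placeholders, the `build_connection_string` helper is hypothetical, and the authentication mode is an assumption; check which mode your workspace and platform actually support.

```python
def build_connection_string(server, database,
                            authentication="ActiveDirectoryInteractive"):
    """Assemble a connection string for ODBC Driver 18 for SQL Server."""
    parts = {
        "Driver": "{ODBC Driver 18 for SQL Server}",
        "Server": server,
        "Database": database,
        "Encrypt": "yes",
        "Authentication": authentication,  # assumed mode, adjust as needed
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Placeholder values; substitute your own SQL endpoint and database name.
conn_str = build_connection_string("your-endpoint.example.com", "your_lakehouse")
print(conn_str)
# With pyodbc installed, the string would be passed to pyodbc.connect(conn_str).
```

If the driver named in the string is not installed, the connection attempt fails, which is why the driver is listed as a hard requirement.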
To get started quickly, we provide a template for your environment variables.
- Clone the repo:
  `git clone https://github.com/CarnegieJ/onc-clinical-intel-agent`
  `cd onc-clinical-intel-agent`
- Configure Secrets:
  - Locate the file named `.env.template` in the root directory.
  - Rename it to `.env`.
  - Open it and fill in your specific Azure values (Workspace ID, Connection Strings, etc.).
  - Example command (terminal): `cp .env.template .env`
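Once `.env` is filled in, a quick sanity check can catch values that were left blank. The sketch below uses only the standard library and writes a temporary sample file so it runs anywhere; the variable names in `REQUIRED` are hypothetical, so use the names that actually appear in `.env.template`.

```python
import os
import tempfile


def load_env(path=".env"):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            # Split on the first "=" only, so values may contain "=" themselves.
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


REQUIRED = ["FABRIC_WORKSPACE_ID", "SQL_CONNECTION_STRING"]  # hypothetical names

# Demo against a throwaway file; point load_env at your real .env instead.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as sample:
        sample.write("# sample\nFABRIC_WORKSPACE_ID=abc-123\nSQL_CONNECTION_STRING=\n")
    settings = load_env(env_path)

print(missing_keys(settings, REQUIRED))  # → ['SQL_CONNECTION_STRING']
```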




