
Avinashbudige/Multi_Agent_RAG_Systems


Enterprise AI/ML GenAI Platform

A production-ready, scalable multi-agent RAG system demonstrating advanced ML/GenAI capabilities for enterprise applications.

🎯 Project Overview

This project showcases a comprehensive AI/ML platform that includes:

  • Multi-Agent RAG System: Advanced retrieval-augmented generation with orchestrated agents
  • LLM Fine-tuning Pipeline: LoRA/PEFT-based fine-tuning for enterprise models
  • Production APIs: Robust FastAPI services with monitoring and governance
  • MLOps Integration: Full CI/CD pipeline with drift detection and auto-retraining
  • Cloud-Native Architecture: Deployment configs for Azure OpenAI and AWS Bedrock
  • Vector Database Integration: Support for Pinecone, Milvus, and Elasticsearch
  • Responsible AI: Built-in governance, explainability, and ethical AI practices

πŸ—οΈ Architecture

┌─────────────────────────────────────────────────────────────┐
│                      API Gateway (FastAPI)                  │
├─────────────────────────────────────────────────────────────┤
│  Multi-Agent Orchestration Layer (LangGraph/AutoGen)        │
├──────────────┬──────────────┬──────────────┬────────────────┤
│ Research     │ Code         │ Analytics    │ Orchestrator   │
│ Agent        │ Agent        │ Agent        │ Agent          │
└──────┬───────┴──────┬───────┴──────┬───────┴───────┬────────┘
       │              │              │               │
       └──────────────┴──────────────┴───────────────┘
                      │
       ┌──────────────┴──────────────┐
       │   RAG Pipeline Engine       │
       ├─────────────────────────────┤
       │ - Document Processing       │
       │ - Embedding Generation      │
       │ - Vector Search             │
       │ - Context Retrieval         │
       └──────────────┬──────────────┘
                      │
       ┌──────────────┴──────────────┐
       │   LLM Layer                 │
       ├─────────────────────────────┤
       │ - GPT-4 / GPT-3.5           │
       │ - Llama 3 / 3.1             │
       │ - Mistral                   │
       │ - Fine-tuned Models         │
       └─────────────────────────────┘

πŸ“ Project Structure

.
├── src/
│   ├── agents/              # Multi-agent framework
│   ├── rag/                 # RAG pipeline components
│   ├── models/              # Model definitions and fine-tuning
│   ├── api/                 # FastAPI services
│   ├── mlops/               # MLOps utilities
│   ├── cloud/               # Cloud integrations
│   └── utils/               # Shared utilities
├── config/                  # Configuration files
├── deployment/              # Kubernetes, Docker configs
├── notebooks/               # Jupyter notebooks for experimentation
├── tests/                   # Unit and integration tests
├── data/                    # Sample data and artifacts
├── models/                  # Trained model artifacts
└── docs/                    # Additional documentation

🚀 Features

1. Multi-Agent Framework

  • Agentic Workflows: Tool-augmented reasoning and orchestration
  • Memory Management: Persistent and contextual memory
  • Agent Collaboration: Dynamic task routing and coordination
  • Frameworks: LangGraph, AutoGen, CrewAI integration
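To illustrate the dynamic task routing idea, here is a minimal hypothetical sketch (the `Agent` class and `route` function are illustrative, not this repo's actual API; a real router would typically use an LLM or classifier rather than keywords):

```python
# Hypothetical task-routing sketch: a registry maps agent names to
# handlers, and the router dispatches a query to the best-matching agent.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Agent:
    name: str
    handle: Callable[[str], str]

AGENTS = {
    "research": Agent("research", lambda q: f"[research] findings for: {q}"),
    "code": Agent("code", lambda q: f"[code] snippet for: {q}"),
    "analytics": Agent("analytics", lambda q: f"[analytics] report for: {q}"),
}

def route(query: str) -> Agent:
    """Pick an agent by simple keyword matching (illustrative only)."""
    lowered = query.lower()
    if any(k in lowered for k in ("trend", "metric", "revenue")):
        return AGENTS["analytics"]
    if any(k in lowered for k in ("implement", "bug", "function")):
        return AGENTS["code"]
    return AGENTS["research"]

result = route("Analyze quarterly revenue trends").handle("revenue trends")
```

In the actual system, routing is handled by the orchestration layer (LangGraph/AutoGen/CrewAI) rather than hand-written rules.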

2. RAG Pipeline

  • Document Processing: Multiple format support (PDF, DOCX, HTML, MD)
  • Chunking Strategies: Semantic, recursive, and custom chunking
  • Embeddings: OpenAI, Hugging Face, Azure OpenAI
  • Vector Databases: Pinecone, Milvus, Elasticsearch, Chroma
  • Hybrid Search: Dense + sparse retrieval
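A common way to combine dense and sparse results is Reciprocal Rank Fusion (RRF). This standalone sketch (not this repo's retrieval code; the document lists and `rrf_fuse` helper are made up for illustration) shows the idea:

```python
# Reciprocal Rank Fusion: merge ranked lists of document ids, scoring
# each doc by the sum of 1 / (k + rank) across all input rankings.
def rrf_fuse(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest combined score first.
    return sorted(scores, key=scores.get, reverse=True)

dense = ["doc3", "doc1", "doc2"]   # vector-similarity order
sparse = ["doc1", "doc4", "doc3"]  # keyword (BM25-style) order
fused = rrf_fuse([dense, sparse])  # doc1 ranks first: strong in both lists
```

Documents that appear high in both rankings win even when neither ranker alone puts them first, which is why RRF is a popular default for hybrid search.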

3. LLM Fine-tuning

  • LoRA/QLoRA: Parameter-efficient fine-tuning
  • PEFT Methods: Prefix tuning, adapter layers
  • Quantization: 4-bit, 8-bit quantization support
  • Models: Llama 3, Mistral, GPT variants
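The core LoRA idea behind the fine-tuning pipeline can be sketched in plain NumPy (this is the math, not the repo's training code, which would use PEFT): instead of updating a full weight matrix W, learn a low-rank delta B @ A with rank r much smaller than the model dimension, scaled by alpha / r.

```python
# LoRA in miniature: only A (r x d) and B (d x r) are trainable,
# so trainable parameters drop from d*d to 2*d*r.
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 64, 8, 16

W = rng.normal(size=(d, d))          # frozen pretrained weight
A = rng.normal(size=(r, d)) * 0.01   # trainable down-projection
B = np.zeros((d, r))                 # trainable up-projection, zero-init

def lora_forward(x):
    """y = x @ (W + (alpha / r) * B @ A).T"""
    return x @ (W + (alpha / r) * (B @ A)).T

x = rng.normal(size=(1, d))
y = lora_forward(x)
# Because B starts at zero, the adapter is an exact no-op before training:
assert np.allclose(y, x @ W.T)
```

The `--rank 8 --alpha 16` flags in the fine-tuning command below correspond to r and alpha here; QLoRA applies the same decomposition on top of a 4-bit quantized base model.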

4. Production APIs

  • FastAPI: High-performance async APIs
  • Authentication: JWT, API keys, OAuth2
  • Rate Limiting: Token bucket algorithm
  • Monitoring: Prometheus, Grafana integration
  • Governance: Request logging, audit trails
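The token bucket algorithm mentioned above works like this (a generic sketch, not this repo's middleware): the bucket refills continuously at a fixed rate up to a capacity, and each request spends one token.

```python
# Token-bucket rate limiter: bursts up to `capacity`, sustained
# throughput limited to `rate` requests per second.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate=1.0, capacity=3)
allowed = [bucket.allow() for _ in range(5)]  # burst of 3 passes, rest denied
```

In a FastAPI service this check would typically run in a dependency or middleware keyed by API key or client IP, returning HTTP 429 when `allow()` is False.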

5. MLOps

  • CI/CD: GitHub Actions, Azure DevOps
  • Model Versioning: MLflow integration
  • Drift Detection: Statistical and performance-based
  • Auto-Retraining: Scheduled and trigger-based
  • A/B Testing: Model comparison framework
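One widely used statistical drift signal is the Population Stability Index (PSI), sketched below in pure Python (illustrative, not the repo's drift module; the 0.2 alarm threshold is a common convention, not a project setting):

```python
# PSI: bucket a reference feature distribution and a live one into
# equal-width bins, then sum (actual - expected) * ln(actual / expected).
import math

def psi(expected, actual, bins=10):
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frac(xs, i):
        count = sum(1 for x in xs
                    if lo + i * width <= x < lo + (i + 1) * width)
        return max(count / len(xs), 1e-6)  # floor avoids log(0)

    return sum((frac(actual, i) - frac(expected, i))
               * math.log(frac(actual, i) / frac(expected, i))
               for i in range(bins))

reference = [i / 100 for i in range(100)]       # training distribution
shifted = [x + 0.5 for x in reference]           # drifted live data
drift_score = psi(reference, shifted)            # well above 0.2 -> alarm
```

A scheduler would compute this per feature on a sliding window and trigger the auto-retraining pipeline when the score crosses the alarm level.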

6. Cloud Deployment

  • Azure OpenAI: Seamless integration
  • AWS Bedrock: Multi-model support
  • Kubernetes: Production-grade orchestration
  • Docker: Multi-stage optimized builds

🛠️ Technology Stack

Core ML/AI

  • Python: 3.11+
  • PyTorch: 2.x
  • Transformers: Hugging Face
  • LangChain: 0.1.x
  • LangGraph: Latest
  • AutoGen: Latest
  • CrewAI: Latest

Vector Databases

  • Pinecone
  • Milvus
  • Elasticsearch
  • ChromaDB

MLOps & Infrastructure

  • MLflow
  • Weights & Biases
  • Docker & Kubernetes
  • Prometheus & Grafana
  • Redis (caching)
  • PostgreSQL (metadata)

Cloud Platforms

  • Azure OpenAI Service
  • AWS Bedrock
  • Azure ML
  • AWS SageMaker

📦 Installation

Prerequisites

  • Python 3.11+
  • Docker Desktop
  • Kubernetes (minikube or cloud cluster)
  • Azure/AWS CLI (for cloud deployment)

Local Setup

# Clone the repository
git clone <repo-url>
cd Multi_Agent_RAG_Systems

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Install development dependencies
pip install -r requirements-dev.txt

# Set up environment variables
cp .env.example .env
# Edit .env with your API keys and configurations

# Initialize vector database
python scripts/setup_vectordb.py

# Run database migrations
alembic upgrade head

🔧 Configuration

Create a .env file with the following:

# LLM Providers
OPENAI_API_KEY=your_key_here
AZURE_OPENAI_KEY=your_key_here
AZURE_OPENAI_ENDPOINT=your_endpoint_here
AWS_ACCESS_KEY_ID=your_key_here
AWS_SECRET_ACCESS_KEY=your_secret_here

# Vector Databases
PINECONE_API_KEY=your_key_here
PINECONE_ENVIRONMENT=your_env_here
MILVUS_HOST=localhost
MILVUS_PORT=19530

# MLOps
MLFLOW_TRACKING_URI=http://localhost:5000
WANDB_API_KEY=your_key_here

# Application
API_HOST=0.0.0.0
API_PORT=8000
ENVIRONMENT=development

🎯 Usage

Running the API Server

# Development mode with hot reload
uvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000

# Production mode
gunicorn src.api.main:app -w 4 -k uvicorn.workers.UvicornWorker

Using the Multi-Agent System

from src.agents.orchestrator import AgentOrchestrator
from src.rag.pipeline import RAGPipeline

# Initialize RAG pipeline
rag = RAGPipeline(
    vector_db="pinecone",
    embedding_model="text-embedding-3-large"
)

# Create agent orchestrator
orchestrator = AgentOrchestrator(rag_pipeline=rag)

# Execute multi-agent workflow
result = orchestrator.execute(
    query="Analyze quarterly revenue trends and generate insights",
    agents=["research", "analytics", "report_writer"]
)

print(result)

Fine-tuning an LLM

# Fine-tune Llama 3 with LoRA
python src/models/finetune.py \
    --model_name meta-llama/Llama-3-8b \
    --dataset data/training_data.json \
    --method lora \
    --rank 8 \
    --alpha 16 \
    --epochs 3

RAG Query

from src.rag.query_engine import QueryEngine

engine = QueryEngine()

response = engine.query(
    question="What are the key features of our product?",
    top_k=5,
    rerank=True
)

print(f"Answer: {response.answer}")
print(f"Sources: {response.sources}")

🧪 Testing

# Run all tests
pytest tests/

# Run with coverage
pytest tests/ --cov=src --cov-report=html

# Run specific test suite
pytest tests/test_agents.py -v

🚀 Deployment

Docker Deployment

# Build image
docker build -t aiml-platform:latest .

# Run container
docker run -p 8000:8000 --env-file .env aiml-platform:latest

Kubernetes Deployment

# Apply configurations
kubectl apply -f deployment/k8s/namespace.yaml
kubectl apply -f deployment/k8s/configmap.yaml
kubectl apply -f deployment/k8s/secrets.yaml
kubectl apply -f deployment/k8s/deployment.yaml
kubectl apply -f deployment/k8s/service.yaml

# Check status
kubectl get pods -n aiml-platform

Cloud Deployment

Azure:

# Deploy to Azure Container Apps
az containerapp up \
    --name aiml-platform \
    --resource-group aiml-rg \
    --location eastus \
    --environment aiml-env \
    --image <your-acr>.azurecr.io/aiml-platform:latest

AWS:

# Deploy to ECS
aws ecs create-service \
    --cluster aiml-cluster \
    --service-name aiml-platform \
    --task-definition aiml-platform:1 \
    --desired-count 3

📊 Monitoring

Prometheus metrics and Grafana dashboards are provisioned via the deployment configs; access them once the monitoring stack is running.

🔒 Responsible AI

This project implements:

  • Bias Detection: Automated fairness testing
  • Explainability: SHAP, LIME integration
  • Privacy: PII detection and redaction
  • Governance: Audit logging and compliance tracking
  • Content Safety: Azure Content Safety integration
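As a flavor of the privacy tooling, here is a minimal regex-based PII redaction sketch (hypothetical; it covers only emails and US-style phone numbers, whereas production redaction would use NER-based tooling such as Presidio or Azure's PII detection):

```python
# Replace detected PII entities with typed placeholders.
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
}

def redact(text: str) -> str:
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

clean = redact("Contact jane.doe@example.com or 555-123-4567.")
# clean == "Contact [EMAIL] or [PHONE]."
```

In the request path, redaction like this would run before prompts are logged or sent to an external LLM provider.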

📚 Documentation

Additional documentation is available in the docs/ directory.

🤝 Contributing

See CONTRIBUTING.md for guidelines.

📄 License

MIT License - See LICENSE file

👥 Contact

For questions or support, reach out to the development team.


Built with ❤️ for Enterprise AI/ML Applications
