ACE RAG Gemini - Complete File Index

📁 Project Structure Overview

Total Files: 27
Total Lines of Code: ~4,600 excluding documentation (~6,300 including it)


📦 Core Package (ace_rag/) - 12 files

Configuration & Foundation

  • __init__.py (26 lines)

    • Package initialization
    • Version information
    • Public API exports
  • config.py (269 lines)

    • GeminiConfig - API configuration
    • VectorStoreConfig - Vector store settings
    • DocumentProcessorConfig - Processing parameters
    • ACEConfig - ACE framework settings
    • Config - Main configuration class
    • Environment-based configuration loading
  • exceptions.py (35 lines)

    • ACERAGException - Base exception
    • GeminiAPIException - API errors
    • VectorStoreException - Vector store errors
    • DocumentProcessingException - Processing errors
    • PlaybookException - Playbook errors
    • ConfigurationException - Config errors
  • models.py (284 lines)

    • Document - Document model
    • Chunk - Chunk with embeddings
    • RetrievalResult - Search results
    • QueryTrajectory - ACE trajectory
    • ReflectionInsight - ACE insights
    • PlaybookStrategy - Strategy model
    • PlaybookDelta - Delta updates
    • RAGResponse - Complete response
    • FusionMethod - Fusion enum
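
As an illustration of the environment-based configuration loading and the exception hierarchy listed above, here is a minimal sketch. It is not the project's actual `config.py`: the class name `SketchGeminiConfig`, the environment variable names, the default model string, and the temperature bound are all assumptions made for this example, and it uses standard-library dataclasses rather than pydantic so that it runs without extra dependencies. Only the exception names and the `from_env()` method name come from this index.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


class ACERAGException(Exception):
    """Base exception (name taken from the index above)."""


class ConfigurationException(ACERAGException):
    """Raised when configuration is missing or invalid."""


DEFAULT_MODEL = "placeholder-model"  # hypothetical default, not a real model name


@dataclass(frozen=True)
class SketchGeminiConfig:
    """Hypothetical stand-in for GeminiConfig; field names are assumptions."""

    api_key: str
    model: str = DEFAULT_MODEL
    temperature: float = 0.7

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "SketchGeminiConfig":
        """Build a config from a mapping of environment variables."""
        env = os.environ if env is None else env

        # The variable names below are assumed for illustration only.
        api_key = env.get("GEMINI_API_KEY", "").strip()
        if not api_key:
            raise ConfigurationException("GEMINI_API_KEY is not set")

        raw_temperature = env.get("GEMINI_TEMPERATURE", "0.7")
        try:
            temperature = float(raw_temperature)
        except ValueError as exc:
            raise ConfigurationException(
                f"GEMINI_TEMPERATURE must be a number, got {raw_temperature!r}"
            ) from exc
        if temperature < 0.0:
            # Assumed lower bound; the real validation rules may differ.
            raise ConfigurationException("GEMINI_TEMPERATURE must be non-negative")

        return cls(
            api_key=api_key,
            model=env.get("GEMINI_MODEL", DEFAULT_MODEL),
            temperature=temperature,
        )
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the loader easy to test, and catching `ACERAGException` at the call site would also catch `ConfigurationException`.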

External Integration

  • gemini_client.py (348 lines)
    • GeminiClient - Main client class
    • RateLimiter - Token bucket rate limiting
    • CircuitBreaker - Fault tolerance
    • Retry logic with exponential backoff
    • Batch embedding operations
    • Query expansion
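
For readers unfamiliar with the two techniques named above, the following is a minimal sketch of token-bucket rate limiting and retry with exponential backoff. It does not reproduce the real `RateLimiter` or `GeminiClient`: the names `TokenBucket` and `retry_with_backoff`, their parameters, and the injectable `clock` and `sleep` hooks are assumptions chosen to keep the example deterministic and testable.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TokenBucket:
    """Illustrative token bucket; the project's RateLimiter may differ."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self._clock = clock
        self._tokens = capacity     # the bucket starts full
        self._last = clock()

    def try_acquire(self, tokens: float = 1.0) -> bool:
        """Refill according to elapsed time, then spend tokens if available."""
        now = self._clock()
        self._tokens = min(self.capacity,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False


def retry_with_backoff(func: Callable[[], T], max_attempts: int = 3,
                       base_delay: float = 0.5, factor: float = 2.0,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, waiting base_delay * factor**n after the n-th failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= factor
    raise AssertionError("unreachable")
```

A production client would typically retry only on transient errors and add random jitter to the delay; both are left out here to keep the sketch short.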

Data Storage

  • vector_store.py (423 lines)

    • VectorStore - FAISS wrapper
    • Index management (Flat, IVF)
    • Similarity search
    • Metadata filtering
    • Persistence (save/load)
    • Statistics tracking
  • document_processor.py (331 lines)

    • DocumentProcessor - Main processor
    • Multi-format support (TXT, MD, PDF)
    • Semantic chunking with overlap
    • Sentence-aware splitting
    • Metadata extraction
    • Batch embedding generation
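
The sentence-aware splitting and chunk overlap mentioned for `document_processor.py` can be sketched as follows. This is an assumption-laden illustration, not the project's `DocumentProcessor`: the function names, the `max_chars` and `overlap_sentences` parameters, and the naive regex sentence splitter (which will mis-split abbreviations such as "e.g.") are all hypothetical.

```python
import re
from typing import List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def chunk_sentences(text: str, max_chars: int = 200,
                    overlap_sentences: int = 1) -> List[str]:
    """Group whole sentences into chunks, repeating the last few sentences
    of each chunk at the start of the next one."""
    if overlap_sentences < 0:
        raise ValueError("overlap_sentences must be non-negative")

    chunks: List[str] = []
    current: List[str] = []
    for sentence in split_sentences(text):
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            # Close the current chunk and carry the overlap forward.
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:] if overlap_sentences else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Because sentences are never cut, a chunk can exceed `max_chars` when a single sentence, or the overlap plus one sentence, is longer than the limit; a real implementation would need a policy for that case.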

ACE Framework

  • playbook.py (432 lines)

    • Playbook - Strategy storage
    • Strategy CRUD operations
    • Performance tracking
    • Usage statistics
    • Delta-based updates
    • Automatic pruning
    • JSON persistence
  • ace_generator.py (297 lines)

    • ACEGenerator - Generator component
    • Diverse trajectory generation
    • Strategy selection from playbook
    • Temperature variation
    • Fusion method selection
    • Query expansion integration
  • ace_reflector.py (465 lines)

    • ACEReflector - Reflector component
    • Trajectory quality scoring
    • Pattern analysis
    • Insight extraction
    • Multi-dimensional evaluation
    • Performance comparison
  • ace_curator.py (385 lines)

    • ACECurator - Curator component
    • Insight validation
    • Semantic deduplication
    • Playbook evolution
    • Strategy creation/updates
    • Automatic pruning coordination
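
To make the playbook features above more concrete (delta-based updates, performance tracking, automatic pruning, JSON persistence), here is a small sketch. Everything in it is hypothetical: `SketchStrategy`, `SketchDelta`, and `MiniPlaybook` are illustrative stand-ins for `PlaybookStrategy`, `PlaybookDelta`, and `Playbook`, and the fields and pruning thresholds are assumptions rather than the project's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class SketchStrategy:
    strategy_id: str
    description: str
    uses: int = 0
    successes: int = 0

    @property
    def success_rate(self) -> float:
        return self.successes / self.uses if self.uses else 0.0


@dataclass
class SketchDelta:
    """A batch of changes applied to the playbook in one step."""
    add: List[SketchStrategy] = field(default_factory=list)
    record: Dict[str, bool] = field(default_factory=dict)  # id -> succeeded?
    remove: List[str] = field(default_factory=list)


class MiniPlaybook:
    def __init__(self) -> None:
        self.strategies: Dict[str, SketchStrategy] = {}

    def apply_delta(self, delta: SketchDelta) -> None:
        for strategy in delta.add:
            # Existing strategies are kept so their statistics survive.
            self.strategies.setdefault(strategy.strategy_id, strategy)
        for strategy_id, succeeded in delta.record.items():
            strategy = self.strategies.get(strategy_id)
            if strategy is not None:
                strategy.uses += 1
                strategy.successes += int(succeeded)
        for strategy_id in delta.remove:
            self.strategies.pop(strategy_id, None)

    def prune(self, min_uses: int = 3, min_success_rate: float = 0.3) -> List[str]:
        """Drop strategies that have been tried enough and still perform badly."""
        doomed = [sid for sid, s in self.strategies.items()
                  if s.uses >= min_uses and s.success_rate < min_success_rate]
        for sid in doomed:
            del self.strategies[sid]
        return doomed

    def to_json(self) -> str:
        return json.dumps([asdict(s) for s in self.strategies.values()], indent=2)

    @classmethod
    def from_json(cls, payload: str) -> "MiniPlaybook":
        playbook = cls()
        for item in json.loads(payload):
            strategy = SketchStrategy(**item)
            playbook.strategies[strategy.strategy_id] = strategy
        return playbook
```

In this sketch a curator-like component would build a `SketchDelta` from validated insights and call `apply_delta()`, then `prune()`; how the real `ACECurator` coordinates these steps is not specified in this index.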

Main Orchestrator

  • rag_engine.py (363 lines)
    • RAGEngine - Main orchestrator
    • Component initialization
    • Document ingestion API
    • Query processing (with/without ACE)
    • Answer generation
    • Vector store management
    • Statistics tracking

📚 Examples (examples/) - 2 files

  • basic_usage.py (161 lines)

    • System initialization demo
    • Document ingestion examples
    • Simple query execution
    • ACE-enabled queries
    • Result visualization
    • Statistics display
  • ace_evolution_demo.py (207 lines)

    • Knowledge base setup
    • Multiple query execution
    • Performance tracking over time
    • Learning demonstration
    • Insight generation visualization
    • Strategy evolution analysis

🧪 Tests (tests/) - 3 files

  • __init__.py (empty)

    • Test package initialization
  • test_models.py (94 lines)

    • Document model tests
    • Chunk model tests
    • QueryTrajectory tests
    • ReflectionInsight tests
    • PlaybookStrategy tests
    • FusionMethod enum tests
  • test_config.py (87 lines)

    • GeminiConfig validation tests
    • VectorStoreConfig tests
    • ACEConfig tests
    • Config.from_env() tests
    • Error handling tests

📖 Documentation - 5 files

  • README.md (400+ lines)

    • Project overview
    • Feature list
    • Installation guide
    • Quick start
    • Architecture diagrams
    • Configuration reference
    • Advanced usage
    • API documentation
    • Troubleshooting
    • Performance metrics
    • Contributing guidelines
  • PROJECT_SUMMARY.md (350+ lines)

    • Implementation overview
    • Component descriptions
    • Architecture details
    • File structure
    • Key features
    • Usage reference
    • Performance characteristics
    • Next steps
  • IMPLEMENTATION_CHECKLIST.md (500+ lines)

    • Phase-by-phase breakdown
    • Component checklist
    • Feature tracking
    • Verification steps
    • Summary statistics
  • DELIVERY_REPORT.md (450+ lines)

    • Executive summary
    • Deliverables list
    • Technical implementation
    • Code quality metrics
    • Performance characteristics
    • Testing & verification
    • Deployment considerations
  • INDEX.md (this file)

    • Complete file index
    • Purpose descriptions
    • Quick navigation

⚙️ Configuration - 3 files

  • requirements.txt (18 lines)

    • Core dependencies
    • Document processing
    • Development tools
    • Version specifications
  • .env.example (18 lines)

    • Gemini API configuration
    • Vector store settings
    • Document processing config
    • ACE framework settings
    • Application settings
  • setup.py (60 lines)

    • Package metadata
    • Dependencies
    • Installation configuration
    • Development extras
    • Python version requirements

🔧 Utilities - 2 files

  • verify_installation.py (200+ lines)

    • Import verification
    • Dependency checking
    • Configuration validation
    • Model testing
    • Directory verification
    • Summary reporting
  • quickstart.sh (80+ lines)

    • Automated setup script
    • Virtual environment creation
    • Dependency installation
    • Configuration setup
    • Installation verification

📊 Statistics Summary

By Category

Core Package:        3,658 lines (12 files)
Examples:              368 lines (2 files)
Tests:                 181 lines (3 files)
Documentation:      ~1,700 lines (5 files)
Configuration:          96 lines (3 files)
Utilities:            ~280 lines (2 files)
────────────────────────────────────
Total:              ~6,300 lines (27 files)

By Component Type

Configuration:       304 lines (config.py, exceptions.py)
Models:              284 lines (models.py)
External APIs:       348 lines (gemini_client.py)
Storage:             754 lines (vector_store.py, document_processor.py)
ACE Framework:     1,579 lines (playbook.py, generator, reflector, curator)
Orchestration:       363 lines (rag_engine.py)
Examples:            368 lines
Tests:               181 lines
Documentation:     1,700+ lines

🗺️ Quick Navigation

Getting Started

  1. Read README.md for overview
  2. Run ./quickstart.sh for setup
  3. Try examples/basic_usage.py
  4. Review PROJECT_SUMMARY.md

Understanding the Code

  1. Start with ace_rag/__init__.py
  2. Review models.py for data structures
  3. Examine rag_engine.py for orchestration
  4. Study ACE components:
    • playbook.py
    • ace_generator.py
    • ace_reflector.py
    • ace_curator.py

Configuration

  1. Copy .env.example to .env
  2. Review config.py for settings
  3. Check requirements.txt for dependencies

Testing

  1. Review tests/test_models.py
  2. Review tests/test_config.py
  3. Run pytest for all tests
  4. Run verify_installation.py

Extending

  1. Study component interfaces in models.py
  2. Review configuration in config.py
  3. Examine strategy system in playbook.py
  4. Check examples for usage patterns

🔍 Component Dependencies

Dependency Graph

rag_engine.py
├── config.py
├── models.py
├── gemini_client.py
├── vector_store.py
├── document_processor.py
├── playbook.py
├── ace_generator.py
│   ├── gemini_client.py
│   ├── vector_store.py
│   └── playbook.py
├── ace_reflector.py
│   └── gemini_client.py
└── ace_curator.py
    ├── playbook.py
    └── gemini_client.py
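
The graph above can also be written down as data and checked mechanically. The sketch below encodes only the edges shown in the tree and uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to derive one valid reading or import order in which every module appears after the modules it depends on. The `DEPENDENCIES` mapping and the `import_order` helper are names introduced for this example.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each key depends on every module in its value set (edges copied from the tree).
DEPENDENCIES: Dict[str, Set[str]] = {
    "rag_engine.py": {
        "config.py", "models.py", "gemini_client.py", "vector_store.py",
        "document_processor.py", "playbook.py", "ace_generator.py",
        "ace_reflector.py", "ace_curator.py",
    },
    "ace_generator.py": {"gemini_client.py", "vector_store.py", "playbook.py"},
    "ace_reflector.py": {"gemini_client.py"},
    "ace_curator.py": {"playbook.py", "gemini_client.py"},
}


def import_order(graph: Dict[str, Set[str]]) -> List[str]:
    """Return the modules with dependencies first.

    graphlib raises CycleError if the graph contains a circular dependency.
    """
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for module in import_order(DEPENDENCIES):
        print(module)
```

Since `rag_engine.py` depends on every other module in the graph, it always comes last in the resulting order; the relative order of independent modules is not fixed.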

External Dependencies

Core:
- google-generativeai (Gemini API)
- pydantic (Validation)
- python-dotenv (Environment)
- faiss-cpu (Vector search)
- numpy (Numerical)

Processing:
- PyPDF2 (PDF support)

Development:
- pytest (Testing)
- black (Formatting)
- flake8 (Linting)
- mypy (Type checking)

📝 File Purpose Quick Reference

| File | Primary Purpose | Key Classes/Functions |
|------|-----------------|-----------------------|
| config.py | Configuration management | Config, GeminiConfig, ACEConfig |
| models.py | Data structures | Document, Chunk, QueryTrajectory |
| gemini_client.py | API integration | GeminiClient, RateLimiter |
| vector_store.py | Vector search | VectorStore |
| document_processor.py | Document processing | DocumentProcessor |
| playbook.py | Strategy storage | Playbook |
| ace_generator.py | Trajectory generation | ACEGenerator |
| ace_reflector.py | Quality analysis | ACEReflector |
| ace_curator.py | Playbook evolution | ACECurator |
| rag_engine.py | System orchestration | RAGEngine |

Last Updated: October 11, 2025
Version: 0.1.0
Status: Production Ready ✅