# Development guide

This guide provides comprehensive information for developers working on the
CodeGate project.

## Project overview

CodeGate is a configurable generative AI gateway designed to protect developers
from potential AI-related security risks. Key features include:

- Secrets exfiltration prevention
- Secure coding recommendations
- Prevention of AI recommending deprecated/malicious libraries
- Modular system prompts configuration
- Multiple AI provider support with configurable endpoints

## Development setup

### Prerequisites

- Python 3.12 or higher
- [Poetry](https://python-poetry.org/docs/#installation) for dependency
  management
- [Docker](https://docs.docker.com/get-docker/) or
  [Podman](https://podman.io/getting-started/installation) (for containerized
  deployment)
- [Visual Studio Code](https://code.visualstudio.com/download) (recommended IDE)

Note that if you are using pyenv on macOS, you will need a Python build linked
against sqlite installed from Homebrew. macOS ships with sqlite, but it lacks
some required functionality needed in the project. This can be accomplished with:

```
# substitute for your version of choice
PYTHON_VERSION=3.12.9
brew install sqlite
LDFLAGS="-L$(brew --prefix sqlite)/lib" CPPFLAGS="-I$(brew --prefix sqlite)/include" PYTHON_CONFIGURE_OPTS="--enable-loadable-sqlite-extensions" pyenv install -v $PYTHON_VERSION
poetry env use $PYTHON_VERSION
```

### Initial setup

1. Clone the repository:

   ```bash
   git clone https://github.com/stacklok/codegate.git
   cd codegate
   ```

2. Install Poetry following the
   [official installation guide](https://python-poetry.org/docs/#installation)

3. Install project dependencies:

   ```bash
   poetry install --with dev
   ```

## Dashboard CodeGate UI 
### Setting up local development environment

Clone the repository
   ```bash
      git clone https://github.com/stacklok/codegate-ui
      cd codegate-ui
   ```

To install all dependencies for your local development environment, run

```bash
npm install
```

### Running the development server

Run the development server using:

```bash
npm run dev
```

### Build production

Run the build command:

```bash
npm run build
```

### Running production build

Run the production build command:

```bash
npm run preview
```

## Project structure

```plain
codegate/
├── pyproject.toml    # Project configuration and dependencies
├── poetry.lock      # Lock file (committed to version control)
├── prompts/         # System prompts configuration
│   └── default.yaml # Default system prompts
├── src/
│   └── codegate/    # Source code
│       ├── __init__.py
│       ├── cli.py           # Command-line interface
│       ├── config.py        # Configuration management
│       ├── exceptions.py    # Shared exceptions
│       ├── codegate_logging.py       # Logging setup
│       ├── prompts.py       # Prompts management
│       ├── server.py        # Main server implementation
│       └── providers/       # External service providers
│           ├── anthropic/   # Anthropic provider implementation
│           ├── openai/      # OpenAI provider implementation
│           ├── vllm/        # vLLM provider implementation
│           └── base.py      # Base provider interface
├── tests/           # Test files
└── docs/            # Documentation
```

## Development workflow

### 1. Environment management

Poetry commands for managing your development environment:

- `poetry install`: Install project dependencies
- `poetry add package-name`: Add a new package dependency
- `poetry add --group dev package-name`: Add a development dependency
- `poetry remove package-name`: Remove a package
- `poetry update`: Update dependencies to their latest versions
- `poetry show`: List all installed packages
- `poetry env info`: Show information about the virtual environment

### 2. Code style and quality

The project uses several tools to maintain code quality:

- [**Black**](https://black.readthedocs.io/en/stable/) for code formatting:

  ```bash
  poetry run black .
  ```

- [**Ruff**](https://docs.astral.sh/ruff/) for linting:

  ```bash
  poetry run ruff check .
  ```

- [**Bandit**](https://bandit.readthedocs.io/) for security checks:

  ```bash
  poetry run bandit -r src/
  ```

### 3. Testing

#### Unit Tests
To run the unit test suite with coverage:

```bash
poetry run pytest
```

Tests are located in the `tests/` directory and follow the same structure as the
source code.

#### Integration Tests
To run the integration tests, create a `.env` file in the repo root directory and add the
following properties to it:
```
ENV_OPENAI_KEY=<YOUR_KEY>
ENV_VLLM_KEY=<YOUR_KEY>
ENV_ANTHROPIC_KEY=<YOUR_KEY>
```

Next, run import_packages to ensure integration test data is created: 
```bash
poetry run python scripts/import_packages.py
```

Next, start the CodeGate server:
```bash
poetry run codegate serve --log-level DEBUG --log-format TEXT
```

Then the integration tests can be executed by running:
```bash
poetry run python tests/integration/integration_tests.py
```

You can include additional properties to specify test scope and other information. For instance, to execute the tests for Copilot providers, for instance, run:
```bash
CODEGATE_PROVIDERS=copilot CA_CERT_FILE=./codegate_volume/certs/ca.crt poetry run python tests/integration/integration_tests.py
```

### 4. Make commands

The project includes a Makefile for common development tasks:

- `make install`: install all dependencies
- `make format`: format code using black and ruff
- `make lint`: run linting checks
- `make test`: run tests with coverage
- `make security`: run security checks
- `make build`: build distribution packages
- `make all`: run all checks and build (recommended before committing)

## 🐳 Docker deployment

### Build the image

```bash
make image-build
```

### Run the container

```bash
# Basic usage with local image
docker run -p 8989:8989 -p 9090:9090 codegate:latest

# With pre-built pulled image
docker pull ghcr.io/stacklok/codegate:latest
docker run --name codegate -d -p 8989:8989 -p 9090:9090 ghcr.io/stacklok/codegate:latest

# It will mount a volume to /app/codegate_volume
# The directory supports storing Llama CPP models under subdirectory /models
# A sqlite DB with the messages and alerts is stored under the subdirectory /db
docker run --name codegate -d -v /path/to/volume:/app/codegate_volume -p 8989:8989 -p 9090:9090 ghcr.io/stacklok/codegate:latest
```

### Exposed parameters

- CODEGATE_VLLM_URL: URL for the inference engine (defaults to
  [https://inference.codegate.ai](https://inference.codegate.ai))
- CODEGATE_OPENAI_URL: URL for OpenAI inference engine (defaults to
  [https://api.openai.com/v1](https://api.openai.com/v1))
- CODEGATE_ANTHROPIC_URL: URL for Anthropic inference engine (defaults to
  [https://api.anthropic.com/v1](https://api.anthropic.com/v1))
- CODEGATE_OLLAMA_URL: URL for OLlama inference engine (defaults to
  [http://localhost:11434/api](http://localhost:11434/api))
- CODEGATE_LM_STUDIO_URL: URL for LM Studio inference engine (defaults to
  [http://localhost:1234/api](http://localhost:1234/api))
- CODEGATE_APP_LOG_LEVEL: Level of debug desired when running the codegate
  server (defaults to WARNING, can be ERROR/WARNING/INFO/DEBUG)
- CODEGATE_LOG_FORMAT: Type of log formatting desired when running the codegate
  server (default to TEXT, can be JSON/TEXT)

```bash
docker run -p 8989:8989 -p 9090:9090 -e CODEGATE_OLLAMA_URL=http://1.2.3.4:11434/api ghcr.io/stacklok/codegate:latest
```

## Configuration system

CodeGate uses a hierarchical configuration system with the following priority
(highest to lowest):

1. CLI arguments
2. Environment variables
3. Config file (YAML)
4. Default values (including default prompts)

### Configuration options

- Port: server port (default: `8989`)
- Host: server host (default: `"localhost"`)
- Log level: logging verbosity level (`ERROR`|`WARNING`|`INFO`|`DEBUG`)
- Log format: log format (`JSON`|`TEXT`)
- Prompts: system prompts configuration
- Provider URLs: AI provider endpoint configuration

See [Configuration system](configuration.md) for detailed information.

## Working with providers

CodeGate supports multiple AI providers through a modular provider system.

### Available providers

1. **vLLM provider**

   - Default URL: `http://localhost:8000`
   - Supports OpenAI-compatible APIs
   - Automatically adds `/v1` path to base URL
   - Model names are prefixed with `hosted_vllm/`

2. **OpenAI provider**

   - Default URL: `https://api.openai.com/v1`
   - Standard OpenAI API implementation

3. **Anthropic provider**

   - Default URL: `https://api.anthropic.com/v1`
   - Anthropic Claude API implementation

4. **Ollama provider**
   - Default URL: `http://localhost:11434`
   - Endpoints:
     - Native Ollama API: `/ollama/api/chat`
     - OpenAI-compatible: `/ollama/chat/completions`

### Configuring providers

Provider URLs can be configured through:

1. Config file (config.yaml):

   ```yaml
   provider_urls:
     vllm: "https://vllm.example.com"
     openai: "https://api.openai.com/v1"
     anthropic: "https://api.anthropic.com/v1"
     ollama: "http://localhost:11434" # /api path added automatically
   ```

2. Environment variables:

   ```bash
   export CODEGATE_PROVIDER_VLLM_URL=https://vllm.example.com
   export CODEGATE_PROVIDER_OPENAI_URL=https://api.openai.com/v1
   export CODEGATE_PROVIDER_ANTHROPIC_URL=https://api.anthropic.com/v1
   export CODEGATE_PROVIDER_OLLAMA_URL=http://localhost:11434
   export CODEGATE_PROVIDER_LM_STUDIO_URL=http://localhost:1234
   ```

3. CLI flags:

   ```bash
   codegate serve --vllm-url https://vllm.example.com --ollama-url http://localhost:11434
   ```

### Implementing new providers

To add a new provider:

1. Create a new directory in `src/codegate/providers/`
2. Implement required components:
   - `provider.py`: Main provider class extending BaseProvider
   - `adapter.py`: Input/output normalizers
   - `__init__.py`: Export provider class

Example structure:

```python
from codegate.providers.base import BaseProvider

class NewProvider(BaseProvider):
    def __init__(self, ...):
        super().__init__(
            InputNormalizer(),
            OutputNormalizer(),
            completion_handler,
            pipeline_processor,
            fim_pipeline_processor
        )

    @property
    def provider_route_name(self) -> str:
        return "provider_name"

    def _setup_routes(self):
        # Implement route setup
        pass
```

## Working with prompts

### Default prompts

Default prompts are stored in `prompts/default.yaml`. These prompts are loaded
automatically when no other prompts are specified.

### Creating custom prompts

1. Create a new YAML file following the format:

   ```yaml
   prompt_name: "Prompt text content"
   another_prompt: "More prompt text"
   ```

2. Use the prompts file:

   ```bash
   # Via CLI
   codegate serve --prompts my-prompts.yaml

   # Via config.yaml
   prompts: "path/to/prompts.yaml"

   # Via environment
   export CODEGATE_PROMPTS_FILE=path/to/prompts.yaml
   ```

### Testing prompts

1. View loaded prompts:

   ```bash
   # Show default prompts
   codegate show-prompts

   # Show custom prompts
   codegate show-prompts --prompts my-prompts.yaml
   ```

2. Write tests for prompt functionality:

   ```python
   def test_custom_prompts():
       config = Config.load(prompts_path="path/to/test/prompts.yaml")
       assert config.prompts.my_prompt == "Expected prompt text"
   ```

## CLI interface

The main command-line interface is implemented in `cli.py`. Basic usage:

```bash
# Start server with default settings
codegate serve

# Start with custom configuration
codegate serve --port 8989 --host localhost --log-level DEBUG

# Start with custom prompts
codegate serve --prompts my-prompts.yaml

# Start with custom provider URL
codegate serve --vllm-url https://vllm.example.com
```

See [CLI commands and flags](cli.md) for detailed command information.