LLM Verifier is a comprehensive Go application for verifying and benchmarking Large Language Models across multiple providers. The system provides automated testing, performance scoring, and configuration export capabilities.
llm-verifier/
├── cmd/ # Command-line interface
├── llmverifier/ # Core business logic
│ ├── config_export.go # Configuration export functionality
│ ├── verifier.go # Model verification engine
│ ├── analytics.go # Analytics and monitoring
│ └── migration.go # Configuration migration tools
├── providers/ # Provider-specific implementations
├── database/ # Data persistence layer
├── logging/ # Structured logging
└── tests/ # Comprehensive test suite
- Dependency Injection: Services accept dependencies through interfaces
- Strategy Pattern: Different verification strategies for different providers
- Observer Pattern: Event-driven architecture for monitoring
- Factory Pattern: Provider and service instantiation
- Repository Pattern: Data access abstraction
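As a small sketch of how two of these patterns combine, the code below shows provider-specific verification logic behind a common interface (Strategy) selected by a constructor function (Factory). All names here (VerificationStrategy, NewStrategy, the per-provider structs) are hypothetical illustrations, not actual LLM Verifier APIs:

```go
package main

import "fmt"

// VerificationStrategy is the Strategy interface: each provider family
// implements its own verification behavior behind a common method set.
type VerificationStrategy interface {
	Verify(modelID string) string
}

type openAIStrategy struct{}

func (openAIStrategy) Verify(modelID string) string {
	return "openai-verified:" + modelID
}

type anthropicStrategy struct{}

func (anthropicStrategy) Verify(modelID string) string {
	return "anthropic-verified:" + modelID
}

// NewStrategy is the Factory: it instantiates the right strategy by name,
// so callers never depend on concrete provider types.
func NewStrategy(provider string) (VerificationStrategy, error) {
	switch provider {
	case "openai":
		return openAIStrategy{}, nil
	case "anthropic":
		return anthropicStrategy{}, nil
	default:
		return nil, fmt.Errorf("unknown provider %q", provider)
	}
}

func main() {
	s, err := NewStrategy("openai")
	if err != nil {
		panic(err)
	}
	fmt.Println(s.Verify("gpt-4o"))
}
```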
- Go 1.21 or later
- Git
- Make (optional, for build automation)
- Docker (for integration testing)

Clone and setup:
git clone https://github.com/your-org/llm-verifier.git
cd llm-verifier
go mod download
Run tests:
go test ./... -v
Build the application:
go build -o llm-verifier ./cmd/main.go
Run in development mode:
export LLM_VERIFIER_DEBUG=true
./llm-verifier --help
- Create a feature branch:
git checkout -b feature/your-feature
- Make changes with tests
- Run full test suite:
go test ./... -race -cover
- Update documentation if needed
- Submit pull request
The verification process consists of several stages:
- Discovery: Identify available models from providers
- Preparation: Set up test scenarios and prompts
- Execution: Run tests against each model
- Scoring: Calculate performance metrics
- Reporting: Generate comprehensive reports
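As a rough sketch of how these stages compose, the pipeline below chains one function per stage. The helper names and placeholder data are hypothetical, not the project's real implementation:

```go
package main

import "fmt"

// Result is a minimal stand-in for a per-model verification outcome.
type Result struct {
	Model string
	Score float64
}

// discover identifies available models (placeholder list here).
func discover() []string { return []string{"model-a", "model-b"} }

// prepare sets up test scenarios and prompts for each model.
func prepare(models []string) []string { return models }

// execute runs the tests against each model.
func execute(models []string) []Result {
	out := make([]Result, 0, len(models))
	for _, m := range models {
		out = append(out, Result{Model: m})
	}
	return out
}

// score calculates performance metrics (placeholder value here).
func score(results []Result) []Result {
	for i := range results {
		results[i].Score = 0.9
	}
	return results
}

// report generates a summary of the run.
func report(results []Result) string {
	return fmt.Sprintf("verified %d models", len(results))
}

func main() {
	fmt.Println(report(score(execute(prepare(discover())))))
}
```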
Models are scored across multiple dimensions:
type PerformanceScore struct {
OverallScore float64 // Weighted average of all metrics
CodeCapability float64 // Code generation and analysis ability
Responsiveness float64 // API response times
Reliability float64 // Error rates and consistency
FeatureRichness float64 // Advanced feature support
ValueProposition float64 // Cost vs. performance ratio
}
Scoring Weights:
- Code Capability: 25%
- Responsiveness: 20%
- Reliability: 20%
- Feature Richness: 20%
- Value Proposition: 15%
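Given these weights, the overall score is a straightforward weighted average of the five dimensions. The Overall method below is an illustrative sketch of that calculation, not the project's actual code:

```go
package main

import "fmt"

// PerformanceScore mirrors the struct above (minus OverallScore);
// Overall applies the documented weights: 25/20/20/20/15.
type PerformanceScore struct {
	CodeCapability   float64
	Responsiveness   float64
	Reliability      float64
	FeatureRichness  float64
	ValueProposition float64
}

// Overall returns the weighted average of all metrics.
func (s PerformanceScore) Overall() float64 {
	return 0.25*s.CodeCapability +
		0.20*s.Responsiveness +
		0.20*s.Reliability +
		0.20*s.FeatureRichness +
		0.15*s.ValueProposition
}

func main() {
	s := PerformanceScore{0.8, 0.9, 0.7, 0.6, 0.5}
	fmt.Printf("%.3f\n", s.Overall())
}
```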
Each provider implements the Provider interface:
type Provider interface {
SendMessages(ctx context.Context, messages []message.Message, tools []tools.BaseTool) (*ProviderResponse, error)
StreamResponse(ctx context.Context, messages []message.Message, tools []tools.BaseTool) <-chan ProviderEvent
Model() models.Model
}
Supported Providers:
- OpenAI (GPT-3.5, GPT-4, GPT-4o)
- Anthropic (Claude models)
- Google (Gemini)
- Groq (Fast inference)
- Together AI
- Fireworks AI
- And 15+ more providers
Create a new provider file in providers/:
// providers/custom_provider.go
package providers
type CustomProvider struct {
apiKey string
baseURL string
model models.Model
httpClient *http.Client
}
func NewCustomProvider(apiKey, baseURL string, model models.Model) *CustomProvider {
return &CustomProvider{
apiKey: apiKey,
baseURL: baseURL,
model: model,
httpClient: &http.Client{Timeout: 30 * time.Second},
}
}
func (p *CustomProvider) SendMessages(ctx context.Context, messages []message.Message, tools []tools.BaseTool) (*ProviderResponse, error) {
// Convert messages to provider format
requestBody := p.convertMessages(messages)
// Add tools if supported
if len(tools) > 0 {
requestBody.Tools = p.convertTools(tools)
}
// Make API request
resp, err := p.makeRequest(ctx, "POST", "/chat/completions", requestBody)
if err != nil {
return nil, fmt.Errorf("custom provider request failed: %w", err)
}
// Parse response
return p.parseResponse(resp)
}
func (p *CustomProvider) StreamResponse(ctx context.Context, messages []message.Message, tools []tools.BaseTool) <-chan ProviderEvent {
// Implement streaming if supported
ch := make(chan ProviderEvent)
go func() {
defer close(ch)
// Streaming implementation
}()
return ch
}
func (p *CustomProvider) Model() models.Model {
return p.model
}
Update config_export.go to include the new provider:
func NewProvider(providerName models.ModelProvider, opts ...ProviderClientOption) (Provider, error) {
// ... existing cases ...
case models.ProviderCustom:
return &baseProvider[CustomClient]{
options: clientOptions,
client: newCustomClient(clientOptions),
}, nil
// ... rest of cases ...
}
Add custom provider detection in extractProvider():
func extractProvider(endpoint string) string {
// ... existing patterns ...
if strings.Contains(endpoint, "custom-api.com") {
return "custom"
}
// ... existing logic ...
}
Create comprehensive tests:
// providers/custom_provider_test.go
func TestCustomProvider_SendMessages(t *testing.T) {
// Test message sending
}
func TestCustomProvider_StreamResponse(t *testing.T) {
// Test streaming functionality
}
func TestCustomProvider_ErrorHandling(t *testing.T) {
// Test error scenarios
}
Add provider documentation to the user manual and API reference.
- Unit Tests: Individual function/component testing
- Integration Tests: Component interaction testing
- End-to-End Tests: Complete workflow testing
- Performance Tests: Load and performance benchmarking
- Security Tests: Vulnerability and sanitization testing
tests/
├── unit/ # Unit tests (90%+ coverage)
├── integration/ # Integration tests
├── e2e/ # End-to-end tests
├── performance/ # Performance benchmarks
├── security/ # Security validation
└── compatibility/ # Cross-platform testing
Full test suite:
go test ./... -v -race -cover
Specific test categories:
# Unit tests only
go test ./llmverifier -v -short
# Integration tests
go test ./tests/integration -v
# Performance benchmarks
go test -bench=. -benchmem ./...
# Coverage report
go test ./... -coverprofile=coverage.out
go tool cover -html=coverage.out
Basic test structure:
func TestFunctionName(t *testing.T) {
// Arrange
setupTestData()
// Act
result, err := functionUnderTest(input)
// Assert
assert.NoError(t, err)
assert.Equal(t, expectedResult, result)
}
Table-driven tests:
func TestFunctionName(t *testing.T) {
testCases := []struct {
name string
input TestInput
expected TestOutput
}{
{"case1", input1, expected1},
{"case2", input2, expected2},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
result := functionUnderTest(tc.input)
assert.Equal(t, tc.expected, result)
})
}
}
type Provider interface {
SendMessages(ctx context.Context, messages []message.Message, tools []tools.BaseTool) (*ProviderResponse, error)
StreamResponse(ctx context.Context, messages []message.Message, tools []tools.BaseTool) <-chan ProviderEvent
Model() models.Model
}
type ModelVerifier interface {
VerifyModel(ctx context.Context, model models.Model) (*VerificationResult, error)
VerifyMultipleModels(ctx context.Context, models []models.Model) ([]VerificationResult, error)
GetVerificationHistory(modelID string) ([]VerificationResult, error)
}
type ExportOptions struct {
IncludeAPIKey bool // Include API keys in export
MinScore float64 // Minimum score threshold
Providers []string // Specific providers to include
OutputFormat string // Export format
Compression bool // Compress output
}
type VerificationResult struct {
ModelInfo ModelInfo `json:"model_info"`
PerformanceScores PerformanceScore `json:"performance_scores"`
Error string `json:"error,omitempty"`
Timestamp time.Time `json:"timestamp"`
}
LLM Verifier uses structured error handling:
// Custom error types
type VerificationError struct {
ModelID string
Provider string
ErrorType string
Message string
}
func (e *VerificationError) Error() string {
return fmt.Sprintf("[%s] %s: %s", e.Provider, e.ModelID, e.Message)
}
// Error wrapping
if err := verifyModel(model); err != nil {
return fmt.Errorf("model verification failed for %s: %w", model.ID, err)
}
CPU profiling:
go test -cpuprofile=cpu.prof -bench=.
go tool pprof cpu.prof
Memory profiling:
go test -memprofile=mem.prof -bench=.
go tool pprof mem.prof
- Connection Pooling: Reuse HTTP connections
- Request Batching: Group multiple requests
- Caching: Cache verification results
- Parallel Processing: Concurrent verification
- Resource Limits: Control memory and CPU usage
Performance benchmarks:
func BenchmarkVerification(b *testing.B) {
for i := 0; i < b.N; i++ {
verifyModel(testModel)
}
}
Load testing:
func BenchmarkConcurrentVerification(b *testing.B) {
// Test concurrent model verification
sem := make(chan struct{}, 10) // Limit concurrency
// ... benchmark implementation
}
All inputs are validated and sanitized:
func validateInput(input, inputType string) bool {
switch inputType {
case "model_id":
return validateModelID(input)
case "api_key":
return validateAPIKey(input)
case "endpoint":
return validateEndpoint(input)
default:
return false
}
}
API keys and sensitive data are handled securely:
// Mask sensitive data in logs
func maskAPIKey(apiKey string) string {
if len(apiKey) <= 8 {
return "***"
}
return apiKey[:4] + "***" + apiKey[len(apiKey)-4:]
}
// Secure configuration storage
func saveSecureConfig(config map[string]interface{}, filePath string) error {
// Encrypt sensitive fields
encrypted := encryptSensitiveFields(config)
// Save with restrictive permissions
return saveWithPermissions(encrypted, filePath, 0600)
}
Prevent API abuse:
type RateLimiter struct {
mu sync.Mutex
lastSeen map[string]time.Time
limits map[string]time.Duration // minimum interval between requests per provider
}
func (rl *RateLimiter) Allow(provider string) bool {
rl.mu.Lock()
defer rl.mu.Unlock()
if time.Since(rl.lastSeen[provider]) < rl.limits[provider] {
return false
}
rl.lastSeen[provider] = time.Now()
return true
}
Dockerfile:
FROM golang:1.21-alpine AS builder
WORKDIR /app
COPY . .
RUN go build -o llm-verifier ./cmd/main.go
FROM alpine:latest
RUN apk --no-cache add ca-certificates
COPY --from=builder /app/llm-verifier /usr/local/bin/
EXPOSE 8080
CMD ["llm-verifier", "serve"]
Docker Compose:
version: '3.8'
services:
llm-verifier:
build: .
ports:
- "8080:8080"
environment:
- LLM_VERIFIER_DATABASE_URL=postgres://...
volumes:
- ./config:/app/config
Deployment manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
name: llm-verifier
spec:
replicas: 3
template:
spec:
containers:
- name: llm-verifier
image: your-org/llm-verifier:latest
ports:
- containerPort: 8080
env:
- name: LLM_VERIFIER_DATABASE_URL
valueFrom:
secretKeyRef:
name: llm-verifier-secrets
key: database-url
Metrics collection:
// Prometheus metrics
var (
verificationDuration = prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Name: "llm_verifier_duration_seconds",
Help: "Time taken for model verification",
},
[]string{"provider", "model"},
)
verificationErrors = prometheus.NewCounterVec(
prometheus.CounterOpts{
Name: "llm_verifier_errors_total",
Help: "Total number of verification errors",
},
[]string{"provider", "error_type"},
)
)
Logging:
// Structured logging
logger := logrus.New()
logger.SetFormatter(&logrus.JSONFormatter{})
logger.WithFields(logrus.Fields{
"provider": providerName,
"model": modelID,
"duration": duration,
}).Info("Model verification completed")
- Go Style: Follow standard Go formatting (gofmt)
- Documentation: Document all public APIs
- Testing: 90%+ test coverage for new code
- Error Handling: Use error wrapping and structured errors
- Logging: Use structured logging with appropriate levels
feat: add support for new AI provider
fix: resolve ProviderInitError in OpenCode configs
docs: update user manual with troubleshooting guide
test: add comprehensive integration tests
refactor: optimize model verification performance
- Branch naming: feature/description or fix/issue-number
- Tests: All tests pass, new tests added
- Documentation: Updated if needed
- Review: At least one maintainer review
- Merge: Squash merge with descriptive commit message
Bug reports should include:
- Go version and OS
- Full error messages and stack traces
- Steps to reproduce
- Expected vs. actual behavior
- Configuration files (with sensitive data removed)
# Clean and rebuild
go clean -cache
go mod tidy
go build ./...
# Run tests with verbose output
go test -v -run TestFailingTest
# Debug with race detector
go test -race -run TestFailingTest
# Update dependencies
go get -u ./...
# Clean module cache
go clean -modcache
# Profile application
go tool pprof http://localhost:8080/debug/pprof/profile
- Q4 2024: Advanced model comparison tools
- Q1 2025: Real-time performance monitoring dashboard
- Q2 2025: Custom verification test frameworks
- Q3 2025: Multi-cloud provider optimization
- Q4 2025: AI-powered test case generation
- Go 1.22+ Migration: Utilize new language features
- Performance Optimizations: Further reduce latency
- Security Enhancements: Advanced threat detection
- Scalability Improvements: Support for 1000+ concurrent verifications
This developer guide provides comprehensive information for contributing to and extending LLM Verifier. For user-facing documentation, see the User Manual.