The definitive enterprise-grade playbook for building, deploying, and maintaining custom OpenMetadata connectors with comprehensive RBAC, security, and governance frameworks.
This playbook provides a complete blueprint for creating any OpenMetadata connector, from simple file-based connectors to complex data source integrations. Built from production-tested implementations, it includes everything needed for enterprise deployment.
- **Universal Framework**: Adaptable template for any data source (databases, APIs, file systems, cloud services)
- **18+ Parser Templates**: Extensible parsing architecture for all major data formats
- **Enterprise Security**: Comprehensive RBAC, multi-factor authentication, and compliance frameworks
- **Manual Ingestion**: Complete UI-bypass workflow with automated RBAC/IAM validation
- **Hot Deployment**: Zero-downtime installation into existing OpenMetadata containers
- **Security Validation**: Automated RBAC testing and security compliance verification
- Universal Connector Framework
- Key Features & Capabilities
- Quick Start (Any Connector in 10 Minutes)
- Connector Development Guide
- Universal Security & RBAC
- Implementation Patterns
- Production Deployment
- Complete Documentation Index
- Testing & Validation Framework
- Contributing
```mermaid
graph TB
    subgraph "Data Sources (Any)"
        DB[(Databases<br/>MySQL, PostgreSQL, Oracle)]
        APIs[REST APIs<br/>Salesforce, ServiceNow]
        Files[File Systems<br/>HDFS, NFS, FTP, Cloud Storage]
        Stream[Streaming<br/>Kafka, Kinesis, Pulsar]
        Cloud[Cloud Services<br/>BigQuery, Snowflake, Azure]
    end
    subgraph "Universal Security Layer"
        Auth[Authentication<br/>OAuth, SAML, JWT, API Keys]
        RBAC[RBAC Engine<br/>Dynamic Roles & Permissions]
        Audit[Audit Trail<br/>Compliance & Governance]
        Encrypt[Encryption<br/>TLS, Field-level, At-rest]
    end
    subgraph "Universal Connector Engine"
        Discovery[Source Discovery<br/>Schema Detection]
        Parsing[Format Parsing<br/>18+ Format Support]
        Schema[Schema Inference<br/>Type Detection & Mapping]
        Metadata[Metadata Extraction<br/>Lineage, Tags, Quality]
        Transform[Data Transformation<br/>Normalization & Enrichment]
    end
    subgraph "OpenMetadata Platform"
        OMAPI[Metadata API<br/>REST & GraphQL]
        Catalog[Data Catalog<br/>Searchable Metadata]
        Lineage[Data Lineage<br/>End-to-end Tracking]
        Quality[Data Quality<br/>Profiling & Validation]
        UI[Web Interface<br/>User Experience]
    end
    DB --> Discovery
    APIs --> Discovery
    Files --> Discovery
    Stream --> Discovery
    Cloud --> Discovery
    Auth --> RBAC
    RBAC --> Audit
    Audit --> Encrypt
    Discovery --> Parsing
    Parsing --> Schema
    Schema --> Metadata
    Metadata --> Transform
    Transform --> OMAPI
    RBAC --> OMAPI
    OMAPI --> Catalog
    OMAPI --> Lineage
    OMAPI --> Quality
    OMAPI --> UI
    style DB fill:#ff9999
    style APIs fill:#99ccff
    style Files fill:#ffcc99
    style RBAC fill:#99ff99
    style Discovery fill:#ff99ff
    style OMAPI fill:#ccff99
```
| Connector Category | Examples | Implementation Complexity | Timeline |
|---|---|---|---|
| File-based | HDFS, NFS, FTP, Local Files, Cloud Storage | Basic | 1-2 weeks |
| Database | MySQL, PostgreSQL, Oracle, MongoDB | Moderate | 2-4 weeks |
| Cloud Services | BigQuery, Snowflake, Azure Data Lake | Complex | 4-6 weeks |
| API-based | Salesforce, ServiceNow, REST APIs | Moderate | 2-4 weeks |
| Streaming | Kafka, Kinesis, Pulsar, Event Hubs | Advanced | 6-8 weeks |
| Enterprise | SAP, SharePoint, Tableau, Power BI | Advanced | 8-12 weeks |
- Modular Architecture: Pluggable components for any data source type
- 18+ Format Parsers: CSV, JSON, Parquet, Avro, ORC, Excel, Delta Lake, and more
- Schema Auto-Detection: Intelligent schema inference with data type mapping
- Hierarchical Organization: Multi-level structure mapping for complex data sources
- 8 Authentication Methods: JWT, OAuth 2.0, OIDC, SAML, LDAP, IAM Roles, Certificates, mTLS
- Advanced RBAC: Team-based, domain-specific, and dynamic role assignment
- Compliance Ready: GDPR, SOX, HIPAA, PCI-DSS compliance frameworks
- Zero-Trust Architecture: Comprehensive security validation and audit trails
- High Performance: Parallel processing with configurable worker threads
- Scalable Architecture: Kubernetes-native with service mesh support
- Manual Ingestion: Complete UI-bypass workflow with automated security validation
- Hot Deployment: Zero-downtime installation in existing Docker containers
- Security Testing: Automated scripts for RBAC, IAM, and compliance verification
- Comprehensive Monitoring: Real-time alerting, behavior analytics, threat detection
- Auto-Tagging: Rule-based tagging for classification and compliance
- Data Quality: Profiling, validation, and quality metrics
- Privacy Protection: PII detection, data masking, and right-to-be-forgotten
- Audit & Compliance: Immutable audit trails and regulatory reporting
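The privacy-protection capabilities above (PII detection and masking) can be illustrated with a minimal sketch. The regex rules and the `mask` policy here are illustrative assumptions for the example, not the playbook's actual rule set; a real deployment would use a maintained classification library:

```python
import re

# Illustrative PII rules (assumption); extend or replace with a maintained rule set
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def detect_pii(sample_values):
    """Return the set of PII tags matched by any sample value."""
    tags = set()
    for value in sample_values:
        for tag, pattern in PII_PATTERNS.items():
            if pattern.search(str(value)):
                tags.add(tag)
    return tags

def mask(value):
    """Redact everything except the last two characters."""
    s = str(value)
    return "*" * max(len(s) - 2, 0) + s[-2:]
```

Tags produced this way can then drive auto-tagging rules, and masking can be applied to sample data before it is sent to the catalog.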
```bash
git clone https://github.com/your-org/openmetadata-connector-playbook.git universal-connector
cd universal-connector
git checkout universal-playbook-clean
```

```bash
# Copy the appropriate template
cp templates/connectors/file-based-connector.py connectors/my_connector/connector.py
# OR
cp templates/connectors/database-connector.py connectors/my_connector/connector.py
# OR
cp templates/connectors/api-connector.py connectors/my_connector/connector.py
```

```yaml
# config/my-connector-config.yaml
source:
  type: custom-connector
  serviceName: "my-data-source-connector"
  serviceConnection:
    config:
      type: CustomDatabase
      sourcePythonClass: connectors.my_connector.connector.MyConnectorSource
      connectionOptions:
        # Your connector-specific settings
        host: "${DATA_SOURCE_HOST}"
        credentials: "${DATA_SOURCE_CREDENTIALS}"
workflowConfig:
  openMetadataServerConfig:
    hostPort: "http://localhost:8585/api"
    authProvider: "openmetadata"
    securityConfig:
      jwtToken: "${OPENMETADATA_JWT_TOKEN}"
```

```bash
pip install -r requirements.txt
pip install -e .
# Test your connector
export PYTHONPATH=$(pwd)
metadata ingest -c config/my-connector-config.yaml --dry-run
```

Visit your OpenMetadata instance to see the ingested metadata!

Complete Development Guide: see the Connector Development Guide section.
```mermaid
flowchart TD
    Start[Define Data Source] --> Choose{Choose Connector Type}
    Choose -->|File System| FileTemplate[File-based Template]
    Choose -->|Database| DBTemplate[Database Template]
    Choose -->|API| APITemplate[API Template]
    Choose -->|Streaming| StreamTemplate[Streaming Template]
    FileTemplate --> Customize[Customize Template]
    DBTemplate --> Customize
    APITemplate --> Customize
    StreamTemplate --> Customize
    Customize --> Implement[Implement Methods]
    Implement --> Configure[Configure Settings]
    Configure --> Test[Test Locally]
    Test --> Security[Security Setup]
    Security --> Deploy[Deploy]
    Deploy --> Validate[Validate Production]
    style Start fill:#e8f5e8
    style Test fill:#fff3e0
    style Security fill:#ffebee
    style Deploy fill:#e3f2fd
    style Validate fill:#e1f5fe
```
1. Choose Your Template

```bash
# File-based connectors (HDFS, NFS, FTP, cloud storage)
cp templates/connectors/file-based-connector.py connectors/my_connector/
# Database connectors (MySQL, PostgreSQL, etc.)
cp templates/connectors/database-connector.py connectors/my_connector/
# API-based connectors (REST APIs, SaaS platforms)
cp templates/connectors/api-connector.py connectors/my_connector/
# Streaming connectors (Kafka, Kinesis, etc.)
cp templates/connectors/streaming-connector.py connectors/my_connector/
```

2. Implement Core Methods
```python
from typing import Dict, List, Tuple

class MyConnectorSource(CommonConnectorSource):
    def __init__(self, config: dict, metadata_config: dict):
        super().__init__(config, metadata_config)
        # Initialize your connector-specific settings

    def prepare(self):
        """Initialize connection to your data source"""
        pass

    def get_database_names(self) -> List[str]:
        """Return list of databases/schemas/containers"""
        pass

    def get_table_names_and_types(self) -> List[Tuple[str, str]]:
        """Return list of tables/files with their types"""
        pass

    def get_table_metadata(self, table_name: str) -> Dict:
        """Extract metadata for a specific table/file"""
        pass
```

1. Choose Format Parsers
```python
from connectors.parsers.factory import ParserFactory

# In your connector
def get_supported_formats(self):
    return ['csv', 'json', 'parquet', 'avro', 'excel']  # Your formats

def parse_data_source(self, source_path: str):
    parser = ParserFactory.get_parser(source_path)
    return parser.extract_metadata()
```
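To show how an extension-based factory like the one used above might be built, here is a self-contained sketch. The `get_parser`/`extract_metadata` names follow the snippet, but the parser classes and the dispatch table are assumptions for illustration, not the playbook's real implementation:

```python
import os

class BaseParser:
    def __init__(self, path: str):
        self.path = path

    def extract_metadata(self) -> dict:
        raise NotImplementedError

class CsvParser(BaseParser):
    def extract_metadata(self) -> dict:
        return {"path": self.path, "format": "csv"}

class JsonParser(BaseParser):
    def extract_metadata(self) -> dict:
        return {"path": self.path, "format": "json"}

class ParserFactory:
    # Map file extensions to parser classes; extend for the 18+ formats
    _registry = {".csv": CsvParser, ".json": JsonParser}

    @classmethod
    def get_parser(cls, source_path: str) -> BaseParser:
        ext = os.path.splitext(source_path)[1].lower()
        try:
            return cls._registry[ext](source_path)
        except KeyError:
            raise ValueError(f"No parser registered for {ext!r}")
```

Keeping the registry as a class-level dict means new formats are added with one entry rather than a new branch in dispatch logic.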
2. Add Custom Parsers (if needed)

```python
# connectors/my_connector/parsers/custom_parser.py
from typing import Dict, List

from connectors.parsers.base_parser import BaseParser

class MyCustomParser(BaseParser):
    def get_schema(self) -> Dict:
        """Extract schema from your custom format"""
        pass

    def get_sample_data(self) -> List[Dict]:
        """Get sample data for profiling"""
        pass
```
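As a deliberately small illustration of that parser contract, the class below infers a schema from in-memory pipe-delimited lines. The method names come from the template above; the pipe-delimited format itself and the naive int-vs-string inference are invented for the example:

```python
class BaseParser:
    """Stand-in for connectors.parsers.base_parser.BaseParser (assumption)."""

class PipeDelimitedParser(BaseParser):
    def __init__(self, lines):
        # First line is the header; remaining lines are data rows
        self.header, *self.rows = [line.split("|") for line in lines]

    def get_schema(self) -> dict:
        # Infer int vs string per column from the first data row
        types = ["int" if cell.isdigit() else "string" for cell in self.rows[0]]
        return dict(zip(self.header, types))

    def get_sample_data(self) -> list:
        return [dict(zip(self.header, row)) for row in self.rows]
```

A production parser would sample more than one row and handle nulls and quoting, but the shape of the contract is the same.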
1. Implement Authentication

```python
from connectors.security import SecurityManager

class MyConnectorSecurity(SecurityManager):
    def authenticate(self):
        """Implement your authentication logic"""
        pass

    def validate_permissions(self):
        """Check RBAC permissions"""
        pass

    def get_audit_info(self):
        """Return audit information"""
        pass
```

2. Configure RBAC
```yaml
# config/my-connector-rbac.yaml
rbacConfig:
  enableRBAC: true
  roles:
    - name: "MyConnectorReader"
      permissions: ["read"]
    - name: "MyConnectorAdmin"
      permissions: ["read", "write", "admin"]
securityConfig:
  auditLevel: "comprehensive"
  enablePIIDetection: true
  complianceFramework: ["GDPR", "SOX"]
```

1. Unit Testing
```bash
# Test your connector components
python -m pytest tests/test_my_connector.py -v
# Test parsers
python -m pytest tests/test_my_parsers.py -v
# Test security
python -m pytest tests/test_my_security.py -v
```

2. Integration Testing
```bash
# Test connectivity
./scripts/test-connector-connection.sh my-connector
# Test RBAC
./scripts/test-connector-rbac.sh my-connector
# Test manual ingestion
./scripts/run-manual-ingestion.sh config/my-connector-config.yaml
```

1. Hot Deployment
```bash
# Deploy to existing OpenMetadata
./deployment/docker-hotdeploy/deploy-connector.sh my-connector
# Verify deployment
./deployment/docker-hotdeploy/health-check.sh my-connector
```

2. Production Configuration
```yaml
# config/my-connector-production.yaml
source:
  type: my-custom-connector
  serviceName: "production-my-connector"
  serviceConnection:
    config:
      # Production settings with security
      enableRBAC: true
      auditLevel: "comprehensive"
      performanceOptimized: true
      maxWorkerThreads: 10
```

```mermaid
graph TB
    subgraph "Multi-Layer Security Framework"
        Network[Network Security<br/>VPC, Firewalls, mTLS]
        Auth[Authentication<br/>Multi-factor, SSO, Certificates]
        AuthZ[Authorization<br/>RBAC, ABAC, Dynamic Policies]
        Audit[Audit & Compliance<br/>Immutable Logs, Reporting]
    end
    subgraph "Universal Authentication"
        OAuth[OAuth 2.0 / OIDC]
        SAML[SAML SSO]
        JWT[JWT Tokens]
        IAM[Cloud IAM Services]
        LDAP[LDAP / Active Directory]
        APIKeys[API Keys / Service Accounts]
        mTLS[Mutual TLS]
        Custom[Custom Authentication]
    end
    subgraph "Universal RBAC"
        DataSteward[Data Steward]
        DataAnalyst[Data Analyst]
        DataEngineer[Data Engineer]
        SystemAdmin[System Admin]
        ReadOnly[Read-Only User]
        Compliance[Compliance Officer]
    end
    Network --> Auth
    Auth --> AuthZ
    AuthZ --> Audit
    OAuth --> DataSteward
    SAML --> DataAnalyst
    JWT --> DataEngineer
    IAM --> SystemAdmin
    LDAP --> ReadOnly
    APIKeys --> Compliance
    style Network fill:#ffebee
    style Auth fill:#e8f5e8
    style AuthZ fill:#fff3e0
    style Audit fill:#e3f2fd
```
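As one small, concrete piece of the authentication layer above, here is a sketch that decodes a JWT payload and checks its `exp` claim. This is decoding only; signature verification is deliberately omitted and would be mandatory in production (the token layout follows RFC 7519):

```python
import base64
import json
import time

def decode_jwt_payload(token: str) -> dict:
    """Decode the (unverified) payload segment of a JWT."""
    payload_b64 = token.split(".")[1]
    # Restore base64url padding before decoding
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))

def is_expired(token: str, now=None) -> bool:
    """Check the 'exp' claim against `now` (defaults to current time)."""
    claims = decode_jwt_payload(token)
    return claims.get("exp", 0) <= (now if now is not None else time.time())
```

In practice a library such as PyJWT would handle both decoding and signature verification; this sketch only shows what the expiry check itself involves.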
```python
# templates/security/auth_template.py
class UniversalAuthenticator:
    def __init__(self, auth_config: dict):
        self.auth_type = auth_config.get('type')
        self.config = auth_config

    def authenticate(self) -> bool:
        """Universal authentication method"""
        if self.auth_type == 'oauth':
            return self._oauth_auth()
        elif self.auth_type == 'saml':
            return self._saml_auth()
        elif self.auth_type == 'jwt':
            return self._jwt_auth()
        elif self.auth_type == 'iam':
            return self._iam_auth()
        # Add more authentication methods

    def _oauth_auth(self) -> bool:
        """OAuth 2.0 / OIDC authentication"""
        pass

    def _saml_auth(self) -> bool:
        """SAML SSO authentication"""
        pass
```

```python
# templates/security/rbac_template.py
class UniversalRBAC:
    def __init__(self, rbac_config: dict):
        self.roles = rbac_config.get('roles', [])
        self.policies = rbac_config.get('policies', {})

    def check_permission(self, user: str, resource: str, action: str) -> bool:
        """Check if user has permission for action on resource"""
        user_roles = self.get_user_roles(user)
        for role in user_roles:
            if self.role_has_permission(role, resource, action):
                return True
        return False

    def validate_data_access(self, user: str, data_source: str) -> bool:
        """Validate user access to specific data source"""
        pass
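To make the permission-check flow concrete, here is a runnable variant of the same idea. The user-to-role mapping passed to the constructor and the role shape are assumptions for the example (the template leaves `get_user_roles` to the implementer); note this sketch ignores `resource` and checks actions only:

```python
class UniversalRBAC:
    def __init__(self, rbac_config: dict, user_roles: dict):
        # role name -> set of permitted actions
        self.roles = {r["name"]: set(r["permissions"])
                      for r in rbac_config.get("roles", [])}
        # user -> list of role names (assumption: supplied externally)
        self.user_roles = user_roles

    def check_permission(self, user: str, resource: str, action: str) -> bool:
        # Resource-level policies are omitted in this sketch
        return any(
            action in self.roles.get(role, set())
            for role in self.user_roles.get(user, [])
        )

rbac = UniversalRBAC(
    {"roles": [{"name": "data-reader", "permissions": ["read"]},
               {"name": "connector-admin", "permissions": ["read", "write", "admin"]}]},
    {"alice": ["data-reader"], "bob": ["connector-admin"]},
)
```

Unknown users and unknown roles both fall through to a deny, which is the safe default for an RBAC check.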
```yaml
# config/templates/basic-security.yaml
securityConfig:
  authentication:
    type: "jwt"
    jwtConfig:
      tokenUrl: "https://auth.company.com/token"
      audience: "openmetadata"
  rbac:
    enabled: true
    defaultRole: "read-only"
    roles:
      - name: "connector-admin"
        permissions: ["read", "write", "admin"]
      - name: "data-reader"
        permissions: ["read"]
  audit:
    enabled: true
    level: "INFO"
    retention: "90d"
```

```yaml
# config/templates/enterprise-security.yaml
securityConfig:
  authentication:
    type: "saml"
    samlConfig:
      idpUrl: "https://sso.company.com/saml"
      certificate: "/path/to/cert.pem"
  rbac:
    enabled: true
    dynamicRoles: true
    crossAccountAccess: true
  compliance:
    frameworks: ["GDPR", "SOX", "HIPAA"]
    piiDetection: true
    dataClassification: "automatic"
    auditRetention: "7y"
  encryption:
    inTransit: "TLS1.3"
    atRest: "AES256"
    fieldLevel: true
```
```python
# Pattern for HDFS, NFS, FTP, local files, cloud storage, etc.
from typing import Dict, List

class FileBasedConnector(CommonConnectorSource):
    def get_files_list(self) -> List[str]:
        """Get list of files from source"""
        pass

    def process_file(self, file_path: str) -> Dict:
        """Process individual file"""
        parser = ParserFactory.get_parser(file_path)
        return parser.extract_metadata()

    def get_partitions(self, file_path: str) -> List[str]:
        """Extract partition information if applicable"""
        pass
```

```python
# Pattern for MySQL, PostgreSQL, etc.
from typing import Dict, List

class DatabaseConnector(CommonConnectorSource):
    def get_connection(self):
        """Establish database connection"""
        pass

    def get_schemas(self) -> List[str]:
        """Get list of database schemas"""
        pass

    def get_tables(self, schema: str) -> List[str]:
        """Get tables in schema"""
        pass

    def get_table_schema(self, table: str) -> Dict:
        """Extract table schema"""
        pass
```

```python
# Pattern for REST APIs, SaaS platforms
from typing import Dict, Iterator, List

class APIConnector(CommonConnectorSource):
    def authenticate_api(self):
        """API authentication"""
        pass

    def get_endpoints(self) -> List[str]:
        """Get available API endpoints"""
        pass

    def fetch_data(self, endpoint: str) -> Dict:
        """Fetch data from API endpoint"""
        pass

    def paginate_results(self, endpoint: str) -> Iterator[Dict]:
        """Handle API pagination"""
        pass
```

```python
# Pattern for Kafka, Kinesis, etc.
from typing import Dict, List

class StreamingConnector(CommonConnectorSource):
    def get_topics(self) -> List[str]:
        """Get available topics/streams"""
        pass

    def get_schema_registry(self) -> Dict:
        """Get schema information"""
        pass

    def sample_messages(self, topic: str) -> List[Dict]:
        """Sample messages for metadata extraction"""
        pass
```

```yaml
# config/templates/environment-config.yaml
development:
  source:
    connectionOptions:
      host: "dev-server.company.com"
      enableDebug: true
staging:
  source:
    connectionOptions:
      host: "staging-server.company.com"
      enableAudit: true
production:
  source:
    connectionOptions:
      host: "prod-server.company.com"
      enableRBAC: true
      auditLevel: "comprehensive"
```

```yaml
# config/templates/multi-source-config.yaml
sources:
  - name: "primary-source"
    type: "database"
    config:
      host: "primary-db.company.com"
  - name: "secondary-source"
    type: "file-system"
    config:
      path: "/data/files"
  - name: "api-source"
    type: "api"
    config:
      baseUrl: "https://api.company.com"
```
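One common way to consume a per-environment configuration like the one above is to deep-merge the selected environment's section over shared defaults. The `defaults` key and the merge semantics below are assumptions for the sketch; the playbook may resolve environments differently:

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Recursively overlay `override` onto `base` without mutating either."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

def resolve_environment(config: dict, env: str) -> dict:
    """Return the effective config for `env`, assuming a 'defaults' section."""
    defaults = config.get("defaults", {})
    return deep_merge(defaults, config.get(env, {}))
```

Recursing only when both sides are dicts means scalar overrides (hosts, flags) replace the default cleanly while nested option blocks are combined.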
```mermaid
graph TD
    subgraph "Multi-Environment Pipeline"
        Dev[Development<br/>Local Testing]
        Staging[Staging<br/>Integration Testing]
        Prod[Production<br/>Live Deployment]
    end
    subgraph "Container Orchestration"
        Docker[Docker Containers]
        K8s[Kubernetes]
        Helm[Helm Charts]
        Compose[Docker Compose]
    end
    subgraph "Cloud Platforms"
        CloudA[Cloud Platform A]
        CloudB[Cloud Platform B]
        CloudC[Cloud Platform C]
        OnPrem[On-Premises]
    end
    subgraph "Monitoring & Observability"
        Prometheus[Prometheus]
        Grafana[Grafana]
        Jaeger[Jaeger Tracing]
        Logs[Centralized Logging]
    end
    Dev --> Staging
    Staging --> Prod
    Docker --> K8s
    K8s --> Helm
    Compose --> Docker
    CloudA --> Prometheus
    CloudB --> Grafana
    CloudC --> Jaeger
    OnPrem --> Logs
    style Dev fill:#e8f5e8
    style Staging fill:#fff3e0
    style Prod fill:#ffebee
    style K8s fill:#e3f2fd
```
```dockerfile
# templates/deployment/Dockerfile.template
FROM openmetadata/ingestion:1.8.1

# Copy your connector
COPY connectors/ /opt/openmetadata/connectors/
COPY config/ /opt/openmetadata/config/

# Install dependencies
COPY requirements.txt /tmp/
RUN pip install -r /tmp/requirements.txt

# Set up connector
WORKDIR /opt/openmetadata
RUN pip install -e .

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
  CMD python -c "import connectors.my_connector.connector; print('OK')"

ENTRYPOINT ["metadata", "ingest"]
```

```yaml
# templates/deployment/k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-connector-ingestion
  labels:
    app: my-connector
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-connector
  template:
    metadata:
      labels:
        app: my-connector
    spec:
      containers:
        - name: connector
          image: my-connector:latest
          env:
            - name: CONNECTOR_CONFIG
              valueFrom:
                configMapKeyRef:
                  name: connector-config
                  key: config.yaml
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            exec:
              command:
                - python
                - -c
                - "import connectors.my_connector.connector"
            initialDelaySeconds: 30
            periodSeconds: 10
```

```yaml
# templates/deployment/helm/values.yaml
replicaCount: 3
image:
  repository: my-connector
  tag: latest
  pullPolicy: IfNotPresent
service:
  type: ClusterIP
  port: 8080
ingress:
  enabled: true
  annotations:
    kubernetes.io/ingress.class: nginx
  hosts:
    - host: my-connector.company.com
      paths: ["/"]
resources:
  requests:
    memory: 512Mi
    cpu: 250m
  limits:
    memory: 2Gi
    cpu: 1000m
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
```

| Connector Type | Getting Started | Advanced Features | Production Guide |
|---|---|---|---|
| File-based | File Connector Guide | Advanced Parsing | File Connector Production |
| Database | Database Connector Guide | Schema Management | Database Production |
| API-based | API Connector Guide | Rate Limiting | API Production |
| Streaming | Streaming Guide | Real-time Processing | Streaming Production |
```mermaid
graph TD
    Root[Universal Playbook] --> Templates[Templates]
    Root --> Guides[Implementation Guides]
    Root --> Reference[Reference]
    Root --> Examples[Examples]
    Templates --> ConnectorTemplates[Connector Templates]
    Templates --> SecurityTemplates[Security Templates]
    Templates --> DeploymentTemplates[Deployment Templates]
    Templates --> ConfigTemplates[Config Templates]
    Guides --> GettingStarted[Getting Started]
    Guides --> Development[Development Guide]
    Guides --> Security[Security Guide]
    Guides --> Production[Production Guide]
    Reference --> API[API Reference]
    Reference --> Patterns[Design Patterns]
    Reference --> BestPractices[Best Practices]
    Reference --> Troubleshooting[Troubleshooting]
    Examples --> FileExample[File Connector Example]
    Examples --> DBExample[Database Example]
    Examples --> APIExample[API Example]
    Examples --> StreamExample[Streaming Example]
    style Root fill:#ffebcd
    style Templates fill:#e6f3ff
    style Guides fill:#fff2e6
    style Reference fill:#f0e6ff
    style Examples fill:#e6ffe6
```
- Connector Templates - Ready-to-use connector templates
- Security Templates - Authentication and RBAC templates
- Deployment Templates - Docker, Kubernetes, and cloud deployment
- Configuration Templates - Environment and multi-source configurations
- Quick Start Guide - Any connector in 10 minutes
- Development Guide - Step-by-step implementation
- Security Guide - Universal security implementation
- Production Guide - Enterprise deployment best practices
- API Reference - Complete API documentation
- Design Patterns - Common implementation patterns
- Best Practices - Industry best practices
- Troubleshooting - Common issues and solutions
- File Connector Example - Complete HDFS/NFS implementation
- Database Example - MySQL/PostgreSQL connector
- API Example - REST API connector implementation
- Streaming Example - Kafka/Kinesis connector
```mermaid
graph TB
    subgraph "Testing Layers"
        Unit[Unit Tests<br/>Component Testing]
        Integration[Integration Tests<br/>End-to-end Testing]
        Security[Security Tests<br/>RBAC & Auth Testing]
        Performance[Performance Tests<br/>Load & Stress Testing]
        Compliance[Compliance Tests<br/>Regulatory Validation]
    end
    subgraph "Test Categories"
        Connector[Connector Logic]
        Parser[Parser Testing]
        Auth[Authentication]
        RBAC[RBAC Validation]
        Deployment[Deployment Tests]
    end
    subgraph "Automation"
        CI[CI/CD Pipeline]
        Scheduled[Scheduled Tests]
        Monitoring[Continuous Monitoring]
        Alerts[Automated Alerts]
    end
    Unit --> Connector
    Integration --> Parser
    Security --> Auth
    Performance --> RBAC
    Compliance --> Deployment
    Connector --> CI
    Parser --> Scheduled
    Auth --> Monitoring
    RBAC --> Alerts
```
```python
# templates/tests/test_connector_template.py
import pytest

from connectors.my_connector.connector import MyConnectorSource

class TestMyConnector:
    def setup_method(self):
        self.config = {
            'host': 'test-server',
            'credentials': 'test-creds'
        }
        self.connector = MyConnectorSource(self.config, {})

    def test_connection(self):
        """Test basic connectivity"""
        assert self.connector.test_connection()

    def test_get_databases(self):
        """Test database enumeration"""
        databases = self.connector.get_database_names()
        assert isinstance(databases, list)
        assert len(databases) > 0

    def test_get_tables(self):
        """Test table enumeration"""
        tables = self.connector.get_table_names_and_types()
        assert isinstance(tables, list)

    def test_get_metadata(self):
        """Test metadata extraction"""
        metadata = self.connector.get_table_metadata('test_table')
        assert 'schema' in metadata
        assert 'columns' in metadata
```

```python
# templates/tests/test_security_template.py
import pytest

from connectors.security import UniversalAuthenticator, UniversalRBAC

class TestSecurity:
    def test_authentication(self):
        """Test authentication methods"""
        auth_config = {'type': 'jwt', 'token': 'test-token'}
        authenticator = UniversalAuthenticator(auth_config)
        assert authenticator.authenticate()

    def test_rbac_permissions(self):
        """Test RBAC permission checking"""
        rbac_config = {
            'roles': [
                {'name': 'admin', 'permissions': ['read', 'write']}
            ]
        }
        rbac = UniversalRBAC(rbac_config)
        assert rbac.check_permission('admin', 'resource', 'read')

    def test_audit_logging(self):
        """Test audit trail generation"""
        # Test audit log generation
        pass
```
```bash
#!/bin/bash
# templates/tests/integration_test_template.sh
echo "Running Integration Tests for My Connector"

# Test 1: Connectivity
echo "Testing connectivity..."
./scripts/test-connector-connection.sh my-connector
if [ $? -ne 0 ]; then
    echo "Connectivity test failed"
    exit 1
fi

# Test 2: RBAC
echo "Testing RBAC..."
./scripts/test-connector-rbac.sh my-connector
if [ $? -ne 0 ]; then
    echo "RBAC test failed"
    exit 1
fi

# Test 3: Manual Ingestion
echo "Testing manual ingestion..."
./scripts/run-manual-ingestion.sh config/my-connector-test.yaml --dry-run
if [ $? -ne 0 ]; then
    echo "Manual ingestion test failed"
    exit 1
fi

echo "All integration tests passed!"
```

```yaml
# templates/ci-cd/github-actions.yml
name: Universal Connector Tests
on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.8, 3.9, '3.10', 3.11]
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install -e .
      - name: Run unit tests
        run: |
          python -m pytest tests/unit/ -v --cov=connectors
      - name: Run integration tests
        run: |
          ./tests/integration/run_all_tests.sh
      - name: Run security tests
        run: |
          python -m pytest tests/security/ -v
      - name: Upload coverage
        uses: codecov/codecov-action@v3
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Run security scan
        run: |
          pip install bandit safety
          bandit -r connectors/
          safety check
```
```mermaid
graph LR
    Fork[Fork Playbook] --> Choose{Choose Contribution Type}
    Choose -->|New Connector| ConnectorDev[Develop Connector]
    Choose -->|Parser| ParserDev[Develop Parser]
    Choose -->|Security| SecurityDev[Security Enhancement]
    Choose -->|Documentation| DocsDev[Documentation]
    ConnectorDev --> Test[Test Implementation]
    ParserDev --> Test
    SecurityDev --> Test
    DocsDev --> Review[Review Content]
    Test --> PR[Create Pull Request]
    Review --> PR
    PR --> CodeReview[Code Review]
    CodeReview --> Merge[Merge to Main]
    style Fork fill:#e8f5e8
    style Test fill:#fff3e0
    style PR fill:#e3f2fd
    style Merge fill:#e1f5fe
```
- Implement new connector using provided templates
- Follow established patterns and conventions
- Include comprehensive tests and documentation
- Submit with example configurations
- Add support for new data formats
- Extend existing parsers with new features
- Optimize parsing performance
- Include format-specific tests
- Add new authentication methods
- Enhance RBAC capabilities
- Improve compliance features
- Contribute security best practices
- Improve existing documentation
- Add new implementation guides
- Create tutorial content
- Enhance troubleshooting guides
Development Workflow

```bash
# 1. Fork and clone
git clone https://github.com/yourusername/openmetadata-connector-playbook.git
cd openmetadata-connector-playbook
git checkout universal-playbook-clean

# 2. Create feature branch
git checkout -b feature/my-new-connector

# 3. Develop using templates
cp templates/connectors/file-based-connector.py connectors/my_connector/
# Implement your connector...

# 4. Test thoroughly
python -m pytest tests/ -v
./scripts/test-connector-connection.sh my-connector

# 5. Document your work
# Update docs/connectors/my-connector.md
# Add examples/my-connector/

# 6. Submit pull request
git push origin feature/my-new-connector
```

Contribution Checklist
- [ ] Follows established patterns and templates
- [ ] Includes comprehensive unit tests
- [ ] Includes integration tests
- [ ] Includes security validation
- [ ] Documentation is complete and accurate
- [ ] Examples are provided
- [ ] Code follows style guidelines
- [ ] All tests pass in CI/CD pipeline
MIT License - see LICENSE file for details.
This universal playbook is open source and designed to accelerate the development of OpenMetadata connectors across the community.
- Documentation: Comprehensive guides for every connector type
- Examples: Real-world implementations and patterns
- Templates: Ready-to-use scaffolding for rapid development
- Community: Join discussions and get help from experts
- Direct Support: Contact [email protected]
This universal playbook is built on production experience from enterprise implementations and contributions from the OpenMetadata community. Special thanks to all contributors who help make data integration accessible and secure.
**Ready to build your connector?** Choose your connector-type template and have your custom OpenMetadata connector running in production within days, not months!

**Need enterprise security?** Our universal RBAC framework provides battle-tested security patterns for any data source!

**Want to contribute?** Join our contributor community and help build the future of universal data connectivity!