Dataiku IaC Test Suite

Comprehensive test suite for the Dataiku Infrastructure as Code (IaC) implementation.

Overview

This test suite validates all IaC components through:

- Unit tests: Fast, mock-based tests for individual components
- Integration tests: Validation against a real Dataiku instance
- Scenario tests: End-to-end workflow testing
- Performance tests: Scale and performance validation

Current Coverage:

Existing tests: 278 tests (98% pass rate)
New tests: 50+ additional tests
Total coverage: ~90% across all modules

Test Organization

tests/iac/
├── pytest.ini                          # Pytest configuration
├── conftest.py                         # Shared fixtures
├── README.md                           # This file
│
├── unit/                               # FAST - Mock-based tests
│   ├── config/
│   │   └── test_validation_edge_cases.py
│   ├── state/
│   ├── planner/
│   └── sync/
│
├── integration/                        # SLOW - Real Dataiku tests
│   └── test_real_dataiku_sync.py
│
├── scenarios/                          # End-to-end workflows
│   └── test_plan_workflow.py
│
├── performance/                        # Performance & scale tests
│
└── fixtures/                           # Test data
    ├── configs/
    │   ├── simple/                    # Minimal configs
    │   ├── realistic/                 # Real-world scenarios
    │   ├── complex/                   # Complex pipelines
    │   └── edge_cases/                # Invalid/edge case configs
    └── states/                        # Sample state files

Quick Start

1. Run Fast Unit Tests (Recommended for Development)

# All unit tests (mock-based, fast)
pytest -m unit

# Specific test file
pytest tests/iac/unit/config/test_validation_edge_cases.py

# With verbose output
pytest -m unit -v

Run time: ~10-30 seconds

2. Run Integration Tests (Requires Dataiku)

# Set environment variable to enable real Dataiku testing
export USE_REAL_DATAIKU=true

# Run integration tests
pytest -m integration

# Run specific integration test
pytest tests/iac/integration/test_real_dataiku_sync.py -v

Run time: ~1-5 minutes (depending on Dataiku instance)

3. Run Scenario Tests

# End-to-end workflow tests
pytest tests/iac/scenarios/

# Specific scenario
pytest tests/iac/scenarios/test_plan_workflow.py::TestSimpleProjectWorkflow

4. Run Everything

# All tests (unit + integration + scenarios)
pytest

# Skip slow tests
pytest -m "not slow"

# Only smoke tests (quick validation)
pytest -m smoke

Test Markers

Tests are tagged with markers for selective execution:

| Marker | Purpose | Run Time |
|--------|---------|----------|
| `unit` | Fast unit tests with mocks | Seconds |
| `integration` | Real Dataiku instance tests | Minutes |
| `slow` | Performance/scale tests | Minutes |
| `smoke` | Quick smoke tests for CI/CD | Seconds |
| `edge_case` | Edge case and error handling | Seconds |
| `cleanup_required` | Creates resources needing manual cleanup | Varies |

Usage:

# Run only unit tests
pytest -m unit

# Run integration tests
pytest -m integration

# Run everything except slow tests
pytest -m "not slow"

# Run smoke tests only
pytest -m smoke

# Combine markers
pytest -m "unit and edge_case"
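For `-m` selection to work without unknown-marker warnings, pytest has to know about each marker. This suite ships a pytest.ini that presumably declares them; as a hedged alternative, the sketch below registers the same markers from conftest.py through the `pytest_configure` hook. The `MARKERS` name is illustrative, and the descriptions are copied from the marker table.

```python
# conftest.py (sketch): register the markers listed in the marker table.
# MARKERS is an illustrative name; the real suite may declare these in pytest.ini instead.
MARKERS = {
    "unit": "Fast unit tests with mocks",
    "integration": "Real Dataiku instance tests",
    "slow": "Performance/scale tests",
    "smoke": "Quick smoke tests for CI/CD",
    "edge_case": "Edge case and error handling",
    "cleanup_required": "Creates resources needing manual cleanup",
}


def pytest_configure(config):
    """pytest hook: declare each marker so `pytest -m <name>` recognizes it."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

Running `pytest --markers` afterwards should list the registered names alongside pytest's built-in markers.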

Environment Configuration

For Unit Tests (Mock-based)

No configuration needed - tests use mocks.
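To illustrate what "mock-based" means in practice, the sketch below swaps the Dataiku client for a `unittest.mock.MagicMock`, so no instance is contacted. `count_datasets` is a hypothetical helper written for this example and is not part of the IaC code; `get_project` and `list_datasets` follow the public `dataikuapi` client API.

```python
from unittest.mock import MagicMock


def count_datasets(client, project_key):
    """Hypothetical helper: number of datasets in a project."""
    project = client.get_project(project_key)
    return len(project.list_datasets())


def test_count_datasets_uses_mock_client():
    # MagicMock stands in for dataikuapi.DSSClient; no network access happens.
    client = MagicMock()
    client.get_project.return_value.list_datasets.return_value = [
        {"name": "RAW_CUSTOMERS"},
        {"name": "CLEAN_CUSTOMERS"},
    ]

    assert count_datasets(client, "IAC_TEST_PROJECT") == 2
    # The helper should have asked for exactly the requested project.
    client.get_project.assert_called_once_with("IAC_TEST_PROJECT")
```

In this suite such a test would normally live in a class decorated with `@pytest.mark.unit` so that `pytest -m unit` picks it up.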

For Integration Tests (Real Dataiku)

Set these environment variables:

# Required
export USE_REAL_DATAIKU=true

# Optional - defaults provided
export DATAIKU_HOST="http://172.18.58.26:10000"  # Default: local instance
export TEST_PROJECT_PREFIX="IAC_TEST_"            # Default: IAC_TEST_
export TEST_PROJECT_KEY="IAC_TEST_PROJECT"        # Optional: specific project to test

Note: When the tests run on the same machine as the Dataiku instance, an API key is not required.
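The variables above could be read in one place, for example by a fixture in conftest.py. The following is only a sketch: `IntegrationSettings` and `load_settings` are hypothetical names, the defaults are copied from the comments above, and treating only the literal `true` as "enabled" and leaving `TEST_PROJECT_KEY` unset by default are assumptions about how the real fixtures behave.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class IntegrationSettings:
    """Hypothetical container for the documented environment variables."""
    use_real_dataiku: bool
    host: str
    project_prefix: str
    project_key: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> IntegrationSettings:
    """Read the integration-test variables, applying the documented defaults."""
    return IntegrationSettings(
        # Assumption: only "true" (case-insensitive) enables real-instance tests.
        use_real_dataiku=env.get("USE_REAL_DATAIKU", "").lower() == "true",
        host=env.get("DATAIKU_HOST", "http://172.18.58.26:10000"),
        project_prefix=env.get("TEST_PROJECT_PREFIX", "IAC_TEST_"),
        # Assumption: no default project key; None means "not set".
        project_key=env.get("TEST_PROJECT_KEY"),
    )
```

Passing the environment as an argument keeps the function easy to test with a plain dict instead of patching `os.environ`.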

Running Specific Test Categories

Config Validation Tests

# All validation tests
pytest tests/iac/unit/config/

# Specific edge case tests
pytest tests/iac/unit/config/test_validation_edge_cases.py::TestNamingConventionEdgeCases

State Management Tests

# Existing state tests
pytest tests/iac/test_state.py

# State sync integration tests
pytest tests/iac/integration/test_real_dataiku_sync.py::TestStateManagerRealSync

Plan Generation Tests

# Existing planner tests
pytest tests/iac/test_planner.py

# Scenario workflow tests
pytest tests/iac/scenarios/test_plan_workflow.py

Real Dataiku Sync Tests

# All sync tests
export USE_REAL_DATAIKU=true
pytest tests/iac/integration/

# Just project sync
pytest tests/iac/integration/test_real_dataiku_sync.py::TestRealProjectSync

# Just dataset sync
pytest tests/iac/integration/test_real_dataiku_sync.py::TestRealDatasetSync

Test Fixtures

Config Fixtures

Located in fixtures/configs/:

| Fixture | Purpose | Resources |
|---------|---------|-----------|
| `simple/project.yml` | Minimal config for smoke tests | 1 project, 1 dataset |
| `realistic/customer_analytics.yml` | Real-world analytics pipeline | 1 project, 7 datasets, 4 recipes |
| `complex/ml_pipeline.yml` | Complex ML workflow | 1 project, 15+ datasets, 10+ recipes |
| `edge_cases/invalid_naming.yml` | Invalid naming conventions | Invalid config |
| `edge_cases/circular_dependency.yml` | Circular dependencies | Invalid config |
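The `edge_cases/circular_dependency.yml` fixture exists to exercise cycle detection. How the validator actually detects cycles is not described in this README; the sketch below shows one common approach, a depth-first search over a mapping from resource name to the names it depends on. The function name and the input shape are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if there is none."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {name: WHITE for name in deps}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in deps.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so a cycle closes here.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for name in deps:
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None
```

A test against such a function could assert that a config where dataset A feeds B and B feeds A yields a non-empty cycle, while the `simple` and `realistic` configs yield `None`.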

State Fixtures

Located in fixtures/states/:

| Fixture | Purpose |
|---------|---------|
| `empty_state.json` | Empty state for baseline tests |
| `simple_state.json` | State with one project and dataset |
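A test can read one of these files with a small helper like the one sketched here. `load_state_fixture` is a hypothetical name, not a fixture from conftest.py, and the sketch makes no assumption about the state file schema beyond it being JSON.

```python
import json
from pathlib import Path
from typing import Any


def load_state_fixture(name: str, fixtures_dir: Path) -> Any:
    """Load a JSON state fixture such as "empty_state.json" from fixtures_dir/states."""
    path = fixtures_dir / "states" / name
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)
```

From a test module under tests/iac/, `fixtures_dir` would typically be derived from `Path(__file__).parent / "fixtures"`.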

Common Test Scenarios

Test 1: Validate Config Parsing and Validation

# Test config parsing
pytest tests/iac/test_config_parser.py -v

# Test validation edge cases
pytest tests/iac/unit/config/test_validation_edge_cases.py -v

Test 2: Sync State from Real Dataiku

export USE_REAL_DATAIKU=true
export TEST_PROJECT_KEY="YOUR_PROJECT_KEY"

# Sync and validate
pytest tests/iac/integration/test_real_dataiku_sync.py::TestStateManagerRealSync::test_sync_project_with_children -v -s

Test 3: Generate Plan from Config

# Test plan generation workflow
pytest tests/iac/scenarios/test_plan_workflow.py::TestSimpleProjectWorkflow -v

Test 4: End-to-End Workflow (Config → Plan)

# Complete workflow with simple config
pytest tests/iac/scenarios/test_plan_workflow.py::TestSimpleProjectWorkflow::test_empty_to_full_plan -v -s

# Complete workflow with realistic config
pytest tests/iac/scenarios/test_plan_workflow.py::TestRealisticPipelineWorkflow -v -s
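The idea these scenario tests cover, comparing a desired config against the current state to produce a plan, can be pictured with the simplified sketch below. It is not the planner in `dataikuapi.iac`: the `Action` type, the `plan` function, and the dict-based resource representation are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Action:
    """Hypothetical plan entry: what to do to which resource."""
    kind: str         # "create", "update", or "delete"
    resource_id: str


def plan(desired: Dict[str, Any], current: Dict[str, Any]) -> List[Action]:
    """Diff desired resources against current state, keyed by resource id."""
    actions: List[Action] = []
    for rid in sorted(desired):
        if rid not in current:
            actions.append(Action("create", rid))
        elif desired[rid] != current[rid]:
            actions.append(Action("update", rid))
    for rid in sorted(current):
        if rid not in desired:
            actions.append(Action("delete", rid))
    return actions
```

With an empty current state every desired resource becomes a `create` action, which is the situation a test named `test_empty_to_full_plan` suggests. This sketch orders actions alphabetically for determinism; a real planner would more likely order them by dependency, for example a project before its datasets.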

Continuous Integration

Fast CI Pipeline (Unit Tests Only)

# Run in CI - fast feedback
pytest -m unit --tb=short

Run time: ~30 seconds

Full CI Pipeline (Unit + Integration)

# Requires Dataiku instance access
export USE_REAL_DATAIKU=true
export DATAIKU_HOST="http://172.18.58.26:10000"

pytest -m "unit or integration" --tb=short

Run time: ~5 minutes

Pre-Release Validation (Everything)

# Run all tests including slow ones
pytest --tb=short

Run time: ~10-15 minutes

Debugging Failed Tests

Verbose Output

# More detailed output
pytest -v

# Show print statements
pytest -s

# Both
pytest -v -s

Specific Test

# Run single test with full output
pytest tests/iac/unit/config/test_validation_edge_cases.py::TestNamingConventionEdgeCases::test_project_key_with_lowercase_fails -v -s
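The test name above implies that project keys containing lowercase letters are rejected. The exact rule lives in the validator and is not spelled out here; as a rough sketch, assuming keys must start with an uppercase letter and contain only uppercase letters, digits, and underscores, a check could look like this. `PROJECT_KEY_PATTERN` and `is_valid_project_key` are illustrative names.

```python
import re

# Assumption: uppercase letter first, then uppercase letters, digits, or underscores.
PROJECT_KEY_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")


def is_valid_project_key(key: str) -> bool:
    """Return True if the whole key matches the assumed naming convention."""
    return PROJECT_KEY_PATTERN.fullmatch(key) is not None
```

When a naming test fails unexpectedly, trying the offending key against a check like this in a REPL is a quick way to see which character breaks the rule.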

Debug Mode

# Drop into debugger on failure
pytest --pdb

# Drop into debugger on first failure
pytest -x --pdb

Show Fixture Setup

# Show fixture setup/teardown
pytest --setup-show

Coverage Reports

Generate Coverage Report

# Run tests with coverage
pytest --cov=dataikuapi.iac --cov-report=html --cov-report=term-missing

# Open HTML report
open htmlcov/index.html  # macOS
xdg-open htmlcov/index.html  # Linux

Coverage by Module

# Coverage for specific module
pytest tests/iac/test_config_parser.py --cov=dataikuapi.iac.config.parser --cov-report=term-missing

Writing New Tests

Unit Test Template

import pytest
from dataikuapi.iac.config.validator import ConfigValidator

@pytest.mark.unit
class TestMyFeature:
    """Test my new feature"""

    def test_basic_functionality(self):
        """Test basic use case"""
        validator = ConfigValidator()
        # Test logic here
        assert True

    def test_edge_case(self):
        """Test edge case"""
        # Edge case logic
        assert True

Integration Test Template

import pytest

@pytest.mark.integration
@pytest.mark.slow
class TestMyIntegration:
    """Test with real Dataiku"""

    def test_with_real_instance(self, real_client, skip_if_no_real_dataiku):
        """Test against real Dataiku"""
        # Test logic using real_client goes here.
        # Let assertion failures propagate: catching Exception and calling
        # pytest.skip() would report genuine failures as skipped tests.
        assert True

Troubleshooting

Issue: Integration tests skipped

Cause: USE_REAL_DATAIKU not set

Solution:

export USE_REAL_DATAIKU=true
pytest -m integration

Issue: Connection errors to Dataiku

Cause: Dataiku instance not accessible or wrong host

Solution:

# Verify host is correct
export DATAIKU_HOST="http://172.18.58.26:10000"

# Test connection
pytest tests/iac/integration/test_real_dataiku_sync.py::TestConnectionAndAuth::test_client_can_connect -v -s

Issue: Test project not found

Cause: Test project doesn't exist in Dataiku

Solution:

# Use existing project
export TEST_PROJECT_KEY="YOUR_EXISTING_PROJECT"

# If the project is not found, the tests that need it are skipped

Issue: Circular import errors

Cause: PYTHONPATH not set correctly

Solution:

# Run from repository root
cd /opt/dataiku/dss_install/dataiku-api-client-python
pytest

Performance Benchmarks

Expected test execution times:

| Test Suite | Count | Time |
|------------|-------|------|
| Unit tests (mock-based) | ~200 | ~30s |
| Integration tests (real Dataiku) | ~20 | ~2m |
| Scenario tests | ~15 | ~1m |
| Total | ~235 | ~3m |

Note: Times vary based on system and Dataiku instance performance.

Next Steps

Additional Tests to Add

Performance Tests
- Large config files (100+ resources)
- State file size performance
- Sync performance benchmarks

CLI Tests
- All CLI flags and combinations
- Error messages and exit codes
- Output validation

State Corruption Tests
- Recovery from corrupted state files
- Version migration
- Backup/restore

Apply Execution Tests (Week 3)
- Resource creation
- Resource updates
- Resource deletion
- Rollback on failure

Resources

Documentation: ../../docs/IAC_OVERVIEW.md
Planning Docs: ../../dataiku-iac-planning/
Demo Scripts: ../../demos/week2_plan_workflow.py
IaC Source: ../../dataikuapi/iac/

Contributing

When adding new tests:

- Follow existing patterns (see templates above)
- Use appropriate markers (@pytest.mark.unit, @pytest.mark.integration, etc.)
- Add docstrings explaining test purpose
- Use fixtures from conftest.py
- Update this README if adding new test categories

Last Updated: 2025-11-26
Test Suite Version: 2.0
IaC Version: Week 2 (Plan Generation)
