Purpose: These guides are designed to help Claude Code sessions navigate and use the Dataiku Python API effectively. They document common workflows, patterns, gotchas, and best practices.
Audience: Claude Code AI sessions (not human developers)
- 00-project-planning-guide.md ⭐ READ THIS FIRST!
  - Why planning matters for Claude Code (prevents getting lost!)
  - Creating detailed project plans BEFORE coding
  - Naming conventions (UPPERCASE for Snowflake compatibility)
  - Phase-by-phase implementation workflow
  - Visual flow planning and dependencies
  - Progress tracking and checkpoints
- 01-prerequisites-and-setup.md
  - Installation and environment setup
  - API key generation and management
  - Connection verification
  - Common setup issues
- 02-authentication-and-connection.md
  - Scope hierarchy (CRITICAL CONCEPT)
  - Authentication methods
  - Connection patterns
  - Permission handling
- 03-project-operations.md
  - Creating and configuring projects
  - Project metadata and variables
  - Project contents and flow
  - Export/import patterns
- 04-dataset-operations.md
  - Dataset CRUD operations
  - Schema management
  - Reading and writing data
  - Building and partitioning
  - Dataset metadata
- 05-recipe-workflows.md
  - Recipe types and creation
  - Running and monitoring recipes
  - Schema updates
  - Recipe dependencies
- 06-scenario-automation.md
  - Creating and configuring scenarios
  - Scenario steps and triggers
  - Running and monitoring
  - Notifications and reporters
- 07-ml-workflows.md
  - ML task operations
  - Model training and evaluation
  - Saved models and versioning
  - Model deployment
- 08-common-gotchas.md
  - Critical concepts to remember
  - Common errors and solutions
  - Best practices
  - Debugging checklist
- 99-quick-reference.md
  - Cheat sheet for common operations
  - Code snippets
  - Quick patterns
  - Essential reminders
First time or starting a new project? Read in order:

0. Project Planning Guide ⭐ (00-project-planning-guide.md) - START HERE!
1. Prerequisites and Setup (01-prerequisites-and-setup.md)
2. Authentication and Connection (02-authentication-and-connection.md)
3. Project Operations (03-project-operations.md)
4. Dataset Operations (04-dataset-operations.md)
Need specific help? Jump to:
- Planning → 00-project-planning-guide.md ⭐
- Recipes → 05-recipe-workflows.md
- Automation → 06-scenario-automation.md
- ML → 07-ml-workflows.md
- Troubleshooting → 08-common-gotchas.md
- Quick lookup → 99-quick-reference.md
The object model is a strict hierarchy:

```
DSSClient (Instance Level)
        ↓
DSSProject (Project Level)
        ↓
DSSDataset / DSSRecipe / DSSScenario (Item Level)
```

You must go through each level; you cannot skip one.
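In practice this means every item lookup starts from the client. A minimal sketch of the navigation (the project key and dataset name are placeholders; the helper function is hypothetical, not part of the Dataiku API):

```python
def get_dataset_via_hierarchy(client, project_key, dataset_name):
    """Walk DSSClient -> DSSProject -> DSSDataset.

    There is no instance-level shortcut like client.get_dataset(...);
    you must resolve the project first.
    """
    project = client.get_project(project_key)   # DSSProject
    return project.get_dataset(dataset_name)    # DSSDataset
```

The same two-hop pattern applies to recipes (`project.get_recipe`) and scenarios (`project.get_scenario`).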
```python
# ❌ WRONG
settings = dataset.get_settings()
settings.settings['description'] = "New"
# Changes lost!

# ✓ CORRECT
settings = dataset.get_settings()
settings.settings['description'] = "New"
settings.save()  # Critical!
```

Many operations (builds, scenarios, training) are asynchronous. You must wait for completion.
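Where a `wait=True` flag is not available, the same effect can be had with a polling loop. This is a sketch under the assumption that the job object exposes a `get_status()` returning a dict with a `'state'` field; verify against the actual dataikuapi job class before relying on it:

```python
import time

def wait_for_terminal_state(job, timeout_s=600, poll_s=5):
    """Poll an async DSS operation until it reaches a terminal state.

    Assumes job.get_status() returns a dict with a 'state' key
    (e.g. 'RUNNING', 'DONE', 'FAILED', 'ABORTED'). Check the real
    dataikuapi class for the exact field names before using.
    """
    state = None
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = job.get_status().get("state")
        if state in ("DONE", "FAILED", "ABORTED"):
            return state
        time.sleep(poll_s)
    raise TimeoutError(f"Job still {state!r} after {timeout_s}s")
```

Always bound the loop with a timeout; an unbounded `while True` can hang a session indefinitely if a job stalls.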
Strongly recommended: use UPPERCASE for project keys, dataset names, and column names (especially with Snowflake):

- Project keys: `MY_PROJECT` ✓ (recommended)
- Dataset names: `RAW_CUSTOMERS`, `CLEAN_ORDERS` ✓ (recommended for Snowflake)

Why:

- Snowflake requires uppercase table/column names
- Prevents case-sensitivity issues
See 00-project-planning-guide.md for complete naming conventions.
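A small helper can enforce the convention when generating names programmatically. `to_dss_name` is a hypothetical helper, not part of the Dataiku API:

```python
import re

def to_dss_name(raw):
    """Normalize an arbitrary label to UPPERCASE_WITH_UNDERSCORES,
    the convention recommended above for Snowflake compatibility."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", raw)  # collapse spaces/punctuation
    return cleaned.strip("_").upper()

print(to_dss_name("raw customers"))    # RAW_CUSTOMERS
print(to_dss_name("Clean-Orders v2"))  # CLEAN_ORDERS_V2
```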
Project variables are strings; convert them before use:

```python
variables = project.get_variables()
# Note: depending on the dataikuapi version, variables may be nested
# under "standard"/"local" keys; check what get_variables() returns.
batch_size = int(variables["batch_size"])  # Convert!
```

A minimal end-to-end pipeline:

```python
from dataikuapi import DSSClient
import os

# Connect
client = DSSClient(
    os.getenv('DATAIKU_HOST'),
    os.getenv('DATAIKU_API_KEY')
)

# Get project
project = client.get_project("MY_PROJECT")

# Build source
source = project.get_dataset("source_data")
source.build(wait=True)

# Run transformation
recipe = project.get_recipe("transform_data")
recipe.run(wait=True)

# Verify output
output = project.get_dataset("final_output")
metadata = output.get_metadata()
print(f"✓ Output has {metadata.get('recordCount', 0)} rows")
```

Running a scenario and checking its outcome:

```python
# Run scenario
scenario = project.get_scenario("daily_refresh")
scenario_run = scenario.run_and_wait()

if scenario_run.get_outcome() == 'SUCCESS':
    print("✓ Pipeline succeeded")
else:
    print("❌ Pipeline failed")
    # Get logs, send alerts, etc.
```

Environment variables:

```bash
# Required
export DATAIKU_HOST="https://dss.yourcompany.com"
export DATAIKU_API_KEY="your-api-key-here"

# Multi-environment setup
export DATAIKU_DEV_HOST="https://dev-dss.company.com"
export DATAIKU_DEV_API_KEY="dev-key"
export DATAIKU_PROD_HOST="https://prod-dss.company.com"
export DATAIKU_PROD_API_KEY="prod-key"
```

- Official API Docs: https://developer.dataiku.com/latest/api-reference/python/
- Main Dataiku Docs: https://doc.dataiku.com/
- GitHub: https://github.com/dataiku/dataiku-api-client-python
- PyPI: https://pypi.org/project/dataiku-api-client/
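The multi-environment variables shown earlier can be resolved by environment name. A sketch; `resolve_credentials` is a hypothetical helper that falls back to the unprefixed pair when no per-environment variables are set:

```python
import os

def resolve_credentials(env="DEV"):
    """Look up DATAIKU_<ENV>_HOST / DATAIKU_<ENV>_API_KEY, falling back
    to the unprefixed DATAIKU_HOST / DATAIKU_API_KEY pair."""
    env = env.upper()
    host = os.environ.get(f"DATAIKU_{env}_HOST") or os.environ["DATAIKU_HOST"]
    key = os.environ.get(f"DATAIKU_{env}_API_KEY") or os.environ["DATAIKU_API_KEY"]
    return host, key
```

Keeping the lookup in one place avoids accidentally pointing a session at prod with dev credentials (or vice versa).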
These guides are maintained for Claude Code sessions. When updating:
- Keep examples practical and tested
- Include gotchas and common mistakes
- Show both ❌ wrong and ✓ correct patterns
- Focus on what Claude Code needs to know
- Keep code snippets copy-pasteable
- Guide Version: 1.0
- API Version: 14.1.3+
- Last Updated: 2025-11-21
- Python: 3.7+
- ⚠️ Scope hierarchy - Must go through project
- ⚠️ Save settings - Always call `.save()`
- ⚠️ Use UPPERCASE naming - Especially for Snowflake: `MY_PROJECT`, `RAW_CUSTOMERS`
- ⚠️ Variables are strings - Convert types!
- ⚠️ Async operations - Wait for completion
- ⚠️ Schema updates - Call `compute_schema_updates()` then `.apply()`
- ⚠️ Scenario runs - Two-step process
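The "two-step process" for scenario runs means starting the run and then waiting on it as separate operations. A sketch using method names as found in dataikuapi's scenario classes (`run`, `wait_for_scenario_run`, `wait_for_completion`); verify the exact signatures against the version you are running:

```python
def run_scenario_two_step(scenario):
    """Start a scenario run, then wait for it: two distinct steps.

    scenario.run() fires the trigger and returns immediately; the
    returned object is then used to resolve the actual run and block
    on its completion. Assumed method names, check your dataikuapi docs.
    """
    trigger_fire = scenario.run()                        # step 1: fire (async)
    scenario_run = trigger_fire.wait_for_scenario_run()  # step 2: resolve the run
    scenario_run.wait_for_completion()                   #         ...and block on it
    return scenario_run
```

`scenario.run_and_wait()`, shown earlier, collapses both steps into one call; the two-step form is useful when you want the run object early (e.g. to log its id while it is still running).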
Happy automating! 🚀