README.md

default_python

The 'default_python' project was generated using the default-python template.

  • src/: Python source code for this project.
    • src/default_python/: Shared Python code that can be used by jobs and pipelines.
  • resources/: Resource configurations (jobs, pipelines, etc.)
  • tests/: Unit tests for the shared Python code.
  • fixtures/: Fixtures for data sets (primarily used for testing).

Getting started

Choose how you want to work on this project:

(a) Directly in your Databricks workspace, see https://docs.databricks.com/dev-tools/bundles/workspace.

(b) Locally with an IDE like Cursor or VS Code, see https://docs.databricks.com/dev-tools/vscode-ext.html.

(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html.

If you're developing with an IDE, install this project's dependencies using uv.
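A minimal sketch of the install step, assuming uv is already available and the template generated a pyproject.toml at the project root:

```shell
# Create the project's virtual environment and install its dependencies
# from pyproject.toml / uv.lock (assumed to exist at the project root).
uv sync
```

uv places the environment in .venv by default, which IDEs like VS Code and Cursor can pick up as the project interpreter.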

Using this project from the CLI

The Databricks workspace and IDE extensions provide a graphical interface for working with this project. It's also possible to interact with it directly using the CLI:

  1. Authenticate to your Databricks workspace, if you have not done so already:

    $ databricks configure
    
  2. To deploy a development copy of this project, type:

    $ databricks bundle deploy --target dev
    

    (Note that "dev" is the default target, so the --target parameter is optional here.)

    This deploys everything that's defined for this project. For example, the default template would deploy a pipeline called [dev yourname] default_python_etl to your workspace. You can find that resource by opening your workspace and clicking on Jobs & Pipelines.

  3. Similarly, to deploy a production copy, type:

    $ databricks bundle deploy --target prod
    

    Note that the default template includes a job that runs the pipeline every day (defined in resources/sample_job.job.yml). The schedule is paused when deploying in development mode (see https://docs.databricks.com/dev-tools/bundles/deployment-modes.html).

  4. To run a job or pipeline, use the "run" command:

    $ databricks bundle run
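    If the bundle defines more than one runnable resource, you can name one explicitly. A sketch, assuming the job from resources/sample_job.job.yml is registered under the key sample_job (the key name is an assumption, not confirmed by the template):

```shell
# Run a specific resource by its key in the dev target.
# "sample_job" is the assumed resource key from resources/sample_job.job.yml.
databricks bundle run --target dev sample_job
```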
    
  5. Finally, to run tests locally, use pytest:

    $ uv run pytest
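    For reference, a minimal test of the kind pytest discovers under tests/ might look like this. The file name, function, and behavior are hypothetical, not taken from the template; in the real project the function under test would live in src/default_python/ and be imported from there, but it is defined inline here so the sketch is self-contained:

```python
# tests/test_filtering.py -- hypothetical example of a unit test for
# shared code in src/default_python/.

def keep_valid_rows(rows):
    """Return only the rows that carry a non-empty 'id' field."""
    return [row for row in rows if row.get("id")]

def test_keep_valid_rows():
    rows = [{"id": "a"}, {"id": ""}, {"name": "missing id"}]
    assert keep_valid_rows(rows) == [{"id": "a"}]
```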