Skip to content

Explicit dependency between actions #1248

@shykes

Description

@shykes

Overview

This proposes a design in Dagger 0.2 to solve #940.

Problem

Most of the time, dependencies between Dagger tasks are implicit, thanks to the magic of CUE references. If a config value for task foo is referenced by a config value for task bar, Dagger implicitely knows that foo must run before bar.

But sometimes, foo must run before bar even though there is no dependency between their configurations. For example, foo might upload an artifact at a known URL, which bar must later download. In this case, developers need a way to control dependencies between tasks explicitly.

Currently this is achieved by various hacks (cc @grouville who is carrying the oppressive weight of these soul-destroying hacks on his shoulders).

Solution

Instead of hacks, there should be a reliable and intuitive API to achieve this with minimal effort. There are 3 design options being considered:

description Synchronization primitive Wait for one task Wait for multiple tasks Synchronize with high-level actions Works with cue/flow
Option 1 #Wait + id Explicit calls to engine engine.#Wait & { target: foo , completed: { …}} engine.#Wait & { targets: [foo, bar] } High-level action must expose id: #ID (can be #Wait on multiple sub-tasks) Yes
Option 2 if + status CUE control flow + CUE computed values if (foo.status == “completed”) { … } if (foo.status == “completed”) && (bar.status == “completed”) { … } High-level action must expose status: #Status (can be CUE expression combining status of multiple sub-tasks) Maybe?
Option 3 Salt-style convention Engine scans the tree for special fields require: foo require: [foo, bar] High-level action must implement one or more of require, require_any, onfail, onfail_any No (requires patching cue/flow to confer special meaning to Salt-style convention)

See below for details on each option.

Option 1: #Wait + id convention

Spec:

package engine

// Wait for one or more tasks to complete
#Wait: {
  {
    target: #TaskID
  } | {
    targets: […#TaskID]
  }
  // Execute a task when all targets have completed
  completed?: #Task
  // Execute a task when all targets have started
  started?: #Task
}

// Match any engine task
#Task: {
  id: #TaskID
}

#TaskID: string

Example usage:

sleepShort: engine.#Exec & {
  args: [“sleep”, “1”]
}

sleepLong: engine.#Exec & {
  args: [“sleep”, “10”]
}

message: engine.#Wait & {
  targets: [sleepShort.id, sleepLong.id]
  completed: engine.#Exec & {
    args: [“echo”, “done!”]
  }
}

User-defined actions can also be waited on: they simply need to implement the id: #TaskID interface.

For example docker.#Run would do:

package docker

#Run: {
  id: _exec.id
  // …
  _exec: engine.#Exec & {
    // …
  }
}

Option 2: if + status convention

This option requires the CUE if statement to be a reliable synchronization primitive.

All tasks have a status field filled by the runtime:

package engine

// The status of a task
#Status: “running” | “completed” | “failed” | “cancelled”

#Exec: {
  status: #Status
  …
}

#Pull: {
  status: #Status
  …
}
// etc.

Then waiting for a task is as simple as:

sleepShort: engine.#Exec & {
  args: [“sleep”, “1”]
}

sleepLong: engine.#Exec & {
  args: [“sleep”, “10”]
}

if sleepShort.status == “completed” && sleepLong.status == “completed” {
  completed: engine.#Exec & {
    args: [“echo”, “done!”]
  }
}

High-level actions could also be waited on, by implementing the status: engine.#Status convention. For example:

package docker

#Run: {
  status: _exec.status
  _exec: engine.#Exec & {
    …
  }
}

Option 3: Salt-style convention

Copied from @helderco’s comment below

  • require: Requires that a list of target tasks succeed before execution
  • require_in: Reverse dependency, is this even possible? Would be awesome if so.
  • require_any: What if you need an OR logic?
  • onfail: Execute only if a target task fails
  • onfail_in
  • onfail_any

As a global, every task would have to follow a convention, but looks cleaner when used (no need for wrapping):

package engine

#Task: {
    id: #TaskID
    #Requisites
    ...
} 

#TaskID: string

#Requisites: {
    require: [...#Task]
    require_any: [...#Task]
    onfail: [...#Task]
    onfail_any: [...#Task]
    ...
}

#Exec: #Task & {
    args: ...
    ....
}

Example usage:

sleepShort: engine.#Exec & {
  args: [“sleep”, “1”]
}

sleepLong: engine.#Exec & {
  args: [“sleep”, “10”]
}

engine.#Exec & {
    args: [“echo”, “done!”]
    require: [sleepShort, sleepLong]
}

And in user-land:

package docker

#Run: #Requisites & {
    id: _exec.id
    require: _
    // …
    _exec: engine.#Exec & {
        // …
        "require": require
    }
}

Copying all requisites like this doesn't look good, but maybe there's a simpler way to implement?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions