Tasks

A task is one execution of one tool — the smallest unit of work in VTK. Tasks belong to a job; files written by one task are visible to the next under the job’s working directory. See Concepts for the surrounding model.

A task in a job submission consists of a tool identifier, its parameters, and optionally a tool version:

{
    "tool": "<tool>",
    "parameters": {
        "<param1>": "<value1>",
        "<param2>": "<value2>"
    },
    "version": "<optional-tool-version>"
}

If version is omitted, the tool version pinned for your organization is used; if no version is pinned, the latest available version is used. A complete list of available tools and their parameters is in the Tools section.
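The task structure above can also be assembled in code. The following is a minimal Python sketch, not part of VTK: the helper name make_task is hypothetical, and the tool name and parameters are borrowed from the storage:get example further down this page.

```python
import json


def make_task(tool, parameters, version=None):
    """Build one task entry for a job submission (hypothetical helper)."""
    task = {"tool": tool, "parameters": dict(parameters)}
    # Leave "version" out entirely unless a specific tool version is requested.
    if version is not None:
        task["version"] = version
    return task


task = make_task(
    "storage:get",
    {
        "location": "s3://{acme-aws-access-keys}@acme-bucket-a/in/",
        "files": ["video.mp4"],
    },
)
print(json.dumps(task, indent=4))
```

Omitting the key rather than sending an empty value keeps the payload in the same shape as a submission that relies on the default version.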

Tasks API

All task endpoints live underneath their parent job at https://api.vtk.castlabs.com.

List tasks of a job

GET /o/{organization_urn}/jobs/{job_id}/tasks

Permission: vtk:ListTasks on arn:vtk:api::{organization_urn}:jobs/{job_id}/tasks

Query parameters: limit, offset, status, tool (filter by tool name).

curl https://api.vtk.castlabs.com/o/urn:janus:organization:acme/jobs/abcDEFGHIjk/tasks \
  -H 'Authorization: Bearer <token>' \
  -H 'x-castlabs-organization: urn:janus:organization:acme'
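To apply the query parameters, they can be appended to the same URL. The sketch below only builds the URL and sends nothing. The function name list_tasks_url is hypothetical, and the filter values shown are illustrative assumptions (the accepted values for status are not listed here, so that filter is left out).

```python
from urllib.parse import urlencode

BASE_URL = "https://api.vtk.castlabs.com"


def list_tasks_url(organization_urn, job_id, **filters):
    """Build the list-tasks URL, adding only the filters that were given."""
    url = f"{BASE_URL}/o/{organization_urn}/jobs/{job_id}/tasks"
    query = {name: value for name, value in filters.items() if value is not None}
    # urlencode percent-encodes reserved characters such as ":" in tool names.
    return f"{url}?{urlencode(query)}" if query else url


url = list_tasks_url(
    "urn:janus:organization:acme",
    "abcDEFGHIjk",
    tool="storage:get",
    limit=10,
    offset=0,
)
print(url)
```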

Get task details

GET /o/{organization_urn}/jobs/{job_id}/tasks/{task_id}

Permission: vtk:GetTask on arn:vtk:api::{organization_urn}:jobs/{job_id}/tasks/{task_id}

Returns the resolved task: tool identifier, resolved version, status, and parameters.
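A request for a single task might be prepared as in the following Python sketch. It assumes this endpoint takes the same Authorization and x-castlabs-organization headers as the other task endpoints on this page. The function name get_task_request is hypothetical, and the IDs are the placeholder values used in the curl examples.

```python
import urllib.request


def get_task_request(organization_urn, job_id, task_id, token):
    """Prepare (but do not send) a GET request for one task."""
    url = (
        "https://api.vtk.castlabs.com"
        f"/o/{organization_urn}/jobs/{job_id}/tasks/{task_id}"
    )
    headers = {
        "Authorization": f"Bearer {token}",
        # Assumption: same organization header as in the curl examples.
        "x-castlabs-organization": organization_urn,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


request = get_task_request(
    "urn:janus:organization:acme", "abcDEFGHIjk", "12345", "<token>"
)
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request), with a real token in place.
```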

Get task logs

GET /o/{organization_urn}/jobs/{job_id}/tasks/{task_id}/log

Permission: vtk:GetTaskLog on arn:vtk:api::{organization_urn}:jobs/{job_id}/tasks/{task_id}/log

Returns the log messages produced by the task during execution.

curl https://api.vtk.castlabs.com/o/urn:janus:organization:acme/jobs/abcDEFGHIjk/tasks/12345/log \
  -H 'Authorization: Bearer <token>' \
  -H 'x-castlabs-organization: urn:janus:organization:acme'

Response:

{
  "count": 2,
  "results": [
    { "timestamp": 1607374149000, "message": "Running tool..." },
    { "timestamp": 1607374152000, "message": "Done." }
  ]
}

Timestamps are Unix milliseconds.
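Because the timestamps are in milliseconds, they need to be divided by 1000 before being passed to most date libraries. A small Python sketch using the response shown above (the function name format_log is hypothetical):

```python
from datetime import datetime, timezone

response = {
    "count": 2,
    "results": [
        {"timestamp": 1607374149000, "message": "Running tool..."},
        {"timestamp": 1607374152000, "message": "Done."},
    ],
}


def format_log(log_response):
    """Turn log entries into 'ISO-8601 time, message' lines in UTC."""
    lines = []
    for entry in log_response["results"]:
        # Unix milliseconds -> seconds, interpreted as UTC.
        when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
        lines.append(f"{when.isoformat()} {entry['message']}")
    return lines


for line in format_log(response):
    print(line)
```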

Parallel task execution

By default, the tasks of a job are executed sequentially, because a step usually depends on the result of the previous step. For example:

  • the 2nd pass of an H.264 encoding is dependent on the 1st pass

  • the processing of a file depends on its download having completed

But some steps don’t depend on each other and can be executed in parallel:

  • the download of a file from source S2 does not depend on the successful download of a file from S1

  • no H.264 2nd pass depends on the result of another 2nd pass

Appending |p to a tool name tells the execution engine that the task does not depend on the result of the previous task, so the two can be executed in parallel.

[
    { "tool": "do:something" },
    { "tool": "do:something_else_in_parallel|p" }
]

A concrete example — two storage:get tasks fetching from different buckets at the same time:

{
    "tasks": [
        { "tool": "storage:get",
          "parameters": {
              "location": "s3://{acme-aws-access-keys}@acme-bucket-a/in/",
              "files": ["video.mp4"]
          } },
        { "tool": "storage:get|p",
          "parameters": {
              "location": "s3://{acme-aws-access-keys}@acme-bucket-b/in/",
              "files": ["audio.mp4"]
          } }
    ]
}
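The same task list can be generated in code by appending the |p marker to the tool name. This is a minimal Python sketch: the helpers storage_get and parallel are hypothetical names, not part of VTK, and only reproduce the JSON shown above.

```python
import json

PARALLEL_SUFFIX = "|p"


def storage_get(location, files):
    """Build a storage:get task entry (hypothetical helper)."""
    return {
        "tool": "storage:get",
        "parameters": {"location": location, "files": list(files)},
    }


def parallel(task):
    """Return a copy of the task whose tool name carries the |p marker."""
    marked = dict(task)
    if not marked["tool"].endswith(PARALLEL_SUFFIX):
        marked["tool"] += PARALLEL_SUFFIX
    return marked


job = {
    "tasks": [
        storage_get("s3://{acme-aws-access-keys}@acme-bucket-a/in/", ["video.mp4"]),
        parallel(
            storage_get("s3://{acme-aws-access-keys}@acme-bucket-b/in/", ["audio.mp4"])
        ),
    ]
}
print(json.dumps(job, indent=4))
```

Only the second task is marked, since |p describes a task's relationship to the one before it.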