Tasks¶
A task is one execution of one tool — the smallest unit of work in VTK. Tasks belong to a job; files written by one task are visible to the next under the job’s working directory. See Concepts for the surrounding model.
A task in a job submission consists of a tool identifier, its parameters, and optionally a tool version:
{
  "tool": "<tool>",
  "parameters": {
    "<param1>": "<value1>",
    "<param2>": "<value2>"
  },
  "version": "<optional-tool-version>"
}
If version is omitted, the tool version pinned for your organization is used; if no version is pinned, the latest available version is used. A complete list of available tools and their parameters is in the Tools section.
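As a minimal sketch of the structure above, the following Python helper builds a task object and leaves out the version key when none is given. The helper name make_task is hypothetical and not part of VTK; the tool name and parameters are borrowed from the storage:get example later on this page.

```python
import json
from typing import Any, Optional


def make_task(tool: str, parameters: dict[str, Any],
              version: Optional[str] = None) -> dict[str, Any]:
    """Build one task object for a job submission (hypothetical helper)."""
    task: dict[str, Any] = {"tool": tool, "parameters": parameters}
    # Only include "version" when the caller asks for a specific one;
    # omitting the key leaves version selection to the service.
    if version is not None:
        task["version"] = version
    return task


task = make_task(
    "storage:get",
    {
        "location": "s3://{acme-aws-access-keys}@acme-bucket-a/in/",
        "files": ["video.mp4"],
    },
)
print(json.dumps(task, indent=2))
```

Dropping the key entirely, rather than sending an empty or null value, matches the wording "if version is omitted"; how the API treats an explicit null is not stated here.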
Tasks API¶
All task endpoints are nested under their parent job and are served from https://api.vtk.castlabs.com.
List tasks of a job¶
GET /o/{organization_urn}/jobs/{job_id}/tasks
Permission: vtk:ListTasks on arn:vtk:api::{organization_urn}:jobs/{job_id}/tasks
Query parameters: limit, offset, status, tool (filter by tool name).
curl https://api.vtk.castlabs.com/o/urn:janus:organization:acme/jobs/abcDEFGHIjk/tasks \
-H 'Authorization: Bearer <token>' \
-H 'x-castlabs-organization: urn:janus:organization:acme'
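The query parameters can be appended to the same URL. The sketch below only builds the URL and sends nothing. The function name list_tasks_url is hypothetical, and the filter values are illustrative assumptions: this page does not list the accepted status values or the exact format the tool filter expects.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.vtk.castlabs.com"


def list_tasks_url(organization_urn: str, job_id: str, **filters: object) -> str:
    """Build the list-tasks URL; filters whose value is None are dropped."""
    # Keep ":" readable in the URN, as in the curl example above.
    path = f"/o/{quote(organization_urn, safe=':')}/jobs/{quote(job_id, safe='')}/tasks"
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


# Illustrative values only (assumed, not taken from the API reference).
url = list_tasks_url("urn:janus:organization:acme", "abcDEFGHIjk",
                     limit=20, offset=0, tool="storage:get")
print(url)
```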
Get task details¶
GET /o/{organization_urn}/jobs/{job_id}/tasks/{task_id}
Permission: vtk:GetTask on arn:vtk:api::{organization_urn}:jobs/{job_id}/tasks/{task_id}
Returns the resolved task: tool identifier, resolved version, status, and parameters.
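A request for this endpoint could be prepared as in the following sketch. It assumes the endpoint takes the same Authorization and x-castlabs-organization headers as the curl examples for the other task endpoints. The function name task_details_request is hypothetical, and the task ID 12345 is reused from the log example. The request is built but not sent.

```python
import urllib.request

BASE_URL = "https://api.vtk.castlabs.com"


def task_details_request(organization_urn: str, job_id: str, task_id: str,
                         token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for a single task."""
    url = f"{BASE_URL}/o/{organization_urn}/jobs/{job_id}/tasks/{task_id}"
    headers = {
        # Assumed to match the headers used by the other task endpoints.
        "Authorization": f"Bearer {token}",
        "x-castlabs-organization": organization_urn,
    }
    return urllib.request.Request(url, headers=headers, method="GET")


req = task_details_request("urn:janus:organization:acme", "abcDEFGHIjk",
                           "12345", "<token>")
print(req.get_method(), req.full_url)
# Sending is left to the caller, for example urllib.request.urlopen(req).
```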
Get task logs¶
GET /o/{organization_urn}/jobs/{job_id}/tasks/{task_id}/log
Permission: vtk:GetTaskLog on arn:vtk:api::{organization_urn}:jobs/{job_id}/tasks/{task_id}/log
Returns the log messages produced by the task during execution.
curl https://api.vtk.castlabs.com/o/urn:janus:organization:acme/jobs/abcDEFGHIjk/tasks/12345/log \
-H 'Authorization: Bearer <token>' \
-H 'x-castlabs-organization: urn:janus:organization:acme'
Response:
{
  "count": 2,
  "results": [
    { "timestamp": 1607374149000, "message": "Running tool..." },
    { "timestamp": 1607374152000, "message": "Done." }
  ]
}
Timestamps are Unix milliseconds.
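Because the timestamps are in milliseconds, they need to be divided by 1000 before most date libraries accept them. A minimal sketch in Python, using the response shown above; the helper name format_log is hypothetical.

```python
from datetime import datetime, timezone

# Response body copied from the example above.
response = {
    "count": 2,
    "results": [
        {"timestamp": 1607374149000, "message": "Running tool..."},
        {"timestamp": 1607374152000, "message": "Done."},
    ],
}


def format_log(entry: dict) -> str:
    """Render one log entry with an ISO 8601 UTC timestamp."""
    # Unix milliseconds -> seconds, interpreted as UTC.
    when = datetime.fromtimestamp(entry["timestamp"] / 1000, tz=timezone.utc)
    return f"{when.isoformat()} {entry['message']}"


for entry in response["results"]:
    print(format_log(entry))
# 2020-12-07T20:49:09+00:00 Running tool...
# 2020-12-07T20:49:12+00:00 Done.
```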
Parallel task execution¶
By default, the steps of a job are executed sequentially, because a step usually depends on the previous one. For example:
the 2nd pass of an H.264 encoding depends on the 1st pass
the processing of a file depends on its download having completed
But some steps don’t depend on each other and can be executed in parallel:
the download of a file from source S2 does not depend on the successful download of a file from S1
the 2nd pass of one H.264 encoding does not depend on the result of another 2nd pass
Appending |p to a tool name tells the execution engine that the tool does not depend on the result of the previous tool.
[
  { "tool": "do:something" },
  { "tool": "do:something_else_in_parallel|p" }
]
As a concrete example, two storage:get tasks fetch from different buckets at the same time:
{
  "tasks": [
    {
      "tool": "storage:get",
      "parameters": {
        "location": "s3://{acme-aws-access-keys}@acme-bucket-a/in/",
        "files": ["video.mp4"]
      }
    },
    {
      "tool": "storage:get|p",
      "parameters": {
        "location": "s3://{acme-aws-access-keys}@acme-bucket-b/in/",
        "files": ["audio.mp4"]
      }
    }
  ]
}
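To check a task list before submitting it, the |p markers can be read back into groups of tasks that may run together. The sketch below is one client-side reading of the rule (a task marked |p joins the group of the task before it) and is not a description of the execution engine's actual scheduler. The function name group_stages is hypothetical.

```python
PARALLEL_SUFFIX = "|p"


def group_stages(tasks: list[dict]) -> list[list[str]]:
    """Group tool names into stages; a "|p" task joins the previous stage."""
    stages: list[list[str]] = []
    for task in tasks:
        tool = task["tool"]
        is_parallel = tool.endswith(PARALLEL_SUFFIX)
        name = tool[: -len(PARALLEL_SUFFIX)] if is_parallel else tool
        if is_parallel and stages:
            # Marked as independent of the previous task: same stage.
            stages[-1].append(name)
        else:
            # Unmarked tasks (and a leading "|p" task) start a new stage.
            stages.append([name])
    return stages


tasks = [
    {"tool": "storage:get"},
    {"tool": "storage:get|p"},
    {"tool": "do:something"},
]
print(group_stages(tasks))
# [['storage:get', 'storage:get'], ['do:something']]
```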