Transcoding

FFmpeg

ffmpeg:cmd

The latest stable release of the FFmpeg command-line executable is available, with an exhaustive list of codec and library support. The FFmpeg libraries usually available are:
  • libavutil

  • libavcodec

  • libavformat

  • libavdevice

  • libavfilter

  • libavresample

  • libswscale

  • libswresample

  • libpostproc

    NOTE: castLabs is not responsible for paying any royalties that may be incurred by the use of codecs

Full documentation of the tool can be found at: https://ffmpeg.org/ffmpeg-all.html

Command-line arguments and their possible values must be presented as a list.

Arguments:

"-<arg1>", "-<val1>", "-<arg2>", "-<val2>", ...

NOTE: By default the tool is executed so that it stops when any decoding error occurs (-xerror). Such decoding errors can sometimes be safely ignored, but this must be explicitly enabled with the boolean "ignore_decoding_errors".

Example:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "arguments": [
      "-i",
      "tos_180s.mov",
      "-f",
      "mp4",
      "encode/tos.mp4"
    ],
    "ignore_decoding_errors": true
  }
},
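
A slightly fuller sketch in the same format (file names, codec choices, and bitrates are illustrative and not taken from this documentation), encoding H.264 video and AAC audio with standard ffmpeg options:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "arguments": [
      "-i",
      "tos_180s.mov",
      "-c:v",
      "libx264",
      "-b:v",
      "4M",
      "-c:a",
      "aac",
      "-b:a",
      "128k",
      "-movflags",
      "+faststart",
      "-f",
      "mp4",
      "encode/tos_h264.mp4"
    ]
  }
},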

Parameter | Properties | Default | Type
arguments | required [list value] | |
outputdir | optional | out | str
ignore_decoding_errors | optional | false | bool

ffmpeg:loudnorm_analyze

EBU R128 loudness normalization using ffmpeg’s loudnorm filter - analysis pass. Runs an analysis pass to gather information about the source audio.

Full documentation of the tool can be found here: ffmpeg - loudnorm

Optionally, the tool can run a second pass to apply the normalization to the audio. To do that, specify the “output” parameter with the desired ffmpeg arguments.

The data from the analysis pass will automatically be used to set the appropriate ffmpeg filter arguments.

Example:

{
    "parameters": {
        "input_file": "input.mp4",
        "audio_selector": "0:a:0",
        "env_prefix": "my_audio_",
        "integrated_loudness_target: "-24.0",
        "maximum_true_peak": "-2.0",
        "loudness_range_target": "7.0",
        "output": {
            "ffmpeg_args": [
                "-c:a", "libfdk_aac",
                "-ar", "48000",
                "-b:a", "128k",
                "-ac", "2",
                "out.m4a"
            ]
        }
    },
"tool": "ffmpeg:loudnorm_analyze"
}

Returns analysis result in environment variables using the specified prefix. The ffmpeg filter returns the following:

{
    "input_i" : "-27.61",
    "input_tp" : "-4.47",
    "input_lra" : "18.06",
    "input_thresh" : "-39.20",
    "output_i" : "-16.58",
    "output_tp" : "-1.50",
    "output_lra" : "14.78",
    "output_thresh" : "-27.71",
    "normalization_type" : "dynamic",
    "target_offset" : "0.58"
}

Given a prefix of “my_audio_” this tool sets the following environment variables:

my_audio_input_i: -27.61
my_audio_input_tp: -4.47
my_audio_input_lra: 18.06
my_audio_input_thresh: -39.20
my_audio_output_i: -16.58
my_audio_output_tp: -1.50
my_audio_output_lra: 14.78
my_audio_output_thresh: -27.71
my_audio_normalization_type: dynamic
my_audio_target_offset: 0.58
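
When the "output" parameter is given, the tool applies these measurements automatically in its second pass. Purely as an illustrative sketch (assuming the {variable} substitution shown for transcode:cropdetect below also applies to these environment variables), the measurements could instead be fed into a manual loudnorm second pass via ffmpeg:cmd:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "arguments": [
      "-i",
      "input.mp4",
      "-map",
      "0:a:0",
      "-af",
      "loudnorm=I=-24.0:TP=-2.0:LRA=7.0:measured_I={my_audio_input_i}:measured_TP={my_audio_input_tp}:measured_LRA={my_audio_input_lra}:measured_thresh={my_audio_input_thresh}:offset={my_audio_target_offset}:linear=true",
      "-ar",
      "48000",
      "-c:a",
      "libfdk_aac",
      "-b:a",
      "128k",
      "out.m4a"
    ]
  }
},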

Parameter | Properties | Default | Type | Description
input_file | required | | | input file to pass to ffmpeg
audio_selector | optional | 0:a:0 | str | ffmpeg's input selector to use to select the desired audio track
env_prefix | optional | my_audio_ | str | prefix to use for environment variable names
integrated_loudness_target | optional | -24.0 | str | integrated loudness target (I)
maximum_true_peak | optional | -2.0 | str | maximum true peak (TP)
loudness_range_target | optional | 7.0 | str | loudness range target (LRA)
output | optional | {} | dict | output settings

Dolby

dolby_encoding:dv5_preproc

Dolby Vision profile 5 pre-processor is a command-line tool that converts Dolby Vision mezzanine input to profile 5 raw video (YUV 420 10-bit) and RPU metadata.

Full documentation: Dolby Encoding Engine Docs

{
  "tool": "dolby_encoding:dv5_preproc",
  "parameters": {
        "decoder": "jpeg2000=kakadu_dlb:thread_num=4",
        "input_video": "media/video.mxf",
        "input_format": "jpeg2000_mxf",
        "temp_dir": "/tmp",
        "output_rpu": "/tmp/test.rpu"
        "output": "named_pipe://PIPE:buffer_size=256000:timeout=5000",
        "max_scene_frames": "96"
    }
}

Parameter | Properties | Default | Type | Choice | Description
input_format | required | | | | Input format followed by format-specific options.
input_video | required | | | | Input file or input directory, in case of list-based input.
output | required | | str | | Output YUV file (10-bit 420 planar).
output_rpu | required | | | | Output RPU file
concurrent_processing | optional | 1 | int | '1', '0', 1, 0 | Enable concurrent processing.
show_frames | optional | none | str | 'all', 'counters', 'decoder', 'demuxer', 'none', 'resizer', 'transformer' | Show more details about processed video frames.
input_metadata | optional | | | | Optional input metadata file. If not specified, the application attempts to extract metadata from the input.
metadata_offset | optional | 0 | int | | Offset added to each frame index, when accessing frame metadata from the source.
start | optional | 0 | int | | Start frame.
duration | optional | -1 | int | | Duration (in frames). Value -1 means "up to the last frame".
decoder | optional | | | | Decoder plugin and decoder options to be used. If not specified, the application attempts to select plugin automatically.
resize_options | optional | | | | Resize options. Syntax: "option1=value1:option2=value2".
max_scene_frames | optional | 255 | int | | Maximum number of frames processed as a single scene [1:255].
l11 | optional | auto | str | | Dolby Vision level 11 metadata.
l5 | optional | auto | str | | Dolby Vision level 5 metadata.
image_transformer | optional | | | | Name of image transformer plugin to be used followed by configuration parameters.
temp_dir | optional | | | | Directory to store temporary files.
keep_temp | optional | 0 | int | '1', '0', 1, 0 | Keep temporary files after execution.

dolby_encoding:ves_muxer

Dolby Vision VES muxer is a command-line tool that combines a base-layer elementary stream and, optionally, an enhancement-layer elementary stream with Dolby Vision RPU metadata.

Full documentation: Dolby Encoding Engine Docs

Example:

{
  "tool": "dolby_encoding:ves_muxer",
  "parameters": {
        "input_bl": "video_in.h265",
        "input_rpu": "test.rpu",
        "overwrite": "1",
        "output": "video_out.h265"
    }
}

Parameter | Properties | Description
input_bl | required | Input base-layer file.
output | required | Output bitstream file.
input_el | optional | Input enhancement-layer file.
input_rpu | optional | Input RPU file.

dolby_encoding:post_process

Dolby Vision post-processor is a tool that post-processes the muxed VES file produced by the VESMuxer tool.

Example:

{
  "tool": "dolby_encoding:post_process",
  "parameters": {
        "input": "video_in.h265",
        "dv_profile": "5",
        "overwrite": "1",
        "output": "video_out.h265"
    }
}

Parameter | Properties | Default | Type | Choice | Description
input | required | | | | Input VES muxed file.
dv_profile | required | | | | Dolby Vision bitstream profile ID. Values: 32.3
output | required | | | | Output bitstream file.
input_bl | optional | | | | Input base-layer file. This option is only available for Dolby Vision profile 7.
compression | optional | 0 | int | '1', '0', 1, 0 | RPU compression. 0 means "no compression", 1 means "compression by IDR". This option is only available for Dolby Vision profile 32.
update_l4 | optional | 1 | int | '1', '0', 1, 0 | Update Dolby Vision level 4 metadata.
update_l6 | optional | 1 | int | '1', '0', 1, 0 | Update Dolby Vision level 6 metadata.
max_cll | optional | auto | str | | Maximum content light level and maximum frame average light level in units of 1 nit.
l1_filtering | optional | auto | str | 'auto', 'none', 'sliding' | Filter type for metadata postprocessing. 'auto' means the filter is selected automatically based on the 'dv_profile' value.
l5 | optional | auto | str | | Target active area offset. Dolby Vision level 5 metadata represented by comma-separated integers, each representing one of the letterbox's margins in pixels.
l11 | optional | auto | str | | Dolby Vision level 11 metadata represented by comma-separated integers.
scale_metadata | optional | auto | str | | Metadata scaling, from source resolution to target resolution.
output_bl | optional | | | | Output post-processed base-layer file. This option is only available for Dolby Vision profile 7.

Metadata

vorbis:cover_art

Embeds one or more JPEG or PNG images into an OGG file.

This is achieved by placing one or more base64-encoded binary FLAC picture structures within VorbisComment, each with the tag name METADATA_BLOCK_PICTURE.

You can instruct the tool to remove existing METADATA_BLOCK_PICTURE comments or leave them alone.

IMAGES: list of dictionaries which need to have the following keys:

`input_img`     input image. Supported formats: PNG, JPEG
`img_type`      ID3v2 type. Can be one of:
    0 - Other
    1 - 32x32 pixels 'file icon' (PNG only)
    2 - Other file icon
    3 - Cover (front)
    4 - Cover (back)
    5 - Leaflet page
    6 - Media (e.g. label side of CD)
    7 - Lead artist/lead performer/soloist
    8 - Artist/performer
    9 - Conductor
    10 - Band/Orchestra
    11 - Composer
    12 - Lyricist/text writer
    13 - Recording Location
    14 - During recording
    15 - During performance
    16 - Movie/video screen capture
    17 - A bright coloured fish
    18 - Illustration
    19 - Band/artist logotype
    20 - Publisher/Studio logotype

`description`   Optional description of the image

Example:

{
  "tool": "vorbis:cover_art",
  "parameters": {
    "input_ogg": "in_file.ogg",
    "output_ogg": "out_file.ogg",
    "remove_existing_pictures": true,
    "images": [
        {
            "input_img": "front_cover.jpg",
            "img_type": 3,
            "description": "Front cover of the album"
        },
        {
            "input_img": "back_cover.jpg",
            "img_type": 4,
        }
    ]
  }
}

Parameter | Properties | Default | Type | Description
images | required [list value] | | dict | List of dictionaries representing keys, see IMAGES above
input_ogg | required | | str | input ogg file to process
output_ogg | required | | str | output ogg file to generate
remove_existing_pictures | optional | false | bool | removes existing METADATA_BLOCK_PICTURE comments

Subtitles

captions:ccextractor

We deploy a forked version of this tool (http://www.ccextractor.org):
https://github.com/encoreinteractive/ccextractor.git
Extracts closed captions and teletext subtitles from video streams.
(DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo, Dish Network, .mp4, HDHomeRun are known to work).

Syntax: ccextractor [options] inputfile1 [inputfile2…] [-o outputfilename]

Command-line arguments must be presented as a list.

Parameters:

"<key>": "<value>"

Arguments:

"-<arg1>", "-<arg2>", ...

Example:

{
  "tool": "captions:ccextractor",
  "parameters": {
    "arguments": [
      "movie1.mp4",
      "-out=webvtt-full",
       "--webvtt-no-line",
       "--webvtt-no-css",
       "-o",
       "movie1-en.vtt"
    ]
  }
},

Parameter | Properties
arguments | required [list value]

captions:convert

Captions and Subtitles Converter

Example:

{
    "parameters": {
        "inputfile": "IN/tearsofsteel_4k_eng.srt",
        "language": "eng",
        "outputformat": "WEBVTT",
        "outputfile": "tearsofsteel-test.vtt"
    },
    "tool": "captions:convert"
},

Parameter | Properties | Default | Type | Choice | Description
inputfile | required | | | |
outputfile | required | | | |
outputformat | required | | | 'WEBVTT', 'SAMI', 'DFXP', 'SRT', 'SCC', 'SST' |
language | optional | | | | set language tag for output (auto detection if supported by format)
offset | optional | PT0S | timedelta | |
force_input_language | optional | | | | overwrite language tag for output if autodetection fails (if supported by format)
strip_html_tags | optional | false | bool | | strip HTML tags when converting.
remove_layout_info | optional | false | bool | | remove the layout information from subtitles. This ensures output subtitles will not have any layout info.
merge_captions | optional | false | bool | | merge captions with the same start time.
merge_captions_layout | optional | false | bool | | when merging captions, merge the layout as well. If true, the layout of the first caption will be used.
vtt_force_hours | optional | false | bool | | when outputting to WEBVTT, force writing full timestamps (hh:mm:ss.xxx) even if the "hh" part is 00
vtt_sequence_numbers | optional | false | bool | | include sequence numbers when writing WEBVTT format.
strip_ass_tags | optional | false | bool | | strip ASS tags when converting.
video_width | optional | | int | | width of the video the subtitles will be displayed on. If specified, height is required too.
video_height | optional | | int | | height of the video the subtitles will be displayed on. If specified, width is required too.
position | optional | bottom | str | 'bottom', 'top', 'source' | specifies the position of subtitles. Currently limited to SST and SRT output subtitle types. Currently source is only supported for SST.
avoid_same_start_prev_end | optional | true | bool | | avoid start time of a subtitle being the same as end time of previous one. Currently only supported when output format is SST
scenarist_compat | optional | false | bool | | activate compatibility mode when writing SST format. Should produce more compatible files, but can break older systems.
tiff_compression | optional | tiff_deflate | str | 'tiff_deflate', 'raw' | TIFF compression to use when output format is SST.
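
For instance, a hypothetical conversion to SST with explicit video dimensions and top positioning might look as follows (file names and values are illustrative only, not taken from this documentation):

{
    "parameters": {
        "inputfile": "IN/movie_eng.scc",
        "outputfile": "movie_eng.sst",
        "outputformat": "SST",
        "language": "eng",
        "video_width": 1920,
        "video_height": 1080,
        "position": "top",
        "tiff_compression": "tiff_deflate"
    },
    "tool": "captions:convert"
},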

Thumbnails

thumbs:generate

Generates thumbnails from a video file, either to include in packaging or for side-loading.

Example:

{
  "tool": "thumbs:generate",
  "parameters": {
    "inputfile": "tos_5s_video_1920x1080_2mbps.mp4"
  }
},

Parameter | Properties | Default | Type | Description
inputfile | required | | | Video file to extract thumbnails from.
duration | optional | 5 | int | Duration per thumbnail in seconds. 1 = 1 thumbnail per second, 10 = 1 thumbnail per 10 seconds.
outputdir | optional | thumbnails | str | Output directory
height | optional | 180 | int |
grid_width | optional | 8 | int |
grid_height | optional | 8 | int |
filename | optional | thumbs | str |
quality | optional | 60 | int |
skip_generating | optional | false | bool | Skip thumbnail generation. Assumes "outputdir" has thumbnails named using ffmpeg pattern %d.jpg
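
A fuller sketch with explicit grid and quality settings (the values below are illustrative, not recommended defaults):

{
  "tool": "thumbs:generate",
  "parameters": {
    "inputfile": "tos_5s_video_1920x1080_2mbps.mp4",
    "duration": 10,
    "height": 360,
    "grid_width": 5,
    "grid_height": 5,
    "outputdir": "thumbnails",
    "filename": "thumbs",
    "quality": 80
  }
},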

Transcode

transcode:cropdetect

Automatic video frame crop detection based on the ffmpeg command line.

Full documentation of the tool's crop detection can be found here: ffmpeg - cropdetect

This implementation covers accurate detection of black bars across entire movies, with predefined parameters. The output will be stored in the environment variable "cropdetect", which can be used for cropping instructions in subsequent ffmpeg commands.

Example:

{
  "parameters": {
    "inputfile": "movie.mov"
  },
  "tool": "transcode:cropdetect"
},

Returns the frame dimensions without black bars in the form width:height:columns_indent:rows_indent, for example:

crop=1920:1072:0:4

Example of cropping instructions using that output in a later ffmpeg command:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "ignore_decoding_errors": "false",
    "arguments": [
      "-i",
      "{input}",
      "-filter_complex", ""[0:v]{cropdetect}[cropped];[cropped]scale='(trunc((ih*((16/9)/2))*2))':ih,setsar=1[cropped]"",
      "-map", "[cropped]
      "-f",
      "mp4",
      "{output}"
    ]
  }
},

Parameter | Properties
inputfile | required

Watermarking

content_armor:profile

ContentArmor’s Profiling tool.

The profiling tool prepares the content for embedding.

Input and output are single video files.

You can read more about ContentArmor here: ContentArmor

Example:

{
  "tool": "content_armor:profile",
  "parameters": {
        "input_file": "tos.mp4",
        "output": "operator_inv",
        "payload_size": 16,
        "content_name": "Tears of Steel",
        "store_location": "store/operator_inv",
        "license": {CA_PROF_LIC}
    }
}

Parameter | Properties | Default | Type | Description
input_file | required | | | Video file which conforms to ContentArmor specification, i.e. AVC/HEVC containing non-reference B frames (mp4, ts, 264, 265, avc, hevc, mkv). In ABR SEI aligned mode, the input file must be a DASH manifest or HLS playlist.
content_name | required | | | A name to be used to archive the processed asset
output_file | optional | profiled.mp4 | str | profiled file name. Ignored in ABR SEI aligned mode
output_folder | optional | profiled | str | output folder for profiled file. Only used in ABR SEI aligned mode.
export_work | optional | export_work.tar.gz | str | extraction helper file
license | optional | | str | Content-Armor profiler license
payload_size | optional | 16 | int | Payload size of the mark in bits (4/8/16/24/32/48)
embed_encrypt | optional | false | bool | Profile for embedding in the encrypted domain. Defaults to False
abr_sei_aligned | optional | false | bool | Activate ABR SEI aligned mode in Content Armor. Defaults to False

content_armor:ABIngestCL

ContentArmor’s ABIngest tool with modifications from castLabs.

These modifications include:

  • processing audio

  • creating proper output file names complying with the DASH-IF AB Ingest specification (i.e. suitable for Akamai).

  • support for additional DASH flavours

  • proper processing of the export file

  • creating playlists for distribution as well as helper playlists for VTK packaging jobs

The A/B Ingest tool consists of a profiler and an embedder.

The input to the tool is a video playlist (HLS/DASH, segmented or fragmented). The outputs are:

  • pre-watermarked video A

  • pre-watermarked video B

  • output playlist which may be directly served via CDN

  • watermark forensic metadata (WFM) corresponding to the input video. This file is used for online detection of watermarks.

You can read more about ContentArmor here: ContentArmor

Example:

{
  "tool": "content_armor:ABIngestCL",
  "parameters": {
        "input_file": "tos.m3u8",
        "output": "ABinv",
        "payload_size": 16,
        "content_name": "Tears of Steel",
        "store_location": "store/ABinv"
    }
}

Parameter | Properties | Default | Type | Description
input_file | required | | | Input playlist or manifest (M3U8: variant or content playlist, MPD: type static)
output | required | | | Output folder
content_name | required | | | content identifier for a storage
payload_size | optional | 16 | int | Payload size of the mark in bits (4/8/16/24/32/48)
store_location | optional | store | str | Folder where extraction helper information will be stored
profiler_license | optional | | str | Content-Armor profiler license
embedder_license | optional | | str | Content-Armor embedder license

content_armor:embed

ContentArmor’s embedding tool.

The input to the tool is a profiled video file (forensic mode).

You can read more about ContentArmor here: ContentArmor

Example:

{
  "tool": "content_armor:embed",
  "parameters": {
        "input_file": "tos_profiled.mp4",
        "output_file": "tos_cafe.mp4",
        "wm_id": "cafe",
        "license": {CA_EMB_LIC}
    }
}

Parameter | Properties | Default | Type | Description
input_file | required | | | Profiled video file. (Supported: mp4, ts, 264, 265, avc, hevc, mkv)
output_file | required | | | Marked video file
wm_id | optional | | | Watermark ID in hex format without '0x', e.g. 2a23
license | optional | | str | Content-Armor embedder license
verbose | optional | false | bool | More verbose
