Transcoding¶
FFmpeg¶
ffmpeg:cmd¶
The latest stable release of the FFmpeg command line executable is available, with an exhaustive list of codec and library support. The usually available libav dependencies are:
libavutil
libavcodec
libavformat
libavdevice
libavfilter
libavresample
libswscale
libswresample
libpostproc
NOTE: castLabs is not responsible for paying royalties that may be incurred for the use of codecs.
Full documentation of the tool can be found at: https://ffmpeg.org/ffmpeg-all.html
Commandline arguments and possible values must be presented as a list.
Arguments:
"-<arg1>", "-<val1>", "-<arg2>", "-<val2>", ...
NOTE: By default the tool is executed so that it stops if any decoding error occurs (-xerror). Such decoding errors can sometimes be ignored without problems, but this must be explicitly enabled with the boolean "ignore_decoding_errors".
Example:
{
"tool": "ffmpeg:cmd",
"parameters": {
"arguments": [
"-i",
"tos_180s.mov",
"-f",
"mp4",
"encode/tos.mp4"
],
"ignore_decoding_errors": true
}
},
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
arguments | required [list value] | | | |
outputdir | optional | out | str | |
ignore_decoding_errors | optional | false | bool | |
ffmpeg:loudnorm_analyze¶
EBU R128 loudness normalization using ffmpeg’s loudnorm filter - analysis pass.
Runs an analysis pass to gather information about the source audio.
Full documentation of the tool can be found here: ffmpeg - loudnorm
Optionally, the tool can run a second pass to apply the normalization to the audio. To do that, specify the “output” parameter with the desired ffmpeg arguments.
The data from the analysis pass will automatically be used to set the appropriate ffmpeg filter arguments.
Example:
{
"parameters": {
"input_file": "input.mp4",
"audio_selector": "0:a:0",
"env_prefix": "my_audio_",
"integrated_loudness_target: "-24.0",
"maximum_true_peak": "-2.0",
"loudness_range_target": "7.0",
"output": {
"ffmpeg_args": [
"-c:a", "libfdk_aac",
"-ar", "48000",
"-b:a", "128k",
"-ac", "2",
"out.m4a"
]
}
},
"tool": "ffmpeg:loudnorm_analyze"
}
Returns analysis result in environment variables using the specified prefix. The ffmpeg filter returns the following:
{
"input_i" : "-27.61",
"input_tp" : "-4.47",
"input_lra" : "18.06",
"input_thresh" : "-39.20",
"output_i" : "-16.58",
"output_tp" : "-1.50",
"output_lra" : "14.78",
"output_thresh" : "-27.71",
"normalization_type" : "dynamic",
"target_offset" : "0.58"
}
Given a prefix of “my_audio_” this tool sets the following environment variables:
my_audio_input_i: -27.61
my_audio_input_tp: -4.47
my_audio_input_lra: 18.06
my_audio_input_thresh: -39.20
my_audio_output_i: -16.58
my_audio_output_tp: -1.50
my_audio_output_lra: 14.78
my_audio_output_thresh: -27.71
my_audio_normalization_type: dynamic
my_audio_target_offset: 0.58
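If you prefer not to use the built-in "output" second pass, these variables can be fed into ffmpeg’s loudnorm filter in a later ffmpeg:cmd step via its measured_* options. The sketch below is illustrative only: it assumes environment variables can be referenced with the same {name} substitution syntax used in the transcode:cropdetect example further down, and it reuses the encoder settings from the example above.
{
"tool": "ffmpeg:cmd",
"parameters": {
"arguments": [
"-i", "input.mp4",
"-af", "loudnorm=I=-24.0:TP=-2.0:LRA=7.0:measured_I={my_audio_input_i}:measured_TP={my_audio_input_tp}:measured_LRA={my_audio_input_lra}:measured_thresh={my_audio_input_thresh}:offset={my_audio_target_offset}:linear=true",
"-c:a", "libfdk_aac",
"-ar", "48000",
"-b:a", "128k",
"out.m4a"
]
}
},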
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input_file | required | | | | input file to pass to ffmpeg
audio_selector | optional | 0:a:0 | str | | ffmpeg’s input selector to use to select the desired audio track
env_prefix | optional | my_audio_ | str | | prefix to use for environment variable names
integrated_loudness_target | optional | -24.0 | str | | integrated loudness target (I)
maximum_true_peak | optional | -2.0 | str | | maximum true peak (TP)
loudness_range_target | optional | 7.0 | str | | loudness range target (LRA)
output | optional | {} | dict | | output settings
Dolby¶
dolby_encoding:dv5_preproc¶
Dolby Vision profile 5 pre-processor is a command-line tool that converts Dolby Vision mezzanine input to profile 5 raw video (YUV 420 10-bit) and RPU metadata.
Full documentation: Dolby Encoding Engine Docs
{
"tool": "dolby_encoding:dv5_preproc",
"parameters": {
"decoder": "jpeg2000=kakadu_dlb:thread_num=4",
"input_video": "media/video.mxf",
"input_format": "jpeg2000_mxf",
"temp_dir": "/tmp",
"output_rpu": "/tmp/test.rpu"
"output": "named_pipe://PIPE:buffer_size=256000:timeout=5000",
"max_scene_frames": "96"
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input_format | required | | | | Input format followed by format-specific options.
input_video | required | | | | Input file or input directory, in case of list-based input.
output | required | | str | | Output YUV file (10-bit 420 planar).
output_rpu | required | | | | Output RPU file.
concurrent_processing | optional | 1 | int | ‘1’, ‘0’, 1, 0 | Enable concurrent processing.
show_frames | optional | none | str | ‘all’, ‘counters’, ‘decoder’, ‘demuxer’, ‘none’, ‘resizer’, ‘transformer’ | Show more details about processed video frames.
input_metadata | optional | | | | Optional input metadata file. If not specified, the application attempts to extract metadata from the input.
metadata_offset | optional | 0 | int | | Offset added to each frame index when accessing frame metadata from the source.
start | optional | 0 | int | | Start frame.
duration | optional | -1 | int | | Duration (in frames). Value -1 means “up to the last frame”.
decoder | optional | | | | Decoder plugin and decoder options to be used. If not specified, the application attempts to select a plugin automatically.
resize_options | optional | | | | Resize options. Syntax: “option1=value1:option2=value2”.
max_scene_frames | optional | 255 | int | | Maximum number of frames processed as a single scene [1:255].
l11 | optional | auto | str | | Dolby Vision level 11 metadata.
l5 | optional | auto | str | | Dolby Vision level 5 metadata.
image_transformer | optional | | | | Name of the image transformer plugin to be used, followed by configuration parameters.
temp_dir | optional | | | | Directory to store temporary files.
keep_temp | optional | 0 | int | ‘1’, ‘0’, 1, 0 | Keep temporary files after execution.
dolby_encoding:ves_muxer¶
Dolby Vision VES muxer is a command-line tool that combines a base-layer elementary stream and, optionally, an enhancement-layer elementary stream with Dolby Vision RPU metadata.
Full documentation: Dolby Encoding Engine Docs
Example:
{
"tool": "dolby_encoding:ves_muxer",
"parameters": {
"input_bl": "video_in.h265",
"input_rpu": "test.rpu",
"overwrite": "1",
"output": "video_out.h265"
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input_bl | required | | | | Input base-layer file.
output | required | | | | Output bitstream file.
input_el | optional | | | | Input enhancement-layer file.
input_rpu | optional | | | | Input RPU file.
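For dual-layer content, the enhancement layer can be supplied via input_el. The following sketch is illustrative only: it uses the parameters listed above with hypothetical file names.
{
"tool": "dolby_encoding:ves_muxer",
"parameters": {
"input_bl": "video_bl.h265",
"input_el": "video_el.h265",
"input_rpu": "test.rpu",
"output": "video_out.h265"
}
}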
dolby_encoding:post_process¶
Dolby Vision post-processor is a tool that post-processes the muxed VES file produced by the VESMuxer tool.
Example:
{
"tool": "dolby_encoding:post_process",
"parameters": {
"input": "video_in.h265",
"dv_profile": "5",
"overwrite": "1",
"output": "video_out.h265"
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input | required | | | | Input VES muxed file.
dv_profile | required | | | | Dolby Vision bitstream profile ID. Values: 32.3
output | required | | | | Output bitstream file.
input_bl | optional | | | | Input base-layer file. This option is only available for Dolby Vision profile 7.
compression | optional | 0 | int | ‘1’, ‘0’, 1, 0 | RPU compression. Values: 0 means “no compression”, 1 means “compression by IDR”. This option is only available for Dolby Vision profile 32.
update_l4 | optional | 1 | int | ‘1’, ‘0’, 1, 0 | Update Dolby Vision level 4 metadata.
update_l6 | optional | 1 | int | ‘1’, ‘0’, 1, 0 | Update Dolby Vision level 6 metadata.
max_cll | optional | auto | str | | Maximum content light level and maximum frame average light level in units of 1 nit.
l1_filtering | optional | auto | str | ‘auto’, ‘none’, ‘sliding’ | Filter type for metadata postprocessing. ‘auto’ means the filter is selected automatically based on the ‘dv-profile’ value.
l5 | optional | auto | str | | Target active area offset. Dolby Vision level 5 metadata represented by comma-separated integers; each integer represents one of the letterbox’s margins in pixels. Syntax:
l11 | optional | auto | str | | Dolby Vision level 11 metadata represented by comma-separated integers. Syntax:
scale_metadata | optional | auto | str | | Metadata scaling, from source resolution to target resolution.
output_bl | optional | | | | Output post-processed base-layer file. This option is only available for Dolby Vision profile 7.
Metadata¶
vorbis:cover_art¶
Embeds one or more JPEG or PNG images into an OGG file.
This is achieved by placing one or more base64-encoded binary FLAC picture structures within VorbisComment, with the tag name METADATA_BLOCK_PICTURE.
You can instruct the tool to remove existing METADATA_BLOCK_PICTURE comments or leave them alone.
IMAGES: list of dictionaries which need to have the following keys:
`input_img` input image. Supported formats: PNG, JPEG
`img_type` ID3v2 type. Can be one of:
0 - Other
1 - 32x32 pixels 'file icon' (PNG only)
2 - Other file icon
3 - Cover (front)
4 - Cover (back)
5 - Leaflet page
6 - Media (e.g. label side of CD)
7 - Lead artist/lead performer/soloist
8 - Artist/performer
9 - Conductor
10 - Band/Orchestra
11 - Composer
12 - Lyricist/text writer
13 - Recording Location
14 - During recording
15 - During performance
16 - Movie/video screen capture
17 - A bright coloured fish
18 - Illustration
19 - Band/artist logotype
20 - Publisher/Studio logotype
`description` Optional description of the image
Example:
{
"tool": "vorbis:cover_art",
"parameters": {
"input_ogg": "in_file.ogg",
"output_ogg": "out_file.ogg",
"remove_existing_pictures": true,
"images": [
{
"input_img": "front_cover.jpg",
"img_type": 3,
"description": "Front cover of the album"
},
{
"input_img": "back_cover.jpg",
"img_type": 4,
}
]
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
images | required [list value] | | dict | | List of dictionaries with the keys described in IMAGES above
input_ogg | required | | str | | input ogg file to process
output_ogg | required | | str | | output ogg file to generate
remove_existing_pictures | optional | false | bool | | removes existing METADATA_BLOCK_PICTURE comments
Subtitles¶
captions:ccextractor¶
We deploy a forked version of this tool (http://www.ccextractor.org):
https://github.com/encoreinteractive/ccextractor.git
It extracts closed captions and teletext subtitles from video streams
(DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo, Dish Network, .mp4, and HDHomeRun are known to work).
Syntax: ccextractor [options] inputfile1 [inputfile2…] [-o outputfilename]
Commandline arguments must be presented as a list.
Parameters:
"<key>": "<value>"
Arguments:
"-<arg1>", "-<arg2>", ...
Example:
{
"tool": "captions:ccextractor",
"parameters": {
"arguments": [
"movie1.mp4",
"-out=webvtt-full",
"--webvtt-no-line",
"--webvtt-no-css",
"-o",
"movie1-en.vtt"
]
}
},
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
arguments | required [list value] | | | |
captions:convert¶
Captions and Subtitles Converter
Example:
{
"parameters": {
"inputfile": "IN/tearsofsteel_4k_eng.srt",
"language": "eng",
"outputformat": "WEBVTT",
"outputfile": "tearsofsteel-test.vtt"
},
"tool": "captions:convert"
},
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
inputfile | required | | | |
outputfile | required | | | |
outputformat | required | | | ‘WEBVTT’, ‘SAMI’, ‘DFXP’, ‘SRT’, ‘SCC’, ‘SST’ |
language | optional | | | | set language tag for output (auto detection if supported by format)
offset | optional | PT0S | timedelta | |
force_input_language | optional | | | | overwrite language tag for output if autodetection fails (if supported by format)
strip_html_tags | optional | false | bool | | strip HTML tags when converting.
remove_layout_info | optional | false | bool | | remove the layout information from subtitles. This ensures output subtitles will not have any layout info.
merge_captions | optional | false | bool | | merge captions with the same start time.
merge_captions_layout | optional | false | bool | | when merging captions, merge the layout as well. If true, the layout of the first caption will be used.
vtt_force_hours | optional | false | bool | | when outputting to WEBVTT, force writing full timestamps (hh:mm:ss.xxx) even if the “hh” part is 00
vtt_sequence_numbers | optional | false | bool | | include sequence numbers when writing WEBVTT format.
strip_ass_tags | optional | false | bool | | strip ASS tags when converting.
video_width | optional | | int | | width of the video the subtitles will be displayed on. If specified, height is required too.
video_height | optional | | int | | height of the video the subtitles will be displayed on. If specified, width is required too.
position | optional | bottom | str | ‘bottom’, ‘top’, ‘source’ | specifies the position of subtitles. Currently limited to SST and SRT output subtitle types.
avoid_same_start_prev_end | optional | true | bool | | avoid start time of a subtitle being the same as end time of the previous one. Currently only supported when output format is SST
scenarist_compat | optional | false | bool | | activate compatibility mode when writing SST format. Should produce more compatible files, but can break older systems.
tiff_compression | optional | tiff_deflate | str | ‘tiff_deflate’, ‘raw’ | TIFF compression to use when output format is SST.
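As a further illustration of the options above, the following hypothetical job shifts the captions and strips HTML tags and layout information while converting to DFXP. The file names and values are placeholders, and the offset is assumed to use the same ISO 8601 duration notation as the PT0S default.
{
"parameters": {
"inputfile": "IN/tearsofsteel_4k_eng.srt",
"language": "eng",
"outputformat": "DFXP",
"outputfile": "tearsofsteel-test.dfxp",
"offset": "PT2S",
"strip_html_tags": true,
"remove_layout_info": true
},
"tool": "captions:convert"
},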
Thumbnails¶
thumbs:generate¶
Generates thumbnails from a video file, either to include in packaging or for side loading.
Example:
{
"tool": "thumbs:generate",
"parameters": {
"inputfile": "tos_5s_video_1920x1080_2mbps.mp4"
}
},
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
inputfile | required | | | | Video file to extract thumbnails from.
duration | optional | 5 | int | | Duration per thumbnail in seconds. 1 = 1 thumb per second, 10 = 1 thumb per 10 seconds
outputdir | optional | thumbnails | str | | Output directory
height | optional | 180 | int | |
grid_width | optional | 8 | int | |
grid_height | optional | 8 | int | |
filename | optional | thumbs | str | |
quality | optional | 60 | int | |
skip_generating | optional | false | bool | | Skip thumbnail generation. Assumes “outputdir” has thumbnails named using ffmpeg pattern %d.jpg
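A more explicit variant of the example above, spelling out the optional grid and quality parameters from the table; the values shown are illustrative only.
{
"tool": "thumbs:generate",
"parameters": {
"inputfile": "tos_5s_video_1920x1080_2mbps.mp4",
"duration": 10,
"height": 180,
"grid_width": 8,
"grid_height": 8,
"outputdir": "thumbnails",
"filename": "thumbs",
"quality": 60
}
},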
Transcode¶
transcode:cropdetect¶
Automatic video frame cropping detection based on the ffmpeg command line.
Full documentation of the tool's crop detection can be found here: ffmpeg - cropdetect
This implementation covers accurate detection of black bars across entire movies, with predefined parameters. The output will be stored in the environment variable “cropdetect”, which can be used for cropping instructions in subsequent ffmpeg commands.
Example:
{
"parameters": {
"inputfile": "movie.mov"
},
"tool": "transcode:cropdetect"
},
Returns frame dimensions without black bars, width:height:columns_indent:rows_indent, for example:
crop=1920:1072:0:4
Example of cropping instructions using that output in a later ffmpeg command:
{
"tool": "ffmpeg:cmd",
"parameters": {
"ignore_decoding_errors": "false",
"arguments": [
"-i",
"{input}",
"-filter_complex", ""[0:v]{cropdetect}[cropped];[cropped]scale='(trunc((ih*((16/9)/2))*2))':ih,setsar=1[cropped]"",
"-map", "[cropped]
"-f",
"mp4",
"{output}"
]
}
},
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
inputfile | required | | | |
Watermarking¶
content_armor:profile¶
ContentArmor’s Profiling tool.
The profiling tool prepares the content for embedding. input and output are single video files.
You can read more about ContentArmor here: ContentArmor
Example:
{
"tool": "content_armor:profile",
"parameters": {
"input_file": "tos.mp4",
"output": "operator_inv",
"payload_size": 16,
"content_name": "Tears of Steel",
"store_location": "store/operator_inv",
"license": {CA_PROF_LIC}
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input_file | required | | | | Video file which conforms to ContentArmor specification, i.e. AVC/HEVC containing non-reference B frames (mp4, ts, 264, 265, avc, hevc, mkv). In ABR SEI aligned mode, the input file must be a DASH manifest or HLS playlist.
content_name | required | | | | A name to be used to archive the processed asset
output_file | optional | profiled.mp4 | str | | profiled file name. Ignored in ABR SEI aligned mode
output_folder | optional | profiled | str | | output folder for profiled file. Only used in ABR SEI aligned mode.
export_work | optional | export_work.tar.gz | str | | extraction helper file
license | optional | | str | | Content-Armor profiler license
payload_size | optional | 16 | int | | Payload size of the mark in bits (4/8/16/24/32/48)
embed_encrypt | optional | false | bool | | Profile for embedding in the encrypted domain. Defaults to False
abr_sei_aligned | optional | false | bool | | Activate ABR SEI aligned mode in Content Armor. Defaults to False
content_armor:ABIngestCL¶
ContentArmor’s ABIngest tool with modifications from castLabs.
These modifications include:
processing audio
creating proper output file names complying with the DASH-IF AB Ingest specification (i.e. suitable for Akamai)
support for additional DASH flavours
proper processing of the export file
creating playlists for distribution as well as helper playlists for VTK packaging jobs
The A/B Ingest tool consists of a profiler and an embedder.
The input to the tool is a video playlist (HLS/DASH, segmented or fragmented). The output consists of:
pre-watermarked video A
pre-watermarked video B
an output playlist which may be directly served via CDN
watermark forensic metadata (WFM) corresponding to the input video. This file is used for online detection of watermarks.
You can read more about ContentArmor here: ContentArmor
Example:
{
"tool": "content_armor:ABIngestCL",
"parameters": {
"input_file": "tos.m3u8",
"output": "ABinv",
"payload_size": 16,
"content_name": "Tears of Steel",
"store_location": "store/ABinv"
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input_file | required | | | | Input playlist or manifest (M3U8: variant or content playlist, MPD: type static)
output | required | | | | Output folder
content_name | required | | | | content identifier for storage
payload_size | optional | 16 | int | | Payload size of the mark in bits (4/8/16/24/32/48)
store_location | optional | store | str | | Folder where extraction helper information will be stored
profiler_license | optional | | str | | Content-Armor profiler license
embedder_license | optional | | str | | Content-Armor embedder license
content_armor:embed¶
ContentArmor’s embedding tool.
The input to the tool is a profiled video file (forensic mode).
You can read more about ContentArmor here: ContentArmor
Example:
{
"tool": "content_armor:embed",
"parameters": {
"input_file": "tos_profiled.mp4",
"output_file": "tos_cafe.mp4",
"wm_id": cafe,
"license": {CA_EMB_LIC}
}
}
Parameter | Properties | Default | Type | Choice | Description
---|---|---|---|---|---
input_file | required | | | | Profiled video file. (Supported: mp4, ts, 264, 265, avc, hevc, mkv)
output_file | required | | | | Marked video file
wm_id | optional | | | | Watermark ID in hex format without ‘0x’, e.g. 2a23
license | optional | | str | | Content-Armor embedder license
verbose | optional | false | bool | | More verbose