Transcoding

FFmpeg

ffmpeg:cmd

The latest stable release of the FFmpeg command-line executable is available, with an exhaustive list of codec and library support. The FFmpeg libraries usually available are:
  • libavutil

  • libavcodec

  • libavformat

  • libavdevice

  • libavfilter

  • libavresample

  • libswscale

  • libswresample

  • libpostproc

    NOTE: castLabs is not responsible for paying any royalties that may be incurred by the use of codecs

Full documentation of the tool can be found at: https://ffmpeg.org/ffmpeg-all.html

Command-line arguments and their possible values must be presented as a list.

Arguments:

"-<arg1>", "-<val1>", "-<arg2>", "-<val2>", ...

NOTE: By default the tool is executed so that it stops when any decoding error occurs (-xerror). Such decoding errors can sometimes be safely ignored, but this must be explicitly enabled with the boolean "ignore_decoding_errors".

Example:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "arguments": [
      "-i",
      "tos_180s.mov",
      "-f",
      "mp4",
      "encode/tos.mp4"
    ],
    "ignore_decoding_errors": true
  }
},
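
A slightly fuller sketch in the same format (file names, codec choices, and bitrates are illustrative and not taken from this documentation), encoding H.264 video and AAC audio with standard ffmpeg options:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "arguments": [
      "-i",
      "tos_180s.mov",
      "-c:v",
      "libx264",
      "-b:v",
      "4M",
      "-c:a",
      "aac",
      "-b:a",
      "128k",
      "-movflags",
      "+faststart",
      "-f",
      "mp4",
      "encode/tos_h264.mp4"
    ]
  }
},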

Parameter | Properties | Default | Type
arguments | required [list value] | |
outputdir | optional | out | str
ignore_decoding_errors | optional | false | bool

ffmpeg:loudnorm_analyze

EBU R128 loudness normalization using ffmpeg’s loudnorm filter - analysis pass. Runs an analysis pass to gather information about the source audio.

Full documentation of the tool can be found here: ffmpeg - loudnorm

Optionally, the tool can run a second pass to apply the normalization to the audio. To do that, specify the “output” parameter with the desired ffmpeg arguments.

The data from the analysis pass will automatically be used to set the appropriate ffmpeg filter arguments.

Example:

{
    "parameters": {
        "input_file": "input.mp4",
        "audio_selector": "0:a:0",
        "env_prefix": "my_audio_",
        "integrated_loudness_target: "-24.0",
        "maximum_true_peak": "-2.0",
        "loudness_range_target": "7.0",
        "output": {
            "ffmpeg_args": [
                "-c:a", "libfdk_aac",
                "-ar", "48000",
                "-b:a", "128k",
                "-ac", "2",
                "out.m4a"
            ]
        }
    },
"tool": "ffmpeg:loudnorm_analyze"
}

Returns analysis result in environment variables using the specified prefix. The ffmpeg filter returns the following:

{
    "input_i" : "-27.61",
    "input_tp" : "-4.47",
    "input_lra" : "18.06",
    "input_thresh" : "-39.20",
    "output_i" : "-16.58",
    "output_tp" : "-1.50",
    "output_lra" : "14.78",
    "output_thresh" : "-27.71",
    "normalization_type" : "dynamic",
    "target_offset" : "0.58"
}

Given a prefix of “my_audio_” this tool sets the following environment variables:

my_audio_input_i: -27.61
my_audio_input_tp: -4.47
my_audio_input_lra: 18.06
my_audio_input_thresh: -39.20
my_audio_output_i: -16.58
my_audio_output_tp: -1.50
my_audio_output_lra: 14.78
my_audio_output_thresh: -27.71
my_audio_normalization_type: dynamic
my_audio_target_offset: 0.58
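
When the "output" parameter is given, the tool applies these measurements automatically in its second pass. Purely as an illustrative sketch (assuming the {variable} substitution shown for transcode:cropdetect below also applies to these environment variables), the measurements could instead be fed into a manual loudnorm second pass via ffmpeg:cmd:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "arguments": [
      "-i",
      "input.mp4",
      "-map",
      "0:a:0",
      "-af",
      "loudnorm=I=-24.0:TP=-2.0:LRA=7.0:measured_I={my_audio_input_i}:measured_TP={my_audio_input_tp}:measured_LRA={my_audio_input_lra}:measured_thresh={my_audio_input_thresh}:offset={my_audio_target_offset}:linear=true",
      "-ar",
      "48000",
      "-c:a",
      "libfdk_aac",
      "-b:a",
      "128k",
      "out.m4a"
    ]
  }
},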

Parameter | Properties | Default | Type | Description
input_file | required | | | input file to pass to ffmpeg
audio_selector | optional | 0:a:0 | str | ffmpeg's input selector to use to select the desired audio track
env_prefix | optional | my_audio_ | str | prefix to use for environment variable names
integrated_loudness_target | optional | -24.0 | str | integrated loudness target (I)
maximum_true_peak | optional | -2.0 | str | maximum true peak (TP)
loudness_range_target | optional | 7.0 | str | loudness range target (LRA)
output | optional | {} | dict | output settings

Dolby

dolby_encoding:dv5_preproc

Dolby Vision profile 5 pre-processor is a command-line tool that converts Dolby Vision mezzanine input to profile 5 raw video (YUV 420 10-bit) and RPU metadata.

Full documentation: Dolby Encoding Engine Docs

{
  "tool": "dolby_encoding:dv5_preproc",
  "parameters": {
        "decoder": "jpeg2000=kakadu_dlb:thread_num=4",
        "input_video": "media/video.mxf",
        "input_format": "jpeg2000_mxf",
        "temp_dir": "/tmp",
        "output_rpu": "/tmp/test.rpu"
        "output": "named_pipe://PIPE:buffer_size=256000:timeout=5000",
        "max_scene_frames": "96"
    }
}

Parameter | Properties | Default | Type | Choice | Description
input_format | required | | | | Input format followed by format-specific options.
input_video | required | | | | Input file or input directory, in case of list-based input.
output | required | | str | | Output YUV file (10-bit 420 planar).
output_rpu | required | | | | Output RPU file
concurrent_processing | optional | 1 | int | '1', '0', 1, 0 | Enable concurrent processing.
show_frames | optional | none | str | 'all', 'counters', 'decoder', 'demuxer', 'none', 'resizer', 'transformer' | Show more details about processed video frames.
input_metadata | optional | | | | Optional input metadata file. If not specified, the application attempts to extract metadata from the input.
metadata_offset | optional | 0 | int | | Offset added to each frame index, when accessing frame metadata from the source.
start | optional | 0 | int | | Start frame.
duration | optional | -1 | int | | Duration (in frames). Value -1 means "up to the last frame".
decoder | optional | | | | Decoder plugin and decoder options to be used. If not specified, the application attempts to select plugin automatically.
resize_options | optional | | | | Resize options. Syntax: "option1=value1:option2=value2".
max_scene_frames | optional | 255 | int | | Maximum number of frames processed as a single scene [1:255].
l11 | optional | auto | str | | Dolby Vision level 11 metadata.
l5 | optional | auto | str | | Dolby Vision level 5 metadata.
image_transformer | optional | | | | Name of image transformer plugin to be used followed by configuration parameters.
temp_dir | optional | | | | Directory to store temporary files.
keep_temp | optional | 0 | int | '1', '0', 1, 0 | Keep temporary files after execution.

dolby_encoding:ves_muxer

Dolby Vision VES muxer is a command-line tool that combines a base-layer elementary stream and, optionally, an enhancement-layer elementary stream with Dolby Vision RPU metadata.

Full documentation: Dolby Encoding Engine Docs

Example:

{
  "tool": "dolby_encoding:ves_muxer",
  "parameters": {
        "input_bl": "video_in.h265",
        "input_rpu": "test.rpu",
        "overwrite": "1",
        "output": "video_out.h265"
    }
}

Parameter | Properties | Description
input_bl | required | Input base-layer file.
output | required | Output bitstream file.
input_el | optional | Input enhancement-layer file.
input_rpu | optional | Input RPU file.

dolby_encoding:post_process

Dolby Vision post-processor is a tool that post-processes the muxed VES file produced by the VESMuxer tool.

Example:

{
  "tool": "dolby_encoding:post_process",
  "parameters": {
        "input": "video_in.h265",
        "dv_profile": "5",
        "overwrite": "1",
        "output": "video_out.h265"
    }
}

Parameter | Properties | Default | Type | Choice | Description
input | required | | | | Input VES muxed file.
dv_profile | required | | | | Dolby Vision bitstream profile ID. Values: 32.3
output | required | | | | Output bitstream file.
input_bl | optional | | | | Input base-layer file. This option is only available for Dolby Vision profile 7.
compression | optional | 0 | int | '1', '0', 1, 0 | RPU compression. 0 means "no compression", 1 means "compression by IDR". This option is only available for Dolby Vision profile 32.
update_l4 | optional | 1 | int | '1', '0', 1, 0 | Update Dolby Vision level 4 metadata.
update_l6 | optional | 1 | int | '1', '0', 1, 0 | Update Dolby Vision level 6 metadata.
max_cll | optional | auto | str | | Maximum content light level and maximum frame average light level in units of 1 nit.
l1_filtering | optional | auto | str | 'auto', 'none', 'sliding' | Filter type for metadata postprocessing. 'auto' means the filter is selected automatically based on the 'dv_profile' value.
l5 | optional | auto | str | | Target active area offset. Dolby Vision level 5 metadata represented by comma-separated integers, each representing one of the letterbox's margins in pixels.
l11 | optional | auto | str | | Dolby Vision level 11 metadata represented by comma-separated integers.
scale_metadata | optional | auto | str | | Metadata scaling, from source resolution to target resolution.
output_bl | optional | | | | Output post-processed base-layer file. This option is only available for Dolby Vision profile 7.

Metadata

vorbis:cover_art

Embeds one or more JPEG or PNG images into an OGG file.

This is achieved by placing one or more base64-encoded binary FLAC picture structures within VorbisComment, each with the tag name METADATA_BLOCK_PICTURE.

You can instruct the tool to remove existing METADATA_BLOCK_PICTURE comments or leave them alone.

IMAGES: list of dictionaries which need to have the following keys:

`input_img`     input image. Supported formats: PNG, JPEG
`img_type`      ID3v2 type. Can be one of:
    0 - Other
    1 - 32x32 pixels 'file icon' (PNG only)
    2 - Other file icon
    3 - Cover (front)
    4 - Cover (back)
    5 - Leaflet page
    6 - Media (e.g. label side of CD)
    7 - Lead artist/lead performer/soloist
    8 - Artist/performer
    9 - Conductor
    10 - Band/Orchestra
    11 - Composer
    12 - Lyricist/text writer
    13 - Recording Location
    14 - During recording
    15 - During performance
    16 - Movie/video screen capture
    17 - A bright coloured fish
    18 - Illustration
    19 - Band/artist logotype
    20 - Publisher/Studio logotype

`description`   Optional description of the image

Example:

{
  "tool": "vorbis:cover_art",
  "parameters": {
    "input_ogg": "in_file.ogg",
    "output_ogg": "out_file.ogg",
    "remove_existing_pictures": true,
    "images": [
        {
            "input_img": "front_cover.jpg",
            "img_type": 3,
            "description": "Front cover of the album"
        },
        {
            "input_img": "back_cover.jpg",
            "img_type": 4,
        }
    ]
  }
}

Parameter | Properties | Default | Type | Description
images | required [list value] | | dict | List of dictionaries representing keys, see IMAGES above
input_ogg | required | | str | input ogg file to process
output_ogg | required | | str | output ogg file to generate
remove_existing_pictures | optional | false | bool | removes existing METADATA_BLOCK_PICTURE comments

Subtitles

captions:ccextractor

We deploy a forked version of this tool (http://www.ccextractor.org):
https://github.com/encoreinteractive/ccextractor.git
Extracts closed captions and teletext subtitles from video streams.
(DVB, .TS, ReplayTV 4000 and 5000, dvr-ms, bttv, Tivo, Dish Network, .mp4, HDHomeRun are known to work).

Syntax: ccextractor [options] inputfile1 [inputfile2…] [-o outputfilename]

Command-line arguments must be presented as a list.

Parameters:

"<key>": "<value>"

Arguments:

"-<arg1>", "-<arg2>", ...

Example:

{
  "tool": "captions:ccextractor",
  "parameters": {
    "arguments": [
      "movie1.mp4",
      "-out=webvtt-full",
       "--webvtt-no-line",
       "--webvtt-no-css",
       "-o",
       "movie1-en.vtt"
    ]
  }
},

Parameter | Properties
arguments | required [list value]

captions:convert

Captions and Subtitles Converter

Example:

{
    "parameters": {
        "inputfile": "IN/tearsofsteel_4k_eng.srt",
        "language": "eng",
        "outputformat": "WEBVTT",
        "outputfile": "tearsofsteel-test.vtt"
    },
    "tool": "captions:convert"
},

Parameter | Properties | Default | Type | Choice | Description
inputfile | required | | | |
outputfile | required | | | |
outputformat | required | | | 'WEBVTT', 'SAMI', 'DFXP', 'SRT', 'SCC', 'SST' |
language | optional | | | | set language tag for output (auto detection if supported by format)
offset | optional | PT0S | timedelta | |
force_input_language | optional | | | | overwrite language tag for output if autodetection fails (if supported by format)
strip_html_tags | optional | false | bool | | strip HTML tags when converting.
remove_layout_info | optional | false | bool | | remove the layout information from subtitles. This ensures output subtitles will not have any layout info.
merge_captions | optional | false | bool | | merge captions with the same start time.
merge_captions_layout | optional | false | bool | | when merging captions, merge the layout as well. If true, the layout of the first caption will be used.
vtt_force_hours | optional | false | bool | | when outputting to WEBVTT, force writing full timestamps (hh:mm:ss.xxx) even if the "hh" part is 00
vtt_sequence_numbers | optional | false | bool | | include sequence numbers when writing WEBVTT format.
strip_ass_tags | optional | false | bool | | strip ASS tags when converting.
video_width | optional | | int | | width of the video the subtitles will be displayed on. If specified, height is required too.
video_height | optional | | int | | height of the video the subtitles will be displayed on. If specified, width is required too.
position | optional | bottom | str | 'bottom', 'top', 'source' | specifies the position of subtitles. Currently limited to SST and SRT output subtitle types. Currently source is only supported for SST.
avoid_same_start_prev_end | optional | true | bool | | avoid start time of a subtitle being the same as end time of previous one. Currently only supported when output format is SST
scenarist_compat | optional | false | bool | | activate compatibility mode when writing SST format. Should produce more compatible files, but can break older systems.
tiff_compression | optional | tiff_deflate | str | 'tiff_deflate', 'raw' | TIFF compression to use when output format is SST.
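
For instance, a hypothetical conversion to SST with explicit video dimensions and top positioning might look as follows (file names and values are illustrative only, not taken from this documentation):

{
    "parameters": {
        "inputfile": "IN/movie_eng.scc",
        "outputfile": "movie_eng.sst",
        "outputformat": "SST",
        "language": "eng",
        "video_width": 1920,
        "video_height": 1080,
        "position": "top",
        "tiff_compression": "tiff_deflate"
    },
    "tool": "captions:convert"
},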

Thumbnails

thumbs:generate

Generates thumbnails from a video file, either to include in packaging or for side-loading.

Example:

{
  "tool": "thumbs:generate",
  "parameters": {
    "inputfile": "tos_5s_video_1920x1080_2mbps.mp4"
  }
},

Parameter | Properties | Default | Type | Description
inputfile | required | | | Video file to extract thumbnails from.
duration | optional | 5 | int | Duration per thumbnail in seconds. 1 = 1 thumbnail per second, 10 = 1 thumbnail per 10 seconds.
outputdir | optional | thumbnails | str | Output directory
height | optional | 180 | int |
grid_width | optional | 8 | int |
grid_height | optional | 8 | int |
filename | optional | thumbs | str |
quality | optional | 60 | int |
skip_generating | optional | false | bool | Skip thumbnail generation. Assumes "outputdir" has thumbnails named using ffmpeg pattern %d.jpg
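
A fuller sketch with explicit grid and quality settings (the values below are illustrative, not recommended defaults):

{
  "tool": "thumbs:generate",
  "parameters": {
    "inputfile": "tos_5s_video_1920x1080_2mbps.mp4",
    "duration": 10,
    "height": 360,
    "grid_width": 5,
    "grid_height": 5,
    "outputdir": "thumbnails",
    "filename": "thumbs",
    "quality": 80
  }
},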

Transcode

transcode:cropdetect

Automatic video frame crop detection based on the ffmpeg command line.

Full documentation of the tool's crop detection can be found here: ffmpeg - cropdetect

This implementation covers accurate detection of black bars across entire movies, with predefined parameters. The output will be stored in the environment variable "cropdetect", which can be used for cropping instructions in subsequent ffmpeg commands.

Example:

{
  "parameters": {
    "inputfile": "movie.mov"
  },
  "tool": "transcode:cropdetect"
},

Returns the frame dimensions without black bars in the form width:height:columns_indent:rows_indent, for example:

crop=1920:1072:0:4

Example of cropping instructions using that output in a later ffmpeg command:

{
  "tool": "ffmpeg:cmd",
  "parameters": {
    "ignore_decoding_errors": "false",
    "arguments": [
      "-i",
      "{input}",
      "-filter_complex", ""[0:v]{cropdetect}[cropped];[cropped]scale='(trunc((ih*((16/9)/2))*2))':ih,setsar=1[cropped]"",
      "-map", "[cropped]
      "-f",
      "mp4",
      "{output}"
    ]
  }
},

Parameter | Properties
inputfile | required

Watermarking

content_armor:profile

ContentArmor’s Profiling tool.

The profiling tool prepares the content for embedding.

Input and output are single video files.

You can read more about ContentArmor here: ContentArmor

Example:

{
  "tool": "content_armor:profile",
  "parameters": {
        "input_file": "tos.mp4",
        "output": "operator_inv",
        "payload_size": 16,
        "content_name": "Tears of Steel",
        "store_location": "store/operator_inv",
        "license": {CA_PROF_LIC}
    }
}

Parameter | Properties | Default | Type | Description
input_file | required | | | Video file which conforms to ContentArmor specification, i.e. AVC/HEVC containing non-reference B frames (mp4, ts, 264, 265, avc, hevc, mkv). In ABR SEI aligned mode, the input file must be a DASH manifest or HLS playlist.
content_name | required | | | A name to be used to archive the processed asset
output_file | optional | profiled.mp4 | str | profiled file name. Ignored in ABR SEI aligned mode
output_folder | optional | profiled | str | output folder for profiled file. Only used in ABR SEI aligned mode.
export_work | optional | export_work.tar.gz | str | extraction helper file
license | optional | | str | Content-Armor profiler license
payload_size | optional | 16 | int | Payload size of the mark in bits (4/8/16/24/32/48)
embed_encrypt | optional | false | bool | Profile for embedding in the encrypted domain. Defaults to False
abr_sei_aligned | optional | false | bool | Activate ABR SEI aligned mode in Content Armor. Defaults to False

content_armor:ABIngestCL

ContentArmor’s ABIngest tool with modifications from castLabs.

These modifications include:

  • processing audio

  • creating proper output file names complying with the DASH-IF AB Ingest specification (i.e. suitable for Akamai).

  • support for additional DASH flavours

  • proper processing of the export file

  • creating playlists for distribution as well as helper playlists for VTK packaging jobs

The A/B Ingest tool consists of a profiler and an embedder.

The input to the tool is a video playlist (HLS/DASH, segmented or fragmented). The outputs are:

  • pre-watermarked video A

  • pre-watermarked video B

  • output playlist which may be directly served via CDN

  • watermark forensic metadata (WFM) corresponding to the input video. This file is used for online detection of watermarks.

You can read more about ContentArmor here: ContentArmor

Example:

{
  "tool": "content_armor:ABIngestCL",
  "parameters": {
        "input_file": "tos.m3u8",
        "output": "ABinv",
        "payload_size": 16,
        "content_name": "Tears of Steel",
        "store_location": "store/ABinv"
    }
}

Parameter | Properties | Default | Type | Description
input_file | required | | | Input playlist or manifest (M3U8: variant or content playlist, MPD: type static)
output | required | | | Output folder
content_name | required | | | content identifier for a storage
payload_size | optional | 16 | int | Payload size of the mark in bits (4/8/16/24/32/48)
store_location | optional | store | str | Folder where extraction helper information will be stored
profiler_license | optional | | str | Content-Armor profiler license
embedder_license | optional | | str | Content-Armor embedder license

content_armor:embed

ContentArmor’s embedding tool.

The input to the tool is a profiled video file (forensic mode).

You can read more about ContentArmor here: ContentArmor

Example:

{
  "tool": "content_armor:embed",
  "parameters": {
        "input_file": "tos_profiled.mp4",
        "output_file": "tos_cafe.mp4",
        "wm_id": "cafe",
        "license": {CA_EMB_LIC}
    }
}

Parameter | Properties | Default | Type | Description
input_file | required | | | Profiled video file. (Supported: mp4, ts, 264, 265, avc, hevc, mkv)
output_file | required | | | Marked video file
wm_id | optional | | | Watermark ID in hex format without '0x', e.g. 2a23
license | optional | | str | Content-Armor embedder license
verbose | optional | false | bool | More verbose
