Video Understanding
Introduction
The Video Understanding feature analyzes video content as part of a transcoding job: it can describe, categorize, moderate, or search a video, or answer a custom prompt about it. All heavy lifting, including video preprocessing and processing, is handled by TwelveLabs.
Using the Video Understanding feature
Video Understanding outputs are created by adding a 'video_understanding' output format to your transcoding job. To do this, launch the transcoding job with the /v1/start_encode2 method and set the output parameter to video_understanding.
You can set the name of the output file with the optional result_name parameter. If it is not specified, the file is named result.json by default.
General Structure:
{
"query": {
"encoder_version": "2",
"format": [
{
"output": "video_understanding",
"mode": "MODE_NAME"
}
],
"source": "VIDEO_URL"
}
}
“MODE_NAME” is the mode in which video processing is applied.
- On the Transcode Media page, choose Video Understanding as the output format.
- Fill in the File Name field if you want to specify the output file name. It is result.json by default.
- The Output Path value is used as a folder name.
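The general structure above can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_query is hypothetical, and placing result_name inside the format entry next to mode is an assumption, since the text does not say where that parameter belongs. The sketch only builds the request body; how it is submitted to /v1/start_encode2 is left out.

```python
import json
from typing import Any, Dict, Optional


def build_query(source: str, mode: str, result_name: Optional[str] = None,
                **mode_params: Any) -> Dict[str, Any]:
    """Build a request body with a single video_understanding output."""
    fmt: Dict[str, Any] = {"output": "video_understanding", "mode": mode}
    if result_name is not None:
        # Assumption: result_name lives in the format entry, next to "mode".
        fmt["result_name"] = result_name
    # Mode-specific parameters (for example prompt or categories) are merged as-is.
    fmt.update(mode_params)
    return {"query": {"encoder_version": "2", "format": [fmt], "source": source}}


# Placeholder source URL, for illustration only.
payload = build_query("https://example.com/video.mp4", "description",
                      result_name="summary.json")
print(json.dumps(payload, indent=2))
```

Leaving result_name out produces a format entry without that key, so the default file name described above applies.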

Available Modes
| Mode | Code | Description |
|---|---|---|
| Description Mode | 'description' | Generates a comprehensive description of video content |
| Categorization Mode | 'categorization' | Provides a list of categories based on video content |
| Content Moderation Mode | 'moderation' | Evaluates whether a video violates content guidelines and provides reasons if applicable |
| Custom Mode | 'custom' | Allows for results based on user-defined prompts |
| Search Mode | 'search' | Searches for specific queries within the video content |
Description Mode
To apply Description mode, set the mode parameter to 'description'.
Request Example:
{
"query": {
"encoder_version": "2",
"format": [
{
"output": "video_understanding",
"mode": "description"
}
]
}
}
Response Example:
{
"description": "The description of your video."
}
Description mode is selected by default on the Output form.

Categorization Mode
To apply Categorization mode, set the mode parameter to 'categorization'.
Optional parameter:
- categories — list of custom categories. If not provided, default categories are used.
Request Example:
{
"query": {
"encoder_version": "2",
"format": [
{
"output": "video_understanding",
"mode": "categorization",
"categories": [
"IT",
"cloud",
"education"
]
}
]
}
}
Response Example:
{
"categories": [
"Technology",
"Business",
"Education"
]
}
- Choose the Categorization mode with the Mode selector.
- Add custom categories with the “Add Category” button. If you leave this list empty, default categories are used.

Content Moderation Mode
To apply Content Moderation mode, set the mode parameter to 'moderation'.
Optional parameter:
- violation_reasons — list of custom violation reasons. If not provided, default reasons are used.
Request Example:
{
"query": {
"encoder_version": "2",
"format": [
{
"output": "video_understanding",
"mode": "moderation",
"violation_reasons": [
"Fight",
"violence",
"weapons"
]
}
]
}
}
Response Example:
{
"violates": true,
"reasons": [
"Sexual Content",
"Violence"
]
}
- Choose the Moderation mode with the Mode selector.
- Add custom violation reasons with the “Add Violation Reason” button. If you leave this list empty, default violation reasons are used.
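Once the output file is available, the moderation result can be read into a small typed structure. This is a sketch based only on the response example above. The names ModerationResult and parse_moderation are hypothetical, and treating reasons as optional when no violation is found is an assumption, not documented behavior.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ModerationResult:
    violates: bool
    reasons: List[str] = field(default_factory=list)


def parse_moderation(text: str) -> ModerationResult:
    """Parse the JSON text of a Content Moderation output file."""
    data = json.loads(text)
    # Assumption: "reasons" may be missing or empty when nothing is flagged.
    return ModerationResult(bool(data["violates"]), list(data.get("reasons") or []))


# Sample taken from the response example above.
sample = '{"violates": true, "reasons": ["Sexual Content", "Violence"]}'
result = parse_moderation(sample)
if result.violates:
    print("Flagged:", ", ".join(result.reasons))
```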

Custom Mode
To apply Custom mode, set the mode parameter to 'custom'.
Required parameter:
- prompt — user-defined instruction for the model.
Request Example:
{
"query": {
"encoder_version": "2",
"format": [
{
"output": "video_understanding",
"mode": "custom",
"prompt": "Extract the license plate number on the car."
}
]
}
}
Response Example:
{
"result": "The license plate number on the red car is AM-84865."
}
- Choose the Custom mode with the Mode selector.
- Enter the prompt in the Prompt field. This field is required.
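Because prompt is required in Custom mode, it can help to reject an empty prompt before the job is submitted. The helper below is a hypothetical client-side check, not part of the API. Whether the service itself rejects blank prompts is not stated in the text.

```python
from typing import Dict


def custom_format(prompt: str) -> Dict[str, str]:
    """Return a Custom mode format entry, refusing an empty or blank prompt."""
    if not prompt or not prompt.strip():
        raise ValueError("Custom mode requires a non-empty prompt")
    return {"output": "video_understanding", "mode": "custom", "prompt": prompt}


entry = custom_format("Extract the license plate number on the car.")
print(entry["mode"], "->", entry["prompt"])
```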

Search Mode
To apply Search mode, set the mode parameter to 'search'.
At least one of the following parameters must be provided:
- prompt — text query.
- media_prompt — direct URL to PNG or JPEG image.
Optional parameters:
- search_rank_threshold - limits the number of returned results.
- search_options (any combination of "visual", "audio", "transcription") - the parts of the video to search in.
Request Example:
{
"query": {
"encoder_version": "2",
"format": [
{
"output": "video_understanding",
"mode": "search",
"prompt": "Man talking",
"search_rank_threshold": 3,
"search_options": [
"visual",
"audio"
]
}
]
}
}
Response Example:
{
"result": [
{
"start": 3,
"end": 9,
"rank": 1
},
{
"start": 12,
"end": 19,
"rank": 2
},
{
"start": 24,
"end": 30,
"rank": 3
}
]
}
- Choose the Search mode with the Mode selector.
- Enter the prompt in the Prompt field or provide a link in the Media Prompt field. At least one of these fields must be filled in.
- Choose the search options. If none are chosen, all search options are used.
- Specify the rank threshold. If it is not specified, all results are written to the output file.
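The Search mode rules above (at least one of prompt or media_prompt, and search_options drawn from three values) can be checked before a job is submitted, and the returned segments can be ordered by rank. The sketch below uses hypothetical helper names, search_format and segments_by_rank. The start and end values are passed through unchanged because the text does not state their unit.

```python
from typing import Dict, Iterable, List, Optional, Tuple

ALLOWED_SEARCH_OPTIONS = {"visual", "audio", "transcription"}


def search_format(prompt: Optional[str] = None,
                  media_prompt: Optional[str] = None,
                  search_options: Optional[Iterable[str]] = None,
                  search_rank_threshold: Optional[int] = None) -> Dict[str, object]:
    """Build a Search mode format entry, enforcing the documented rules."""
    if not prompt and not media_prompt:
        raise ValueError("Search mode needs prompt, media_prompt, or both")
    entry: Dict[str, object] = {"output": "video_understanding", "mode": "search"}
    if prompt:
        entry["prompt"] = prompt
    if media_prompt:
        entry["media_prompt"] = media_prompt
    if search_options is not None:
        options = list(search_options)
        unknown = set(options) - ALLOWED_SEARCH_OPTIONS
        if unknown:
            raise ValueError(f"Unknown search options: {sorted(unknown)}")
        entry["search_options"] = options
    if search_rank_threshold is not None:
        entry["search_rank_threshold"] = search_rank_threshold
    return entry


def segments_by_rank(response: dict) -> List[Tuple[int, int]]:
    """Return (start, end) pairs ordered by ascending rank."""
    ordered = sorted(response["result"], key=lambda seg: seg["rank"])
    return [(seg["start"], seg["end"]) for seg in ordered]


entry = search_format(prompt="Man talking", search_options=["visual", "audio"],
                      search_rank_threshold=3)
response = {"result": [{"start": 12, "end": 19, "rank": 2},
                       {"start": 3, "end": 9, "rank": 1}]}
print(segments_by_rank(response))  # [(3, 9), (12, 19)]
```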

Requirements for Video Upload
Before using the Video Understanding feature, ensure your video meets the following requirements based on the mode you choose:
Description, Categorization, Content Moderation and Custom modes:
- Format: Must be a valid FFmpeg-supported format.
- Size: Less than 2 GB.
- Duration: Between 10 and 7200 seconds.
- Resolution: Between 360x360 and 5184x2160.
- Aspect Ratio: Between 1:1 and 2.4:1.
Search mode:
- Format: Must be a valid FFmpeg-supported format.
- Size: Less than 4 GB.
- Duration: Between 4 and 14400 seconds.
- Resolution: Between 360x360 and 5184x2160.
- Aspect Ratio: Between 1:1 and 2.4:1.
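The numeric limits above can be checked locally before a video is submitted. The following sketch rests on several assumptions that the text does not confirm: 1 GB is taken as 1024^3 bytes, the duration and resolution bounds are treated as inclusive, the resolution range is applied to width and height separately, and the aspect ratio is read as width:height. The names Limits and check_video are hypothetical, and the format requirement (FFmpeg-supported) is not checked here.

```python
from dataclasses import dataclass
from typing import List

GB = 1024 ** 3  # Assumption: binary gigabytes; the text does not say which.


@dataclass(frozen=True)
class Limits:
    max_bytes: int      # exclusive upper bound ("less than")
    min_seconds: float  # inclusive (assumption)
    max_seconds: float  # inclusive (assumption)


SEARCH_LIMITS = Limits(4 * GB, 4, 14400)
DEFAULT_LIMITS = Limits(2 * GB, 10, 7200)  # description, categorization, moderation, custom


def check_video(mode: str, size_bytes: int, duration_s: float,
                width: int, height: int) -> List[str]:
    """Return the names of the requirements the video fails (empty if none)."""
    limits = SEARCH_LIMITS if mode == "search" else DEFAULT_LIMITS
    problems: List[str] = []
    if size_bytes >= limits.max_bytes:
        problems.append("size")
    if not limits.min_seconds <= duration_s <= limits.max_seconds:
        problems.append("duration")
    if not (360 <= width <= 5184 and 360 <= height <= 2160):
        problems.append("resolution")
    # 1:1 <= width:height <= 2.4:1, compared with integers to avoid float rounding.
    if not (height <= width and 5 * width <= 12 * height):
        problems.append("aspect ratio")
    return problems


print(check_video("description", 3 * GB, 5, 1920, 1080))  # ['size', 'duration']
print(check_video("search", 3 * GB, 5, 1920, 1080))       # []
```

The same 3 GB, 5-second file fails in Description mode but passes in Search mode, which reflects the looser size and duration limits listed for Search.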