Video Understanding

Introduction

The Video Understanding feature allows users to leverage advanced tools for analyzing video content. All heavy lifting, including video preprocessing and processing, is handled by TwelveLabs.

Using the Video Understanding Feature

Video Understanding outputs are created by adding a 'video_understanding' output format to your transcoding job. To do this, use the /v1/start_encode2 method to launch a transcoding job with the output parameter set to video_understanding.

You can set the name of the output file with the optional result_name parameter. If it is not specified, the file is named result.json by default.

General Structure:

{
 "query": {
 "encoder_version": "2",
 "format": [
 {
 "output": "video_understanding",
 "mode": "MODE_NAME"
 }
 ],
 "source": "VIDEO_URL"
 }
}

“MODE_NAME” - the code of the mode to apply to the video (see Available Modes).
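The general structure above can be assembled programmatically. The following is a minimal Python sketch; the helper name build_video_understanding_query and the example.com URL are illustrative assumptions, not part of the API, and sending the body to /v1/start_encode2 (authentication, host) is not shown.

```python
import json


def build_video_understanding_query(source_url, mode, **mode_params):
    """Build a request body that follows the General Structure above.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    format_entry = {"output": "video_understanding", "mode": mode}
    # Mode-specific parameters (for example "prompt") go into the format entry.
    format_entry.update(mode_params)
    return {
        "query": {
            "encoder_version": "2",
            "format": [format_entry],
            "source": source_url,
        }
    }


# Placeholder URL, used only to show the resulting JSON.
payload = build_video_understanding_query(
    "https://example.com/video.mp4", "description"
)
print(json.dumps(payload, indent=2))
```

Extra keyword arguments, such as prompt for Custom mode, are copied into the format entry next to mode.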

  1. On the Transcode Media page, choose Video Understanding as the output format.
  2. Fill in the File Name field if you want to specify the output file name. The default is result.json.
  3. The Output Path value is used as the folder name.

Available Modes

Mode | Code | Description
Description Mode | 'description' | Generates a comprehensive description of video content
Categorization Mode | 'categorization' | Provides a list of categories based on video content
Content Moderation Mode | 'moderation' | Evaluates whether a video violates content guidelines and provides reasons if applicable
Custom Mode | 'custom' | Allows for results based on user-defined prompts
Search Mode | 'search' | Searches for specific queries within the video content
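A client may want to reject an unknown mode code before submitting a job. The sketch below mirrors the mode codes listed above; the constant and function names are hypothetical, and the service may perform its own validation.

```python
# Mode codes copied from the Available Modes list above.
VIDEO_UNDERSTANDING_MODES = {
    "description": "Generates a comprehensive description of video content",
    "categorization": "Provides a list of categories based on video content",
    "moderation": "Evaluates whether a video violates content guidelines",
    "custom": "Allows for results based on user-defined prompts",
    "search": "Searches for specific queries within the video content",
}


def check_mode(mode):
    """Return the mode code unchanged, or raise ValueError if it is unknown."""
    if mode not in VIDEO_UNDERSTANDING_MODES:
        known = ", ".join(sorted(VIDEO_UNDERSTANDING_MODES))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {known}")
    return mode
```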

Description Mode

To apply Description mode, set the mode parameter to 'description'.

Request Example:

{
  "query": {
    "encoder_version": "2",
    "format": [
      {
        "output": "video_understanding",
        "mode": "description"
      }
    ]
  }
}

Response Example:

{
  "description": "The description of your video."
}

Description mode is selected by default on the Output form.

Categorization Mode

To apply Categorization mode, set the mode parameter to 'categorization'.

Optional parameter:

  • categories — list of custom categories. If not provided, default categories are used.

Request Example:

{
  "query": {
    "encoder_version": "2",
    "format": [
      {
        "output": "video_understanding",
        "mode": "categorization",
        "categories": [
          "IT",
          "cloud",
          "education"
        ]
      }
    ]
  }
}

Response Example:

{
  "categories": [
    "Technology",
    "Business",
    "Education"
  ]
}
  1. Choose the Categorization mode in the Mode selector.
  2. Add custom categories with the “Add Category” button. If you leave this list empty, default categories are used.

Content Moderation Mode

To apply Content Moderation mode, set the mode parameter to 'moderation'.

Optional parameter:

  • violation_reasons — list of custom violation reasons. If not provided, default reasons are used.

Request Example:

{
  "query": {
    "encoder_version": "2",
    "format": [
      {
        "output": "video_understanding",
        "mode": "moderation",
        "violation_reasons": [
          "Fight",
          "violence",
          "weapons"
        ]
      }
    ]
  }
}

Response Example:

{
  "violates": true,
  "reasons": [
    "Sexual Content",
    "Violence"
  ]
}
  1. Choose the Moderation mode in the Mode selector.
  2. Add custom violation reasons with the “Add Violation Reason” button. If you leave this list empty, default violation reasons are used.

Custom Mode

To apply Custom mode, set the mode parameter to 'custom'.

Required parameter:

  • prompt — user-defined instruction for the model.

Request Example:

{
  "query": {
    "encoder_version": "2",
    "format": [
      {
        "output": "video_understanding",
        "mode": "custom",
        "prompt": "Extract the license plate number on the car."
      }
    ]
  }
}

Response Example:

{
  "result": "The license plate number on the red car is AM-84865."
}
  1. Choose the Custom mode in the Mode selector.
  2. Enter a prompt in the Prompt field. This field is required.

Search Mode

To apply Search mode, set the mode parameter to 'search'.

At least one of the following parameters must be provided:

  • prompt — text query.
  • media_prompt — direct URL to PNG or JPEG image.

Optional parameters:

  • search_rank_threshold - limits the number of returned results. If not specified, all results are returned.
  • search_options - the parts of the video to search in; any combination of "visual", "audio", "transcription". If not specified, all search options are used.
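The parameter rules above can be checked on the client before a job is submitted. This is a minimal sketch under the rules stated in this section; build_search_format is a hypothetical helper, not an API function.

```python
ALLOWED_SEARCH_OPTIONS = {"visual", "audio", "transcription"}


def build_search_format(prompt=None, media_prompt=None,
                        search_rank_threshold=None, search_options=None):
    """Build a Search mode format entry (hypothetical helper).

    Enforces that at least one of prompt / media_prompt is given and that
    search_options contains only the documented values.
    """
    if not prompt and not media_prompt:
        raise ValueError("provide prompt, media_prompt, or both")

    entry = {"output": "video_understanding", "mode": "search"}
    if prompt:
        entry["prompt"] = prompt
    if media_prompt:
        entry["media_prompt"] = media_prompt
    if search_rank_threshold is not None:
        entry["search_rank_threshold"] = search_rank_threshold
    if search_options is not None:
        unknown = set(search_options) - ALLOWED_SEARCH_OPTIONS
        if unknown:
            raise ValueError(f"unknown search options: {sorted(unknown)}")
        entry["search_options"] = list(search_options)
    return entry
```

Optional parameters that are not passed are left out of the entry, so the service defaults apply.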

Request Example:

{
  "query": {
    "encoder_version": "2",
    "format": [
      {
        "output": "video_understanding",
        "mode": "search",
        "prompt": "Man talking",
        "search_rank_threshold": 3,
        "search_options": [
          "visual",
          "audio"
        ]
      }
    ]
  }
}

Response Example:

{
  "result": [
    {
      "start": 3,
      "end": 9,
      "rank": 1
    },
    {
      "start": 12,
      "end": 19,
      "rank": 2
    },
    {
      "start": 24,
      "end": 30,
      "rank": 3
    }
  ]
}
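Once the output file has been downloaded, the matches can be read with a few lines of Python. This sketch parses the response shown above; the function name is hypothetical, and the unit of start and end is not stated in this document, so the values are passed through unchanged.

```python
import json

# Same content as the Response Example above.
response_text = """
{
  "result": [
    {"start": 3, "end": 9, "rank": 1},
    {"start": 12, "end": 19, "rank": 2},
    {"start": 24, "end": 30, "rank": 3}
  ]
}
"""


def parse_search_result(text):
    """Return (rank, start, end) tuples ordered by rank."""
    data = json.loads(text)
    segments = sorted(data["result"], key=lambda seg: seg["rank"])
    return [(seg["rank"], seg["start"], seg["end"]) for seg in segments]


for rank, start, end in parse_search_result(response_text):
    print(f"rank {rank}: {start} to {end}")
```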
  1. Choose the Search mode in the Mode selector.
  2. Enter a prompt in the Prompt field or provide a media prompt link in the Media Prompt field. At least one of these fields must be filled in.
  3. Choose the search options. If none are chosen, all search options are used.
  4. Specify the rank threshold. If it is not specified, all results are written to the output file.

Requirements for Video Upload

Before using the Video Understanding feature, ensure your video meets the following requirements based on the mode you choose:

Description, Categorization, Content Moderation and Custom modes:

  • Format: Must be a valid FFmpeg-supported format.
  • Size: Less than 2 GB.
  • Duration: Between 10 and 7200 seconds.
  • Resolution: Between 360x360 and 5184x2160.
  • Aspect Ratio: Between 1:1 and 2.4:1.

Search mode:

  • Format: Must be a valid FFmpeg-supported format.
  • Size: Less than 4 GB.
  • Duration: Between 4 and 14400 seconds.
  • Resolution: Between 360x360 and 5184x2160.
  • Aspect Ratio: Between 1:1 and 2.4:1.
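The limits above can be checked locally before submitting a job. The sketch below encodes them under several assumptions that this document does not confirm: 1 GB is taken as 1024^3 bytes, the duration, resolution, and aspect ratio bounds are treated as inclusive, the resolution range is read as width 360 to 5184 and height 360 to 2160, and the aspect ratio is width divided by height. VideoInfo and check_requirements are hypothetical names, and the FFmpeg format check is not covered.

```python
from dataclasses import dataclass

GB = 1024 ** 3  # assumption: binary gigabytes


@dataclass
class VideoInfo:
    size_bytes: int
    duration_seconds: float
    width: int
    height: int


# Per-mode limits taken from the lists above.
SEARCH_LIMITS = {"max_size": 4 * GB, "min_duration": 4, "max_duration": 14400}
OTHER_LIMITS = {"max_size": 2 * GB, "min_duration": 10, "max_duration": 7200}


def check_requirements(info, mode):
    """Return the names of the requirements the video fails (empty if none)."""
    limits = SEARCH_LIMITS if mode == "search" else OTHER_LIMITS
    problems = []
    if info.size_bytes >= limits["max_size"]:
        problems.append("size")
    if not limits["min_duration"] <= info.duration_seconds <= limits["max_duration"]:
        problems.append("duration")
    # Assumed reading of "between 360x360 and 5184x2160".
    if not (360 <= info.width <= 5184 and 360 <= info.height <= 2160):
        problems.append("resolution")
    # Assumed reading of aspect ratio as width / height.
    if not 1.0 <= info.width / info.height <= 2.4:
        problems.append("aspect ratio")
    return problems
```

For example, a 5-second 1920x1080 clip fails the duration check for Description mode but passes it for Search mode.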
Note
For the Video Understanding output, the source field must be an http(s) URL.
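A quick client-side check of the source value might look like the following sketch. It only inspects the URL scheme and host with the standard library; the function name is hypothetical, and the check does not verify that the URL is reachable.

```python
from urllib.parse import urlparse


def is_http_source(source):
    """Return True if source is an http or https URL with a host part."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```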