Create embeddings

This quickstart guide provides a simplified introduction to creating embeddings using the TwelveLabs Video Understanding Platform. It includes:

  • A basic working example
  • Minimal implementation details
  • Core parameters for common use cases

For a comprehensive guide, see the Create embeddings section.

Prerequisites

  • To use the platform, you need an API key:

    1. If you don’t have an account, sign up for a free account.
    2. Go to the API Key page.
    3. Select the Copy icon next to your key.

  • Ensure the pre-release version of the TwelveLabs SDK is installed on your computer:

    $ pip install twelvelabs --pre
  • The videos you wish to use must meet the following requirements:

    • Video resolution: Must be at least 360x360 and must not exceed 3840x2160.

    • Aspect ratio: Must be one of 1:1, 4:3, 4:5, 5:4, 16:9, 9:16, or 17:9.

    • Video and audio formats: Your video files must be encoded in the video and audio formats listed on the FFmpeg Formats Documentation page. For videos in other formats, contact us at support@twelvelabs.io.

    • Duration: Must be between 4 seconds and 2 hours (7,200s).

    • File size: Must not exceed 2 GB.
      If you require different options, contact us at support@twelvelabs.io.

  • The audio files you wish to use must meet the following requirements:

    • Format: WAV (uncompressed), MP3 (lossy), and FLAC (lossless).
    • File size: Must not exceed 10 MB.
  • The images you wish to use must meet the following requirements:

    • Format: JPEG and PNG.
    • Dimension: Must be at least 128 x 128 pixels.
    • Size: Must not exceed 5 MB.
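
If you want to catch obvious violations before uploading, the sketch below checks a file against the size limits listed above. The MAX_BYTES table and check_file_size helper are illustrative assumptions for this guide, not part of the TwelveLabs SDK.

    import os

    # Size limits from the requirements above (illustrative helper, not part
    # of the TwelveLabs SDK; binary units are an assumption).
    MAX_BYTES = {
        "video": 2 * 1024**3,   # 2 GB
        "audio": 10 * 1024**2,  # 10 MB
        "image": 5 * 1024**2,   # 5 MB
    }

    def check_file_size(path: str, media_type: str) -> bool:
        """Return True if the file fits the documented limit for its media type."""
        size = os.path.getsize(path)
        if size > MAX_BYTES[media_type]:
            print(f"{path}: {size} bytes exceeds the {media_type} limit")
            return False
        return True

    # Example with a hypothetical local file:
    # check_file_size("clip.mp4", "video")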

Starter code

You can copy and paste the code below to create embeddings. Replace the placeholders surrounded by <> with your values.

from typing import List

from twelvelabs import TwelveLabs
from twelvelabs.types import VideoSegment
from twelvelabs.embed import TasksStatusResponse

client = TwelveLabs(api_key="<YOUR_API_KEY>")

task = client.embed.tasks.create(
    model_name="Marengo-retrieval-2.7", video_url="<YOUR_VIDEO_URL>")
print(f"Created video embedding task: id={task.id}")


def on_task_update(task: TasksStatusResponse):
    print(f"  Status={task.status}")


status = client.embed.tasks.wait_for_done(task.id, callback=on_task_update)
print(f"Embedding done: {status.status}")

task = client.embed.tasks.retrieve(
    task_id=task.id, embedding_option=["visual-text", "audio"])


def print_segments(segments: List[VideoSegment], max_elements: int = 5):
    for segment in segments:
        print(f"  embedding_scope={segment.embedding_scope} embedding_option={segment.embedding_option} start_offset_sec={segment.start_offset_sec} end_offset_sec={segment.end_offset_sec}")
        first_few = segment.float_[:max_elements]
        print(
            f"  embeddings: [{', '.join(str(x) for x in first_few)}...] (total: {len(segment.float_)} values)"
        )


if task.video_embedding is not None and task.video_embedding.segments is not None:
    print_segments(task.video_embedding.segments)

Step-by-step guide

1. Import the SDK and initialize the client

   Create a client instance to interact with the TwelveLabs Video Understanding Platform.
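
   The relevant lines from the starter code import the SDK types and construct the client with your API key:

     from typing import List

     from twelvelabs import TwelveLabs
     from twelvelabs.types import VideoSegment
     from twelvelabs.embed import TasksStatusResponse

     client = TwelveLabs(api_key="<YOUR_API_KEY>")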

2. Upload videos

   To perform any downstream tasks, you must first upload your videos, and the platform must finish processing them.
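
   In this quickstart, the upload happens when you create a video embedding task, as in the starter code:

     task = client.embed.tasks.create(
         model_name="Marengo-retrieval-2.7", video_url="<YOUR_VIDEO_URL>")
     print(f"Created video embedding task: id={task.id}")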

3. Monitor the status

   The platform requires some time to process videos. Check the status of the video embedding task until it’s completed.
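
   The starter code polls with wait_for_done, passing a callback that prints each status update:

     def on_task_update(task: TasksStatusResponse):
         print(f"  Status={task.status}")


     status = client.embed.tasks.wait_for_done(task.id, callback=on_task_update)
     print(f"Embedding done: {status.status}")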

4. Retrieve the embeddings

   Once the platform has finished processing your video, you can retrieve the embeddings.
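
   The starter code retrieves the task results, using embedding_option to request the visual-text and audio embeddings:

     task = client.embed.tasks.retrieve(
         task_id=task.id, embedding_option=["visual-text", "audio"])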

5. Process the results

   This example iterates over the results and prints the key properties and a portion of the embedding vectors for each segment to the standard output.
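
   The corresponding helper from the starter code prints each segment's scope, option, and time offsets, followed by the first few values of its embedding vector:

     def print_segments(segments: List[VideoSegment], max_elements: int = 5):
         for segment in segments:
             print(f"  embedding_scope={segment.embedding_scope} embedding_option={segment.embedding_option} start_offset_sec={segment.start_offset_sec} end_offset_sec={segment.end_offset_sec}")
             first_few = segment.float_[:max_elements]
             print(
                 f"  embeddings: [{', '.join(str(x) for x in first_few)}...] (total: {len(segment.float_)} values)"
             )


     if task.video_embedding is not None and task.video_embedding.segments is not None:
         print_segments(task.video_embedding.segments)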