Create embeddings
Use the Marengo video understanding model to generate embeddings from video, audio, text, and image inputs. These embeddings enable similarity search, content clustering, recommendation systems, and other machine learning applications.
Model specification
The model has two types of limits: the maximum input size you can submit and the portion of content that it embeds.
Input requirements
This table shows the maximum size for each type of input:
Embedding coverage per input type
This table shows what portion of your input the model processes into embeddings:
Pricing
For details on pricing, see the Amazon Bedrock pricing page.
Choose the processing method
Select the processing method based on your use case and performance requirements. Synchronous processing returns embeddings immediately in the API response, while asynchronous processing handles larger files and batch operations by saving results to S3.
Note
Synchronous processing supports text and image inputs only. For video, audio, and large-scale image files, use asynchronous processing.
Use synchronous processing to:
- Build real-time applications like chatbots, search, and recommendation systems.
- Enable interactive features that require immediate results.
Use asynchronous processing to:
- Build applications that process video, audio, and large-scale image files.
- Run batch operations and background workflows.
Prerequisites
Before you start, ensure you have the following:
- An AWS account with access to a region where the TwelveLabs models are supported.
- An AWS IAM principal with sufficient Amazon Bedrock permissions. For details on setting permissions, see the Identity and access management for Amazon Bedrock page.
- S3 permissions to read input files and write output files for Marengo operations.
- The AWS CLI installed and configured with your credentials.
- Python 3.7 or later with the boto3 library.
- Access to the model you want to use. Navigate to the AWS Console > Bedrock > Model Access page and request access. Note that the availability of the models varies by region.
Create embeddings
Marengo supports base64-encoded strings and S3 URIs for media input. Note that the base64 method has a 36 MB file size limit. This guide uses S3 URIs.
Note
Your S3 input and output buckets must be in the same region as the model. If regions don’t match, the API returns a ValidationException error.
To generate embeddings from your content, you use one of two Amazon Bedrock APIs, depending on your processing needs.
Synchronous processing
The InvokeModel API processes your request synchronously and returns embeddings directly in the response.
The InvokeModel API requires two parameters:
- modelId: The inference profile ID for the model.
- body: A JSON-encoded string containing your input parameters.
The request body contains the following fields:
- inputType: The type of content. Values: “text” or “image”.
- inputText: The text to embed. This parameter is required for text inputs.
- mediaSource: The image source, which contains either:
  - base64String: Your base64-encoded image for inline processing.
  - s3Location: The S3 location for images stored in S3.
Example
Text
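The following sketch shows a minimal synchronous text request with boto3. The region and inference profile ID are placeholders; use the Marengo Embed 2.7 inference profile available in your region.

```python
# A minimal sketch of a synchronous text embedding request.
import json

import boto3

# The region and inference profile ID are placeholders for your own values.
client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")

body = json.dumps({
    "inputType": "text",
    "inputText": "<YOUR_TEXT>",
})

response = client.invoke_model(
    modelId="<MARENGO_INFERENCE_PROFILE_ID>",
    body=body,
)

# The response body is a JSON document containing the embedding data.
print(json.loads(response["body"].read()))
```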
Image
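The following sketch shows a minimal synchronous image request that reads the image from S3. The s3Location field names (uri and bucketOwner) are assumptions based on the request structure described above; verify them against the TwelveLabs Marengo Embed 2.7 request schema.

```python
# A minimal sketch of a synchronous image embedding request using an S3 URI.
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")

body = json.dumps({
    "inputType": "image",
    "mediaSource": {
        "s3Location": {
            "uri": "s3://<YOUR_BUCKET_NAME>/<YOUR_FILE>",  # assumed field name
            "bucketOwner": "<YOUR_ACCOUNT_ID>",            # assumed field name
        }
    },
})

response = client.invoke_model(
    modelId="<MARENGO_INFERENCE_PROFILE_ID>",
    body=body,
)

print(json.loads(response["body"].read()))
```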
Ensure you replace <YOUR_TEXT> in the text example with the text for which you wish to create an embedding (example: “A man walking down the street”).
Asynchronous processing
The StartAsyncInvoke API processes your request asynchronously, storing the results in your S3 bucket.
To create embeddings asynchronously, you start an invocation, monitor its status, and then retrieve the results from your S3 bucket.
The StartAsyncInvoke API requires three parameters:
- modelId: The model ID.
- modelInput: A dictionary containing your input parameters.
- outputDataConfig: A dictionary specifying where to save the results.
The modelInput dictionary contains the following fields:
- inputType: The type of content you’re embedding (“video”, “audio”, “image”, or “text”).
- mediaSource: The S3 location of your input file (for video, audio, and image inputs).
- inputText: The text content (for text inputs only).
S3 output structure
Each invocation creates a unique directory in your S3 bucket with two files:
- manifest.json: Contains metadata including the request ID.
- output.json: Contains the actual embeddings.
Example
Video, audio, or image inputs
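The following sketch starts an asynchronous invocation for a media file stored in S3 and prints the invocation ARN. The model ID and the s3Location field names are assumptions; verify them against the TwelveLabs Marengo Embed 2.7 page.

```python
# A minimal sketch of an asynchronous invocation for a media file stored in S3.
import boto3

client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")

response = client.start_async_invoke(
    modelId="twelvelabs.marengo-embed-2-7-v1:0",  # assumed model ID; confirm it in your console
    modelInput={
        "inputType": "<YOUR_INPUT_TYPE>",  # "video", "audio", or "image"
        "mediaSource": {
            "s3Location": {
                "uri": "s3://<YOUR_BUCKET_NAME>/<YOUR_FILE>",  # assumed field name
                "bucketOwner": "<YOUR_ACCOUNT_ID>",            # assumed field name
            }
        },
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://<YOUR_BUCKET_NAME>/embeddings/"}
    },
)

# Keep the ARN; you need it to check the status of the invocation.
print(response["invocationArn"])
```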
Text inputs
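The following sketch starts an asynchronous invocation for a text input, under the same model ID assumption as the previous example.

```python
# A minimal sketch of an asynchronous invocation for text input.
import boto3

client = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")

response = client.start_async_invoke(
    modelId="twelvelabs.marengo-embed-2-7-v1:0",  # assumed model ID; confirm it in your console
    modelInput={
        "inputType": "text",
        "inputText": "<YOUR_TEXT>",  # the text to embed
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://<YOUR_BUCKET_NAME>/embeddings/"}
    },
)

print(response["invocationArn"])
```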
Ensure you replace the following placeholders with your values:
- <YOUR_REGION>: your AWS region (example: “eu-west-1”).
- <YOUR_ACCOUNT_ID>: your AWS account ID (example: “123456789012”).
- <YOUR_BUCKET_NAME>: the name of your S3 bucket (example: “my-bucket”).
- <YOUR_FILE>: the name of your media file (example: “my_file.mp4”).
- <YOUR_INPUT_TYPE>: the type of media you wish to provide. The following values are supported: “video”, “audio”, or “image”.
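After starting an invocation with either example above, you can poll its status with GetAsyncInvoke and read output.json from your bucket once it completes. The sketch below assumes the output prefix used in the examples and that the output directory name is derived from the invocation ID; listing the prefix avoids hard-coding the exact key.

```python
# A minimal sketch of polling an asynchronous invocation and reading output.json.
import json
import time

import boto3

bedrock = boto3.client("bedrock-runtime", region_name="<YOUR_REGION>")
s3 = boto3.client("s3", region_name="<YOUR_REGION>")

invocation_arn = "<YOUR_INVOCATION_ARN>"  # returned by start_async_invoke

# Poll until the invocation leaves the InProgress state.
while True:
    status = bedrock.get_async_invoke(invocationArn=invocation_arn)["status"]
    if status != "InProgress":
        break
    time.sleep(10)

if status == "Completed":
    # Assumes the output directory name is derived from the invocation ID;
    # list the prefix and look for output.json rather than hard-coding the key.
    invocation_id = invocation_arn.split("/")[-1]
    prefix = f"embeddings/{invocation_id}/"
    objects = s3.list_objects_v2(Bucket="<YOUR_BUCKET_NAME>", Prefix=prefix)
    for obj in objects.get("Contents", []):
        if obj["Key"].endswith("output.json"):
            data = s3.get_object(Bucket="<YOUR_BUCKET_NAME>", Key=obj["Key"])["Body"].read()
            print(json.loads(data))
```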
Use embeddings
After generating embeddings, you can store them in a vector database for efficient similarity search and retrieval.
The typical workflow is as follows: generate embeddings for your content, store them in a vector database along with identifiers for the source content, embed each incoming query in the same way, and retrieve the stored items whose vectors are most similar to the query vector.
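As an illustration, the following sketch ranks stored embeddings against a query embedding by cosine similarity in memory; a production system would typically delegate this to a vector database. The item names and vector values are placeholders.

```python
# A minimal in-memory similarity search over stored embeddings using cosine similarity.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Placeholder vectors; real Marengo embeddings are much longer.
stored = {
    "clip-001": np.array([0.12, -0.48, 0.33]),
    "clip-002": np.array([0.05, 0.91, -0.20]),
}
query_embedding = np.array([0.10, -0.50, 0.30])

# Rank stored items by similarity to the query and print the best match.
ranked = sorted(
    stored.items(),
    key=lambda item: cosine_similarity(query_embedding, item[1]),
    reverse=True,
)
print(ranked[0][0])
```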
Request parameters and response fields
For a complete list of request parameters and response fields, see the TwelveLabs Marengo Embed 2.7 page in the Amazon Bedrock documentation.