Create embeddings for text, image, and audio

<Note title="Note">
This endpoint will be deprecated in a future version. Migrate to the [Embed API v2](/v1.3/api-reference/create-embeddings-v2) for continued support and access to new features.
</Note>

This method creates embeddings for text, image, and audio content.

Ensure your media files meet the following requirements:
- [Audio files](/v1.3/docs/concepts/models/marengo#audio-requirements).
- [Image files](/v1.3/docs/concepts/models/marengo#image-requirements).

Parameters for embeddings:
- **Common parameters**:
  - `model_name`: The video understanding model you want to use. Example: "marengo3.0".
- **Text embeddings**:
  - `text`: Text for which to create an embedding.
- **Image embeddings**: Provide one of the following:
  - `image_url`: Publicly accessible URL of your image file.
  - `image_file`: Local image file.
- **Audio embeddings**: Provide one of the following:
  - `audio_url`: Publicly accessible URL of your audio file.
  - `audio_file`: Local audio file.

<Note title="Notes">
- The Marengo video understanding model generates embeddings for all modalities in the same latent space. This shared space enables any-to-any searches across different types of content.
- You can create multiple types of embeddings in a single API call.
- Audio embeddings combine generic sound and human speech in a single embedding. For videos with transcriptions, you can retrieve transcriptions and then [create text embeddings](/v1.3/api-reference/create-embeddings-v1/text-image-audio-embeddings/create-text-image-audio-embeddings) from these transcriptions.
</Note>
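Because all modalities share one latent space, a text embedding can be compared directly with image or audio embeddings. The following is a minimal sketch of such an any-to-any comparison. It assumes cosine similarity as the comparison metric (this page does not name one), and the three-dimensional vectors and the names `image_a` and `audio_b` are toy placeholders, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, candidates):
    """Sort candidate names by similarity to the query, best match first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (assumption: real vectors are much longer).
text_query = [1.0, 0.0, 0.0]
candidates = {
    "image_a": [0.9, 0.1, 0.0],  # hypothetical image embedding
    "audio_b": [0.0, 1.0, 0.0],  # hypothetical audio embedding
}

ranking = rank_by_similarity(text_query, candidates)
print(ranking[0][0])  # image_a
```

The same ranking function works whichever modality supplies the query, which is the practical consequence of the shared space.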

Authentication

x-api-key string
Your API key. <Note title="Note"> You can find your API key on the <a href="https://playground.twelvelabs.io/dashboard/api-key" target="_blank">API Key</a> page. </Note>
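As a rough illustration, the API key travels in the `x-api-key` request header. The sketch below only builds the header mapping. The environment variable name `TWELVE_LABS_API_KEY` and the helper `build_auth_headers` are hypothetical, chosen for this example.

```python
import os


def build_auth_headers(api_key):
    """Return the header mapping that carries the API key."""
    if not api_key or not api_key.strip():
        raise ValueError("API key must be a non-empty string")
    return {"x-api-key": api_key.strip()}


# Assumption: the key is stored in an environment variable of your choosing.
api_key = os.environ.get("TWELVE_LABS_API_KEY") or "<YOUR_API_KEY>"
headers = build_auth_headers(api_key)
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.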

Request

Request to create an embedding synchronously.
model_name string Required
The name of the model you want to use. The following models are available:
- `marengo3.0`: Enhanced model with sports intelligence and extended content support. For a list of the new features, see the [New in Marengo 3.0](/v1.3/docs/concepts/models/marengo#new-in-marengo-30) section.
- `Marengo-retrieval-2.7`: Video embedding model for multimodal search.
text string Optional

The text for which you wish to create an embedding.

Example: “Man with a dog crossing the street”

text_truncate string Optional Defaults to end
Specifies how the platform handles text that exceeds token limits.

**Available options by model version**:

**Marengo 3.0**: This parameter is deprecated. The platform automatically truncates text exceeding 500 tokens from the end.

**Marengo 2.7**: Specifies the truncation method for text exceeding 77 tokens:
- `start`: Removes tokens from the beginning.
- `end`: Removes tokens from the end (default).
- `none`: Returns an error if the text is longer than the maximum token limit.

**Default**: `end`
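To make the three Marengo 2.7 options concrete, here is a small local sketch of the behavior described above. It is not the platform's implementation: whitespace-separated words stand in for tokens (the real tokenizer is not described on this page), and `truncate_tokens` is a hypothetical helper.

```python
def truncate_tokens(tokens, limit, mode="end"):
    """Mimic the documented text_truncate options on a list of tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    if mode not in ("start", "end", "none"):
        raise ValueError(f"unknown mode: {mode!r}")
    if len(tokens) <= limit:
        return list(tokens)  # within the limit, nothing to truncate
    if mode == "none":
        raise ValueError("text is longer than the maximum token limit")
    if mode == "start":
        return list(tokens[-limit:])  # drop tokens from the beginning
    return list(tokens[:limit])  # "end": drop tokens from the end


# Assumption: words approximate tokens purely for illustration.
tokens = "man with a dog crossing the street".split()
print(truncate_tokens(tokens, 3, "end"))    # ['man', 'with', 'a']
print(truncate_tokens(tokens, 3, "start"))  # ['crossing', 'the', 'street']
```

With `none`, the same input raises an error instead of returning a shortened list, mirroring the documented error response.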
image_url string Optional format: "uri"

The publicly accessible URL of the image for which you wish to create an embedding. This parameter is required for image embeddings if image_file is not provided.

image_file file Optional

The image file for which you wish to create an embedding as a local file. This parameter is required for image embeddings if image_url is not provided.

audio_url string Optional format: "uri"

The publicly accessible URL of the audio file for which you wish to create an embedding. This parameter is required for audio embeddings if audio_file is not provided.

audio_file file Optional

The audio file for which you wish to create an embedding as a local file. This parameter is required for audio embeddings if audio_url is not provided.

audio_start_offset_sec double Optional Defaults to 0

Specifies the start time, in seconds, from which the platform generates the audio embeddings. This parameter allows you to skip the initial portion of the audio during processing. Default: 0.
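The request parameters above can be assembled into a single set of fields, since one call may create several embedding types. The sketch below is a client-side helper under stated assumptions: `build_embed_fields` is hypothetical, it covers only the URL-based inputs (local files would be attached separately), and the non-negative offset check is a sanity check of this example rather than a documented rule.

```python
def build_embed_fields(model_name, text=None, image_url=None,
                       audio_url=None, audio_start_offset_sec=0.0):
    """Collect the documented request parameters into a field mapping."""
    if not model_name:
        raise ValueError("model_name is required")
    if text is None and image_url is None and audio_url is None:
        raise ValueError("provide at least one of text, image_url, audio_url")
    if audio_start_offset_sec < 0:
        # Assumption: a negative start offset is treated as a caller mistake.
        raise ValueError("audio_start_offset_sec must not be negative")

    fields = {"model_name": model_name}
    if text is not None:
        fields["text"] = text
    if image_url is not None:
        fields["image_url"] = image_url
    if audio_url is not None:
        fields["audio_url"] = audio_url
        # The offset only matters when audio is part of the request.
        fields["audio_start_offset_sec"] = str(float(audio_start_offset_sec))
    return fields


# Example: text plus audio in one call, skipping the first 30 seconds of audio.
fields = build_embed_fields(
    "marengo3.0",
    text="Man with a dog crossing the street",
    audio_url="https://example.com/clip.mp3",  # placeholder URL
    audio_start_offset_sec=30,
)
```

The resulting mapping can be passed to the HTTP client of your choice. The endpoint URL and request encoding are not stated in this section, so they are left out of the sketch.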

Response

A text embedding has successfully been created.
model_name string
The name of the video understanding model the platform has used to create this embedding.
text_embedding object or null
An object that contains the generated text embedding vector and associated information. Present when text was processed.
image_embedding object or null
An object that contains the generated image embedding vector and associated information. Present when an image was processed.
audio_embedding object or null
An object that contains the generated audio embedding vector and associated information. Present when audio was processed.
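Since each embedding field may be an object or null, a client typically checks which ones came back before using them. The following sketch shows one way to do that. The sample body is fabricated for illustration, and the inner `placeholder` key is hypothetical because the internal layout of the embedding objects is not described here.

```python
import json

EMBEDDING_KEYS = ("text_embedding", "image_embedding", "audio_embedding")


def present_embeddings(response_body):
    """Return only the embedding objects that are present and not null."""
    return {
        key: response_body[key]
        for key in EMBEDDING_KEYS
        if response_body.get(key) is not None
    }


# Fabricated sample: only text was processed, so the other fields are null or absent.
sample = json.loads(
    '{"model_name": "marengo3.0",'
    ' "text_embedding": {"placeholder": true},'
    ' "image_embedding": null}'
)

found = present_embeddings(sample)
print(sorted(found))  # ['text_embedding']
```

Treating a missing key and an explicit null the same way keeps the caller robust to either representation.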

Errors