Create sync embeddings

This endpoint synchronously creates embeddings for multimodal content and returns the results immediately in the response.

When to use this endpoint:

Create embeddings for text, images, audio, or video content
Retrieve immediate results without waiting for background processing
Process audio or video content up to 10 minutes in duration

Do not use this endpoint for:

Audio or video content longer than 10 minutes. Use the POST method of the /embed-v2/tasks endpoint instead.

Input requirements

Text:

Maximum length: 500 tokens

Images:

Formats: JPEG, PNG
Minimum size: 128x128 pixels
Maximum file size: 5 MB

Audio and video:

Maximum duration: 10 minutes
Maximum file size for base64 encoded strings: 36 MB
Audio formats: WAV (uncompressed), MP3 (lossy), FLAC (lossless)
Video formats: FFmpeg supported formats (https://ffmpeg.org/ffmpeg-formats.html)
Video resolution: 360x360 to 5184x2160 pixels
Aspect ratio: Between 1:1 and 1:2.4, or between 2.4:1 and 1:1
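Some of the limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API. The function names are hypothetical. It reads "36 MB" conservatively as 36,000,000 bytes, and it assumes the two aspect-ratio ranges together mean that the longer side may be at most 2.4 times the shorter side.

```python
MAX_SYNC_DURATION_SEC = 10 * 60      # 10 minutes, per the limits above
MAX_BASE64_BYTES = 36 * 1000 * 1000  # assumption: conservative reading of "36 MB"


def choose_endpoint(duration_sec: float) -> str:
    """Return the endpoint path suited to an audio or video duration."""
    if duration_sec <= MAX_SYNC_DURATION_SEC:
        return "/embed-v2"        # synchronous endpoint (this page)
    return "/embed-v2/tasks"      # asynchronous endpoint for longer media


def base64_length(raw_size: int) -> int:
    """Length in characters of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def fits_base64_limit(raw_size: int) -> bool:
    """Check whether a file of raw_size bytes stays within the limit once encoded."""
    return base64_length(raw_size) <= MAX_BASE64_BYTES


def aspect_ratio_ok(width: int, height: int) -> bool:
    """Check that the longer side is at most 2.4 times the shorter side."""
    longer, shorter = max(width, height), min(width, height)
    # Integer arithmetic avoids floating-point rounding at the 2.4 boundary.
    return longer * 10 <= shorter * 24
```

Base64 encoding grows a file by about one third, so the raw file must be noticeably smaller than the encoded limit.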

Note

This endpoint is rate-limited. For details, see the Rate limits page.

Authentication

x-api-key (string)

Your API key.

Note

You can find your API key on the API Keys page (https://playground.twelvelabs.io/dashboard/api-keys).
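As an illustration of how the key is passed, the sketch below builds a request with the x-api-key header using only the Python standard library. The helper name `build_request` is hypothetical, and the request is constructed but not sent.

```python
import json
import urllib.request

API_URL = "https://api.twelvelabs.io/v1.3/embed-v2"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the API key."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_request(
    "<apiKey>",  # placeholder; substitute your own key
    {
        "input_type": "text",
        "model_name": "marengo3.0",
        "text": {"input_text": "man walking a dog"},
    },
)
# Passing req to urllib.request.urlopen would send it; that step is omitted here.
```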

Request

This endpoint expects an object.

input_type (enum, required)

The type of content for the embeddings.

Values:

audio: Creates embeddings for an audio file
video: Creates embeddings for a video file
image: Creates embeddings for an image file
text: Creates embeddings for text input
text_image: Creates embeddings for text and an image
multi_input: Creates a single embedding from up to 10 images. You can optionally include text to provide context. To reference specific images in your text, use placeholders in the following format: <@name>, where name matches the name field of a media source
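The placeholder rule for multi_input can be checked before sending a request. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the regular expression assumes that a name contains no whitespace, angle brackets, or @ characters, which this page does not specify.

```python
import re

MAX_IMAGES = 10  # multi_input accepts up to 10 images
PLACEHOLDER = re.compile(r"<@([^<>@\s]+)>")  # assumed shape of a valid name


def unmatched_placeholders(text: str, source_names: list[str]) -> list[str]:
    """Return placeholder names in text that match no media source name."""
    known = set(source_names)
    return [name for name in PLACEHOLDER.findall(text) if name not in known]


def within_image_limit(source_names: list[str]) -> bool:
    """Check that the number of media sources does not exceed the limit."""
    return len(source_names) <= MAX_IMAGES
```

For example, `unmatched_placeholders("Compare <@front> with <@back>", ["front", "side"])` reports `back`, because no media source carries that name.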

model_name (enum, required, defaults to marengo3.0)

The video understanding model to use.

Allowed values: marengo3.0

text (object, optional)

This field is required if the input_type parameter is text.

image (object, optional)

This field is required if the input_type parameter is image.

text_image (object, optional)

This field is required if the input_type parameter is text_image.

audio (object, optional)

This field is required if the input_type parameter is audio.

video (object, optional)

This field is required if the input_type parameter is video.

multi_input (object, optional)

This field is required if the input_type parameter is multi_input.
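The conditional requirements above follow one pattern: the request must contain an object field with the same name as the input_type value. A minimal client-side check, assuming nothing about the contents of each object, might look like this. The function name `validate_payload` is hypothetical.

```python
INPUT_TYPES = {"audio", "video", "image", "text", "text_image", "multi_input"}


def validate_payload(payload: dict) -> list[str]:
    """Return a list of problems found in a request body (empty if none)."""
    problems = []
    input_type = payload.get("input_type")
    if input_type not in INPUT_TYPES:
        problems.append(f"unknown input_type: {input_type!r}")
    elif not isinstance(payload.get(input_type), dict):
        # The object named after input_type is required for that input type.
        problems.append(f"missing object field: {input_type}")
    if payload.get("model_name") != "marengo3.0":
        # model_name is listed as required with a single allowed value.
        problems.append("model_name must be marengo3.0")
    return problems
```

An empty list means the body satisfies these structural rules; the server may still reject it for other reasons, such as the input limits.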

Response

Successful request; normal operation

data (list of objects)

Array of embedding results

metadata (object)

Metadata for the media input. Available for image, text_image, audio, video, and multi_input inputs.
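To show how the response might be consumed, the sketch below collects the embedding vectors from a response body. It assumes each entry in data carries an embedding array, as in the sample response on this page; the function name `extract_embeddings` is hypothetical, and the sample values are illustrative only.

```python
import json


def extract_embeddings(response_body: str) -> list[list[float]]:
    """Collect the embedding vector from each entry in the data array."""
    body = json.loads(response_body)
    return [item["embedding"] for item in body.get("data", [])]


# Illustrative body shaped like the sample response.
sample = '{"data": [{"embedding": [0.111, 0.234, -0.567, 0.89]}]}'
vectors = extract_embeddings(sample)
```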

Errors

400

Bad Request Error

429

Too Many Requests Error

500

Internal Server Error
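One common way to handle the 429 and 500 responses is to retry with exponential backoff. The sketch below is an assumption-laden illustration, not documented behavior: this page does not state a retry policy, the delay values are arbitrary, and `send_with_retry` is a hypothetical helper. The `send` callable stands in for whatever function performs the HTTP call and returns a status code and body.

```python
import time
from typing import Callable, Tuple

RETRYABLE = {429, 500}  # assumption: these are worth retrying; 400 is not


def send_with_retry(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, body = send()
        if status not in RETRYABLE or attempt == max_attempts - 1:
            return status, body
        # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.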

curl -X POST https://api.twelvelabs.io/v1.3/embed-v2 \
  -H "x-api-key: <apiKey>" \
  -H "Content-Type: application/json" \
  -d '{
    "input_type": "text",
    "model_name": "marengo3.0",
    "text": {
      "input_text": "man walking a dog"
    }
  }'

{
  "data": [
    {
      "embedding": [
        0.111,
        0.234,
        -0.567,
        0.89
      ]
    }
  ]
}