# Embeddings

> Generate vector embeddings with the Telnyx Inference API for semantic search, clustering, and RAG. Supports multiple embedding models and batch requests.

In this tutorial, you'll learn how to:

* Upload documents to [Telnyx Storage](https://telnyx.com/products/cloud-storage)
* Transform these documents into embeddings, enabling a language model to retrieve relevant sections of your documents
* Provide the storage bucket as context for the language model

## Upload your documents

You can upload objects to Telnyx's S3-compatible storage API by following our [quickstart](https://developers.telnyx.com/docs/cloud-storage/quick-start), or by using the [drag-and-drop interface in the portal](https://portal.telnyx.com/#/storage/buckets).
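
Because the storage API is S3-compatible, a bulk upload can be scripted with any S3 client. The sketch below is one possible approach, not an official helper: `upload_directory` is a hypothetical function name, and it assumes you pass in an S3 client that exposes `upload_file(filename, bucket, key)`, as a `boto3` S3 client does. The endpoint and credentials for that client are not shown here; take them from the quickstart.

```python
from pathlib import Path


def upload_directory(s3_client, bucket: str, directory: str) -> list[str]:
    """Upload every file under `directory` to `bucket` and return the object keys.

    `s3_client` is assumed to be an S3-style client (for example a boto3 S3
    client) configured for your storage endpoint as described in the quickstart.
    """
    root = Path(directory)
    keys = []
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Use the path relative to the root folder as the object key.
            key = path.relative_to(root).as_posix()
            s3_client.upload_file(str(path), bucket, key)
            keys.append(key)
    return keys


# Hypothetical usage, once a client has been configured:
# keys = upload_directory(client, "<ADD BUCKET HERE>", "./docs")
```

Keeping the client as a parameter means the same function works with whichever endpoint and credentials your account uses.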

## Embed your documents

Once you've uploaded your documents, you can [embed them via API](https://developers.telnyx.com/api-reference/embeddings/embed-url-content#embed-url-content) or by clicking the "Embed for AI Use" button in the portal while viewing your storage bucket's contents.
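
As a rough illustration of the API route, the sketch below builds a request that asks for a bucket to be embedded, using only the Python standard library. The endpoint path (`/v2/ai/embeddings`) and the `bucket_name` body field are assumptions made for this example, and `build_embed_request` is a hypothetical helper name; check the API reference linked above for the exact request shape before relying on it.

```python
import json
import urllib.request

# Assumed endpoint for embedding a bucket; confirm it in the API reference.
EMBEDDINGS_URL = "https://api.telnyx.com/v2/ai/embeddings"


def build_embed_request(bucket_name: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that asks to embed a bucket."""
    # The `bucket_name` field is an assumption about the request body.
    body = json.dumps({"bucket_name": bucket_name}).encode("utf-8")
    return urllib.request.Request(
        EMBEDDINGS_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage (sends the request over the network):
# import os
# request = build_embed_request("<ADD BUCKET HERE>", os.environ["TELNYX_API_KEY"])
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Separating request construction from sending makes it easy to inspect the payload before anything is submitted.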

Behind the scenes, your documents are split into sections, and each section is "embedded" based on its contents. Later, when a user asks a language model a question, the model is automatically given the most relevant sections from the bucket to help it answer.
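
To make the idea concrete, here is a toy sketch of "split into sections, embed each one, retrieve the most relevant". It is an illustration only and not how Telnyx implements embeddings: a simple word-count vector stands in for a real embedding model, and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def split_into_sections(text: str, size: int = 40) -> list[str]:
    """Split text into sections of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_relevant(question: str, sections: list[str], k: int = 1) -> list[str]:
    """Return the `k` sections most similar to the question."""
    q = embed(question)
    return sorted(sections, key=lambda s: cosine(q, embed(s)), reverse=True)[:k]
```

A real embedding model captures meaning rather than exact word overlap, but the retrieval step follows the same pattern: compare the question's vector with each section's vector and keep the closest matches.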

## Chat over your documents

Once your documents are embedded, you can chat over them in the [AI Playground in the portal](https://portal.telnyx.com/#/ai/playground) by selecting your embedded bucket from the storage dropdown.

You can also use embeddings via our [chat completions API](https://developers.telnyx.com/api-reference/openai-chat/create-a-chat-completion-openai-compatible). Here is a Python example.

<Note>
  Install the `openai` Python package and set the `TELNYX_API_KEY` environment variable. Then update the `question` and `bucket` variables in the sample code.
</Note>

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.getenv("TELNYX_API_KEY"),
    base_url="https://api.telnyx.com/v2/ai/openai",
)

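# Replace these placeholders with your question and your embedded bucket.
question = "<ADD QUESTION HERE>"
bucket = "<ADD EMBEDDED BUCKET HERE>"

# The retrieval tool points the model at the embedded bucket for context.
chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": question}],
    model="zai-org/GLM-5.1-FP8",
    stream=True,
    tools=[
        {
            "type": "retrieval",
            "retrieval": {"bucket_ids": [bucket]},
        }
    ],
)

# Print the answer as it streams in, skipping chunks that carry no text.
for chunk in chat_completion:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)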
```
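
If you would rather collect the full answer as a string than print it as it streams, the same call can be wrapped in a small function. This is a sketch under the same assumptions as the example above: `ask_bucket` is a hypothetical helper name, and `client` is expected to be an `OpenAI` client configured with the Telnyx base URL.

```python
def ask_bucket(client, question: str, bucket: str,
               model: str = "zai-org/GLM-5.1-FP8") -> str:
    """Ask a question over an embedded bucket and return the full answer text."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": question}],
        stream=True,
        tools=[{"type": "retrieval", "retrieval": {"bucket_ids": [bucket]}}],
    )
    parts = []
    for chunk in stream:
        # Some chunks may carry no choices or no text; skip those.
        if chunk.choices and chunk.choices[0].delta.content:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)


# Hypothetical usage with the client from the example above:
# answer = ask_bucket(client, "<ADD QUESTION HERE>", "<ADD EMBEDDED BUCKET HERE>")
```

Taking the client as a parameter keeps the helper easy to reuse across buckets and models.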
