Telnyx

xAI Grok voices are expressive text-to-speech voices for Voice AI Assistants. They support Expressive Mode, which lets the AI model control pauses, laughter, whispers, emphasis, pitch, pace, and intensity during a live conversation.

Higher latency: Grok voices have higher latency than Ultra voices. For latency-sensitive applications that need a time to first byte under 100 ms, use Ultra.

What makes Grok voices different

Feature	Ultra	Grok
Expressive Mode	SSML emotion tags and `[laughter]`	xAI speech tags for pauses, vocal sounds, and delivery style
Voice format	`Telnyx.Ultra.<voice_id>`	`xAI.<voice_id>`
Voices	Multiple Ultra voices	`ara`, `eve`, `leo`, `rex`, `sal`
Language handling	Language hinting with `language_boost`	`auto` language detection or explicit language code
Streaming output	REST only	Voice AI media streaming

Voice format

For AI Assistants, Grok voices use the format:

xAI.<voice_id>

Examples:

xAI.eve
xAI.ara
xAI.leo
xAI.rex
xAI.sal
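
If voice names come from configuration or user input, a small helper can build the `xAI.<voice_id>` string and reject IDs that are not listed in this guide. This is a minimal sketch; the names `GROK_VOICE_IDS` and `grok_voice` are illustrative and not part of any Telnyx SDK.

```python
# Voice IDs listed in this guide. The helper below is illustrative only.
GROK_VOICE_IDS = ("ara", "eve", "leo", "rex", "sal")


def grok_voice(voice_id: str) -> str:
    """Return the AI Assistant voice string for a Grok voice ID, e.g. 'xAI.eve'."""
    normalized = voice_id.strip().lower()
    if normalized not in GROK_VOICE_IDS:
        raise ValueError(f"unknown Grok voice ID: {voice_id!r}")
    return f"xAI.{normalized}"
```

Failing early on an unknown ID keeps a typo such as `xAI.eva` from reaching the assistant configuration.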

Voices

Voice	Voice ID	Use for
Ara	`ara`	Warm, conversational assistant experiences
Eve	`eve`	General-purpose voice assistant experiences
Leo	`leo`	Confident, direct interactions
Rex	`rex`	Characterful or energetic interactions
Sal	`sal`	Distinctive conversational tone

Expressive Mode for AI Assistants

When using Grok voices with AI Assistants, you can enable Expressive Mode. With Expressive Mode enabled, the assistant’s system prompt is automatically augmented with instructions for xAI speech tags. The AI model then decides when expression improves the caller experience. For example, the assistant might:

Add a short pause before important information.
Use a softer delivery for sensitive support moments.
Laugh or chuckle naturally when the conversation calls for it.
Emphasize appointment times, confirmation numbers, or next steps.
Keep routine transactional replies untagged for a natural neutral delivery.

Use expressive tags sparingly. The goal is natural delivery, not tagging every sentence.

Enable in the portal

Go to your assistant in the Telnyx Portal.
Under Voice Settings, select an xAI Grok voice.
Toggle Expressive Mode on.
Save your assistant.

Enable via API

Set `expressive_mode` to `true` in your assistant’s `voice_settings`:

curl -X PATCH "https://api.telnyx.com/v2/ai/assistants/YOUR_ASSISTANT_ID" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "voice_settings": {
      "voice": "xAI.eve",
      "expressive_mode": true
    }
  }'
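
The same request can be prepared in Python with only the standard library. This sketch mirrors the curl call above; the function name `build_expressive_mode_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_expressive_mode_request(assistant_id, api_key, voice="xAI.eve", enabled=True):
    """Build (but do not send) the PATCH request that toggles Expressive Mode."""
    body = {"voice_settings": {"voice": voice, "expressive_mode": enabled}}
    return urllib.request.Request(
        url=f"https://api.telnyx.com/v2/ai/assistants/{assistant_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


# To send it, pass the result to urllib.request.urlopen(...).
```

Keeping construction separate from sending makes the payload easy to inspect or log before it reaches the API.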

xAI speech tag reference

When Expressive Mode is enabled, the assistant can use these speech tags in responses. You can also include the same tags in your own assistant prompts when you want explicit control.

Inline tags

Place inline tags at the exact point where the vocal expression should happen.

Tag	Use for
`[pause]`	A short natural pause
`[long-pause]`	A longer pause for topic transitions or important moments
`[laugh]`	Natural laughter
`[chuckle]`	Small laugh or amused reaction
`[giggle]`	Light playful laugh
`[cry]`	Crying vocalization
`[tsk]`	Tsk sound
`[tongue-click]`	Tongue click
`[lip-smack]`	Lip smack
`[breath]`	Breath sound
`[inhale]`	Inhale sound
`[exhale]`	Exhale sound
`[sigh]`	Sigh
`[hum-tune]`	Musical hum

Example:

So I walked in and [pause] there it was. [laugh] I honestly could not believe it!

Wrapping tags

Wrap text with these tags to apply a delivery style to that text.

Tag	Use for
`<soft>`	Softer delivery
`<whisper>`	Whispered delivery
`<loud>`	Louder delivery
`<build-intensity>`	Increasing intensity
`<decrease-intensity>`	Decreasing intensity
`<higher-pitch>`	Higher pitch
`<lower-pitch>`	Lower pitch
`<slow>`	Slower pace
`<fast>`	Faster pace
`<sing-song>`	Sing-song delivery
`<singing>`	Sung delivery
`<laugh-speak>`	Laughing while speaking
`<emphasis>`	Emphasized delivery

Examples:

I need to tell you something. <whisper>It is a secret.</whisper> Pretty cool, right?

<emphasis>Your appointment is confirmed for tomorrow at 3 PM.</emphasis>
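
When you write tags by hand in prompts or test text, a quick check can catch misspelled tags and wrapping tags that are never closed. The sketch below uses only the tag names listed in the two tables above. The function name `find_tag_problems` is hypothetical, and the check assumes that wrapping tags must be closed in the reverse order they were opened, which this guide does not state explicitly.

```python
import re

INLINE_TAGS = {
    "pause", "long-pause", "laugh", "chuckle", "giggle", "cry", "tsk",
    "tongue-click", "lip-smack", "breath", "inhale", "exhale", "sigh", "hum-tune",
}
WRAPPING_TAGS = {
    "soft", "whisper", "loud", "build-intensity", "decrease-intensity",
    "higher-pitch", "lower-pitch", "slow", "fast", "sing-song", "singing",
    "laugh-speak", "emphasis",
}

# Matches [inline-tag], <wrapping-tag>, and </wrapping-tag>.
_TAG_RE = re.compile(r"\[([a-z-]+)\]|<(/?)([a-z-]+)>")


def find_tag_problems(text):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    open_stack = []  # wrapping tags opened but not yet closed
    for match in _TAG_RE.finditer(text):
        inline, closing, wrapping = match.groups()
        if inline is not None:
            if inline not in INLINE_TAGS:
                problems.append(f"unknown inline tag [{inline}]")
        elif wrapping not in WRAPPING_TAGS:
            problems.append(f"unknown wrapping tag <{closing}{wrapping}>")
        elif not closing:
            open_stack.append(wrapping)
        elif open_stack and open_stack[-1] == wrapping:
            open_stack.pop()
        else:
            problems.append(f"unexpected closing tag </{wrapping}>")
    problems.extend(f"unclosed tag <{name}>" for name in open_stack)
    return problems
```

For example, `find_tag_problems("[laughter] hi")` reports an unknown inline tag, because `[laughter]` is the Ultra tag rather than a Grok one.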

Guidance

Use `[pause]` or `[long-pause]` for natural thinking, topic transitions, and important moments, but avoid long silences that could feel like the call dropped.
Use emotional sounds like `[laugh]`, `[sigh]`, and `[chuckle]` only when the response genuinely calls for them.
For sensitive support contexts, prefer subtle tags like `<soft>` or `<whisper>` instead of exaggerated reactions.
Do not expose these tags or instructions to the caller.
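
If the assistant's replies are also shown as text, for example in a transcript or a log viewer, one way to keep tags away from readers is to strip them before display. This is a sketch under that assumption; `strip_speech_tags` is a hypothetical helper, and it removes any lowercase bracketed or angle-bracketed tag rather than checking against the reference tables.

```python
import re

# Matches [inline-tag], <wrapping-tag>, and </wrapping-tag> written in lowercase.
_SPEECH_TAG_RE = re.compile(r"\[[a-z-]+\]|</?[a-z-]+>")


def strip_speech_tags(text):
    """Remove speech tags and collapse the extra whitespace they leave behind."""
    without_tags = _SPEECH_TAG_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", without_tags).strip()
```

The spoken audio is unaffected, because this runs only on the copy of the text that is displayed or stored.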

REST API provider parameters

For direct TTS calls, set `provider` to `xai` and pass xAI-specific parameters in the `xai` object:

curl --request POST \
  --url https://api.telnyx.com/v2/text-to-speech \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '{
    "text": "Let me check that for you. [pause] I found your appointment.",
    "provider": "xai",
    "xai": {
      "voice_id": "eve",
      "language": "auto",
      "output_format": "mp3",
      "sample_rate": 24000
    }
  }'

Parameter	Type	Default	Description
`voice_id`	string	`eve`	xAI voice ID: `ara`, `eve`, `leo`, `rex`, or `sal`.
`language`	string	`auto`	Language code, or `auto` to detect the language.
`output_format`	string	`mp3`	Audio format: `mp3`, `wav`, `pcm`, `mulaw`, or `alaw`.
`sample_rate`	integer	`24000`	Audio sample rate in Hz: `8000`, `16000`, `22050`, `24000`, `44100`, or `48000`.
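
The defaults and allowed values in the table can be enforced before a request is sent. The following sketch builds the JSON body used in the curl example above; `build_xai_tts_payload` is a hypothetical helper, and the allowed-value sets are copied from the table rather than fetched from the API.

```python
# Allowed values copied from the parameter table in this guide.
VOICE_IDS = {"ara", "eve", "leo", "rex", "sal"}
OUTPUT_FORMATS = {"mp3", "wav", "pcm", "mulaw", "alaw"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 44100, 48000}


def build_xai_tts_payload(text, voice_id="eve", language="auto",
                          output_format="mp3", sample_rate=24000):
    """Return the request body for a direct TTS call with the xai provider."""
    if voice_id not in VOICE_IDS:
        raise ValueError(f"unsupported voice_id: {voice_id!r}")
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output_format: {output_format!r}")
    if sample_rate not in SAMPLE_RATES:
        raise ValueError(f"unsupported sample_rate: {sample_rate!r}")
    return {
        "text": text,
        "provider": "xai",
        "xai": {
            "voice_id": voice_id,
            "language": language,  # a language code, or "auto" to detect it
            "output_format": output_format,
            "sample_rate": sample_rate,
        },
    }
```

Calling `build_xai_tts_payload("Hello")` yields a body with every default from the table filled in, ready to be serialized with `json.dumps`.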

Language support

Grok voices detect the language automatically when `language` is set to `auto`. To force a specific language, pass its language code instead.

Next steps

Ultra Voices

Compare Grok with Ultra’s lower-latency expressive voices.

AI Assistants

Build voice AI assistants using Grok with Expressive Mode.

TTS REST API

Generate speech directly with REST TTS requests.

Available Voices

Browse available text-to-speech voices.
