SMS messages are encoded into segments of 140 bytes each. You are billed per segment, so understanding encoding is key to controlling costs.
The encoding determines how many characters fit in each segment:
| Encoding | Bits per char | Single segment | Multi-part segment |
| --- | --- | --- | --- |
| GSM 7-bit | 7 | 160 chars | 153 chars |
| ASCII 7-bit | 7 | 160 chars | 153 chars |
| ASCII 8-bit | 8 | 140 chars | 134 chars |
| UTF-16 | 16 | 70 chars | 67 chars |
A single non-GSM-7 character (like an emoji or curly quote) switches the entire message to UTF-16, cutting capacity from 160 to 70 characters per segment. This can more than double your costs.
Every SMS message is transmitted in units of 140 bytes. When a message exceeds one segment, a 6-byte header (User Data Header, or UDH) is added to each segment for reassembly, reducing the usable space.
Single segment: 140 bytes available → 160 GSM-7 chars or 70 UTF-16 chars
Multi-part: 134 bytes per segment → 153 GSM-7 chars or 67 UTF-16 chars
Maximum: 10 segments per message
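The capacities above follow directly from the 140-byte segment size and the 6-byte UDH. The sketch below reproduces that arithmetic. The function names `chars_per_segment` and `segments_needed` are illustrative and not part of any Telnyx SDK, and the code assumes every character has already been counted in units of the chosen encoding.

```python
SEGMENT_BYTES = 140  # payload size of one SMS segment
UDH_BYTES = 6        # header carried by each part of a multi-part message


def chars_per_segment(bits_per_char: int, multipart: bool) -> int:
    """Characters that fit in one segment for a fixed-width encoding."""
    usable_bytes = SEGMENT_BYTES - (UDH_BYTES if multipart else 0)
    return (usable_bytes * 8) // bits_per_char


def segments_needed(char_count: int, bits_per_char: int) -> int:
    """Segments required for a message of char_count encoding units."""
    if char_count <= chars_per_segment(bits_per_char, multipart=False):
        return 1
    per_part = chars_per_segment(bits_per_char, multipart=True)
    return -(-char_count // per_part)  # ceiling division


for bits in (7, 8, 16):
    print(bits, chars_per_segment(bits, False), chars_per_segment(bits, True))
```

Note that 134 bytes hold 153 full GSM-7 characters with one bit left over, which is why the multi-part figure is 153 rather than 154.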
Enable smart encoding to automatically replace common Unicode characters (like curly quotes and em dashes) with GSM-7 equivalents, reducing segment counts.
Telnyx uses a GSM 7-bit encoding optimized for maximum carrier compatibility. Only characters in this set will keep your message in the efficient GSM-7 encoding.
Standard characters (1 character each)
Letters:
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
Digits:
0 1 2 3 4 5 6 7 8 9
Symbols and punctuation:
! " # $ % & ' ( ) * + , - . / : ; < = > ? @
Special characters:
| Character | Description |
| --- | --- |
| space | Space |
| \n | Line feed |
| \r | Carriage return |
| _ | Underscore |
| £ | Pound sign |
| ¥ | Yen sign |
| è | e grave |
| é | e acute |
| ù | u grave |
| ì | i grave |
| ò | o grave |
| Ø | O with stroke |
| ø | o with stroke |
| Å | A with ring |
| å | a with ring |
| Æ | AE ligature |
| æ | ae ligature |
| ß | Sharp s |
| É | E acute |
| ¡ | Inverted exclamation |
| Ä | A umlaut |
| Ö | O umlaut |
| Ñ | N tilde |
| Ü | U umlaut |
| § | Section sign |
| ¿ | Inverted question |
| ä | a umlaut |
| ö | o umlaut |
| ñ | n tilde |
| ü | u umlaut |
| à | a grave |
Extended characters (2 characters each)
These characters require an escape sequence and count as 2 characters in segment calculations:
| Character | Description | Character count |
| --- | --- | --- |
| ~ | Tilde | 2 |
| ^ | Circumflex | 2 |
| \| | Pipe / vertical bar | 2 |
| \ | Backslash | 2 |
| { | Left curly bracket | 2 |
| } | Right curly bracket | 2 |
| [ | Left square bracket | 2 |
| ] | Right square bracket | 2 |
| € | Euro sign | 2 |
Extended characters are easy to overlook when estimating segment counts. A message with 155 standard characters and 3 pipe characters (|) uses 155 + (3 × 2) = 161 character slots, requiring 2 segments instead of 1.
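As a rough illustration of that counting rule, the sketch below tallies GSM-7 character slots using only the characters listed in this guide. The names `GSM7_BASIC`, `GSM7_EXTENDED`, and `gsm7_length` are hypothetical helpers, not an official API, and the character set Telnyx applies in production may differ from this transcription.

```python
# Character sets transcribed from the tables in this guide (an assumption:
# the set used in production may differ).
GSM7_BASIC = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "!\"#$%&'()*+,-./:;<=>?@"
    " \n\r_£¥èéùìòØøÅåÆæßÉ¡ÄÖÑÜ§¿äöñüà"
)
GSM7_EXTENDED = set("~^|\\{}[]€")  # each needs an escape, so costs 2 slots


def gsm7_length(text: str):
    """Return the GSM-7 slot count, or None if any character needs UTF-16."""
    total = 0
    for ch in text:
        if ch in GSM7_BASIC:
            total += 1
        elif ch in GSM7_EXTENDED:
            total += 2
        else:
            return None  # one non-GSM-7 character switches the whole message
    return total


print(gsm7_length("a" * 155 + "|||"))  # 155 standard + 3 pipes
```

A return value above 160 means the message no longer fits in a single GSM-7 segment, and `None` means it would be sent as UTF-16.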
Alternatively, replace non-GSM-7 characters manually with GSM-7 equivalents before sending.
Emojis dramatically increase segment count
Symptom: Adding a single emoji doubles or triples the number of segments.
Cause: Emojis force UTF-16 encoding (70 chars/segment instead of 160). Additionally, most emojis use surrogate pairs and count as 2 UTF-16 characters.
Example:
"Thanks for your order!" → GSM-7, 1 segment (22 chars)
"Thanks for your order! 🎉" → UTF-16, 1 segment (25 chars)
"Thanks for your order! ... 🎉" → UTF-16, 2 segments (71+ chars)
Fix: If cost is a concern, avoid emojis in SMS. Use emojis freely in MMS/RCS where encoding isn’t a factor.
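To see why an emoji counts as two characters, the UTF-16 length of a message can be measured in 16-bit code units rather than in Python characters. This is a minimal sketch. `utf16_units` and `utf16_segments` are illustrative names, and the segment estimate assumes a plain 70/67 split that ignores the rule that a surrogate pair cannot be divided across segments.

```python
def utf16_units(text: str) -> int:
    """Number of 16-bit code units the text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf16_segments(text: str) -> int:
    """Estimated segment count for a message sent as UTF-16."""
    units = utf16_units(text)
    if units <= 70:
        return 1
    return -(-units // 67)  # ceiling division over 67-unit parts


message = "Thanks for your order! \U0001F389"  # ends with the party popper emoji
print(len(message), utf16_units(message), utf16_segments(message))
```

Here `len(message)` reports 24 because Python counts code points, while the UTF-16 count is 25 because the emoji occupies a surrogate pair.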
Extended GSM-7 characters cause unexpected segment splits
Symptom: A 155-character message that looks like it should fit in one segment actually requires two.
Cause: Characters like [, ], {, }, |, \, ^, ~, and € are in the GSM-7 extended set and count as 2 characters each.
Example:
"Price: $100 [USD]" → 17 visible chars but 19 GSM-7 chars ([ and ] each cost 2)
Fix: Account for extended characters when calculating message length. Use the segment calculator above or the SDK helpers in this guide.
Copy-pasted text from Word/Google Docs causes issues
Symptom: Text that looks like normal ASCII actually contains Unicode characters.
Cause: Word processors often replace straight quotes with curly quotes, hyphens with em dashes, and three periods with an ellipsis character. These are invisible differences that force UTF-16.
Fix:
Enable smart encoding — this handles the most common substitutions automatically
Sanitize text before sending by replacing known problem characters
Use the encoding parameter set to gsm7 to get a 400 error if non-GSM-7 characters are present (fail-fast approach)
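The manual sanitization option can be sketched as a simple character translation. The mapping below is an illustrative assumption covering the word-processor substitutions named above. It is not the substitution table that smart encoding uses, and `sanitize` is a hypothetical helper name.

```python
# Hypothetical replacement map for common word-processor characters.
REPLACEMENTS = {
    "\u2018": "'",    # left single curly quote
    "\u2019": "'",    # right single curly quote / apostrophe
    "\u201c": '"',    # left double curly quote
    "\u201d": '"',    # right double curly quote
    "\u2013": "-",    # en dash
    "\u2014": "-",    # em dash
    "\u2026": "...",  # ellipsis character
    "\u00a0": " ",    # non-breaking space
}
_TABLE = str.maketrans(REPLACEMENTS)


def sanitize(text: str) -> str:
    """Replace known problem characters with plain GSM-7 equivalents."""
    return text.translate(_TABLE)


print(sanitize("\u201cHi\u201d \u2014 it\u2019s ready\u2026"))
```

Anything not in the map passes through unchanged, so a message can still fall back to UTF-16 after sanitizing. Pairing this with a length check before sending catches those cases.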
Messages truncated or split incorrectly on recipient's phone
Symptom: The recipient sees a message split in unexpected places, or parts arrive out of order.
Cause: Multi-part messages are reassembled by the recipient’s device using the UDH (User Data Header). Some older devices or carriers may not support reassembly for messages over a certain number of segments.
Fix:
Keep messages under 3-4 segments for maximum compatibility
Telnyx supports up to 10 segments, but recipient device support varies
Consider using MMS for longer content
Non-Latin scripts (Chinese, Arabic, Cyrillic) use too many segments
Symptom: Messages in non-Latin scripts use significantly more segments than English messages of similar visible length.
Cause: Non-Latin characters have no GSM-7 equivalents, so the entire message uses UTF-16 encoding (70 characters per segment). Smart encoding cannot help here.
Fix:
This is expected behavior — plan for higher segment counts when messaging in non-Latin scripts
1. Enable smart encoding
Turn on smart encoding on your messaging profile to automatically handle Unicode-to-GSM-7 substitutions. This is the single biggest cost-saving measure.
2. Validate before sending
Use the encoding detection helpers above to check segment counts before sending, and have your application raise an alert when a message will be unexpectedly expensive.
3. Sanitize input text
If you accept user-generated content, sanitize it before sending. Strip or replace invisible Unicode characters, curly quotes, and other common problem characters.
4. Keep messages concise
Stay under 160 characters (GSM-7) or 70 characters (UTF-16) to avoid multi-part message overhead. In a multi-part message, the UDH takes 7 GSM-7 characters or 3 UTF-16 characters out of every segment.
5. Use the right channel
For messages that need emojis, rich formatting, or non-Latin scripts, consider MMS or RCS instead of SMS.