Commit 42e3226

authored
feat: add voice formatting and transfer docs (#162)
1 parent 16b782d commit 42e3226

3 files changed

+392
-1
lines changed

Lines changed: 246 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,246 @@
1+
## What is Voice Input Formatted?
2+
3+
When interacting with voice assistants, you might notice terms like `Voice Input Formatted` in call logs or system outputs. This article explains what this means, how it works, and why it's important for delivering clear and natural voice interactions.
4+
5+
Voice Input Formatted is a function that takes raw text from a large language model (LLM) and cleans it up so that a text-to-speech (TTS) provider can read it more naturally. It’s **on by default** in your assistant’s voice provider settings, because it helps turn things like:
6+
7+
- `$42.50` → `forty two dollars and fifty cents`,
8+
- `ST` → `STREET`,
9+
- or phone numbers → spaced digits (“1 2 3 4 5 6 7 8 9 0”).
10+
11+
If you prefer the raw, unchanged text, you can **turn off** these transformations, which we’ll show you later.
12+
13+
### Log Example
14+
15+
![Screenshot 2025-01-21 at 10.23.19.png](https://img.notionusercontent.com/s3/prod-files-secure%2Ffdafdda2-774c-49e6-8896-a352ff4d44f3%2Ff603f2bd-36cf-4085-a3bc-f76c89a1ef75%2FScreenshot_2025-01-21_at_10.23.19.png/size/w=2000?exp=1737581744&sig=yoEEQF-BcTTgEVBNdcZh9MWHye2moRsbUcxGPjATNX8)
16+
17+
## 1. Step-by-Step Transformations
18+
19+
When `Voice Input Formatted` runs, it calls a series of helper functions in sequence. Each one handles a different kind of text pattern, and each receives the output of the previous one. The entire process happens in this order:
20+
21+
1. **removeAngleBracketContent**
22+
2. **removeMarkdownSymbols**
23+
3. **removePhrasesInAsterisks**
24+
4. **replaceNewLinesWithPeriods**
25+
5. **replaceColonsWithPeriods**
26+
6. **formatAcronyms**
27+
7. **formatDollarAmounts**
28+
8. **formatEmails**
29+
9. **formatDates**
30+
10. **formatTimes**
31+
11. **formatDistances, formatUnits, formatPercentages, formatPhoneNumbers**
32+
12. **formatNumbers**
33+
13. **Applying Replacements**
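
The ordered list above amounts to function composition: each helper takes a string and returns a string, and the output of one becomes the input of the next. The following is a minimal sketch of that idea, not the actual implementation; `runPipeline` and the two toy steps are hypothetical names introduced only for illustration.

```typescript
// A formatting step takes text and returns transformed text.
type FormatStep = (text: string) => string;

// Run the steps left to right, feeding each output into the next step.
function runPipeline(text: string, steps: FormatStep[]): string {
  return steps.reduce((current, step) => step(current), text);
}

// Two toy steps standing in for the real helpers (hypothetical implementations).
const toyReplaceColons: FormatStep = (t) => t.replace(/:/g, ".");
const toyLowercaseNasa: FormatStep = (t) => t.replace(/\bNASA\b/g, "nasa");

const pipelineResult = runPipeline("NASA price: 5", [toyReplaceColons, toyLowercaseNasa]);
console.log(pipelineResult); // "nasa price. 5"
```

Because every step sees the text as the earlier steps left it, the order matters: a step that runs late (such as the replacements) only ever matches already-transformed text.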
34+
35+
We’ll walk through each step using a **short example**.
36+
37+
### 1.1 Our Simpler Example Input
38+
39+
```
40+
Hello <tag> world
41+
**Wanted** to say *hi*
42+
We have NASA and .NET here,
43+
call me at 123-456-7890,
44+
price: $42.50
45+
and the date is 2023 05 10
46+
and time is 14:00
47+
Distance is 5km
48+
We might see 9999
49+
the address is 320 ST 21 RD
50+
my email is JOHN.DOE@example.COM
51+
52+
```
53+
54+
### 1.2 removeAngleBracketContent
55+
56+
- **What it does**: Removes `<anything>` unless it’s `<break>`, `<spell>`, or double angle brackets `<< >>`.
57+
- **Example effect**: `<tag>` gets removed.
58+
59+
**Result so far**:
60+
61+
```
62+
Hello world
63+
**Wanted** to say *hi*
64+
We have NASA and .NET here,
65+
call me at 123-456-7890,
66+
price: $42.50
67+
and the date is 2023 05 10
68+
and time is 14:00
69+
Distance is 5km
70+
We might see 9999
71+
the address is 320 ST 21 RD
72+
my email is JOHN.DOE@example.COM
73+
74+
```
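
The rule for this step can be sketched with a single regular expression. This is an illustrative guess at the behavior described above, not the real function: the name `sketchRemoveAngleBrackets`, the exact pattern, and the whitespace clean-up are all assumptions.

```typescript
// Hypothetical sketch: remove <...> unless it is <break>, <spell>, or <<...>>.
function sketchRemoveAngleBrackets(text: string): string {
  // (?!<)  : skip the first "<" of "<<"
  // (?!\/?(?:break|spell)\b) : keep <break ...>, <spell>, </spell>
  // (?!>)  : skip a tag that is closed by ">>"
  const tagPattern = /<(?!<)(?!\/?(?:break|spell)\b)[^<>]*>(?!>)/g;
  return text
    .replace(tagPattern, "")
    .replace(/[ \t]{2,}/g, " "); // assumption: collapse the gap a removed tag leaves
}

console.log(sketchRemoveAngleBrackets("Hello <tag> world")); // "Hello world"
```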
75+
76+
### 1.3 removeMarkdownSymbols
77+
78+
- **What it does**: Removes `_`, `` ` ``, and `~`. Some versions also remove double asterisks, but that might happen in a later step (next function).
79+
80+
In this example, there’s `**Wanted**`, which _might_ remain if we strictly only remove `_`, backticks, and tildes. If the code does remove `**` as well, it’ll vanish here or in the next step. Let’s assume it doesn’t remove them in this step.
81+
82+
**Result**: _No real change if the code only targets `_`, `` ` ``, and `~`._
83+
84+
```
85+
Hello world
86+
**Wanted** to say *hi*
87+
...
88+
89+
```
90+
91+
### 1.4 removePhrasesInAsterisks
92+
93+
- **What it does**: Looks for `*some text*` or `**some text**` and cuts it out.
94+
95+
In our text, we have `**Wanted**` and `*hi*`. Both get removed if the function is broad enough to remove single and double-asterisk blocks.
96+
97+
**Result**:
98+
99+
```
100+
Hello world
101+
to say
102+
We have NASA and .NET here,
103+
call me at 123-456-7890,
104+
price: $42.50
105+
and the date is 2023 05 10
106+
and time is 14:00
107+
Distance is 5km
108+
We might see 9999
109+
the address is 320 ST 21 RD
110+
my email is JOHN.DOE@example.COM
111+
112+
```
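
A rough sketch of this step, assuming it removes both single- and double-asterisk phrases as the walkthrough does. The function name, the pattern, and the trailing whitespace clean-up are assumptions made for illustration.

```typescript
// Hypothetical sketch: drop *phrase* and **phrase** blocks entirely.
function sketchRemoveAsteriskPhrases(text: string): string {
  return text
    .replace(/\*{1,2}[^*\n]+\*{1,2}/g, "") // one or two asterisks on each side
    .replace(/[ \t]{2,}/g, " ")            // assumption: collapse leftover gaps
    .trim();
}

console.log(sketchRemoveAsteriskPhrases("**Wanted** to say *hi*")); // "to say"
```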
113+
114+
### 1.5 replaceNewLinesWithPeriods
115+
116+
- **What it does**: Turns each line break into a period (with surrounding spacing) and merges repeated periods.
117+
118+
Let’s say the above text has line breaks. After this step, it’s more of a single line (or fewer lines), each newline replaced by a period.
119+
120+
**Result** (roughly):
121+
122+
```
123+
Hello world . to say . We have NASA and .NET here, call me at 123-456-7890, price: $42.50 and the date is 2023 05 10 and time is 14:00 Distance is 5km We might see 9999 the address is 320 ST 21 RD my email is JOHN.DOE@example.COM
124+
125+
```
126+
127+
### 1.6 replaceColonsWithPeriods
128+
129+
- **What it does**: `:` → `.`
130+
131+
Our text has `price: $42.50`. That becomes `price. $42.50`.
132+
133+
**Result**:
134+
135+
```
136+
Hello world . to say . We have NASA and .NET here, call me at 123-456-7890, price. $42.50 ...
137+
138+
```
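
Steps 1.5 and 1.6 can be sketched together. Both functions below are hypothetical. In particular, the colon rule only replaces a colon that is followed by whitespace or the end of the text; that is an assumption made here so that `14:00` is still intact when `formatTimes` runs later in the pipeline.

```typescript
// Hypothetical sketch of replaceNewLinesWithPeriods.
function sketchReplaceNewLines(text: string): string {
  return text
    .replace(/\s*\n+\s*/g, " . ")   // each run of line breaks becomes one period
    .replace(/(\.\s*){2,}/g, ". ")  // merge periods that end up next to each other
    .trim();
}

// Hypothetical sketch of replaceColonsWithPeriods.
// Assumption: colons inside tokens such as "14:00" are left alone.
function sketchReplaceColons(text: string): string {
  return text.replace(/:(?=\s|$)/g, ".");
}

const flattened = sketchReplaceColons(
  sketchReplaceNewLines("Hello world\nto say\nprice: $42.50 at 14:00")
);
console.log(flattened); // "Hello world . to say . price. $42.50 at 14:00"
```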
139+
140+
### 1.7 formatAcronyms
141+
142+
- **What it does**:
143+
- If something is in a known “to-lower” list (like `NASA`, `.NET`), it becomes lowercase (`nasa`, `.net`).
144+
- If it’s all-caps but not in that list, its letters might get spaced out so they are read one by one. If it contains vowels, it’s left alone.
145+
146+
In the example:
147+
148+
- `NASA` → `nasa`
149+
- `.NET` → `.net`
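
The two rules above can be sketched as follows. The contents of the “to-lower” list and the exact condition for spacing letters are not given in this article, so `TO_LOWER_SKETCH` and the vowel test are guesses for illustration only.

```typescript
// Hypothetical "to-lower" list; the real list is not documented here.
const TO_LOWER_SKETCH = new Set<string>(["NASA", ".NET"]);

function sketchFormatAcronyms(text: string): string {
  // Match all-caps tokens of two or more letters, with an optional leading dot.
  return text.replace(/\.?\b[A-Z]{2,}\b/g, (word: string) => {
    if (TO_LOWER_SKETCH.has(word)) return word.toLowerCase(); // known: lowercase it
    if (/[AEIOU]/.test(word)) return word;                    // has vowels: leave alone
    return word.split("").join(" ");                          // otherwise: space the letters
  });
}

console.log(sketchFormatAcronyms("We have NASA and .NET here")); // "We have nasa and .net here"
```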
150+
151+
### 1.8 formatDollarAmounts
152+
153+
- **What it does**: `$42.50` → “forty two dollars and fifty cents.”
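
A small sketch of how a dollar amount could be turned into words. It only covers amounts below $1,000 and is not the actual implementation; `dollarWords` and `sketchFormatDollars` are hypothetical names.

```typescript
const DOLLAR_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
  "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
  "sixteen", "seventeen", "eighteen", "nineteen"];
const DOLLAR_TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
  "seventy", "eighty", "ninety"];

// Spell out an integer from 0 to 999.
function dollarWords(n: number): string {
  if (n < 20) return DOLLAR_ONES[n];
  if (n < 100) {
    const rest = n % 10;
    return DOLLAR_TENS[Math.floor(n / 10)] + (rest ? " " + DOLLAR_ONES[rest] : "");
  }
  const rest = n % 100;
  return DOLLAR_ONES[Math.floor(n / 100)] + " hundred" + (rest ? " " + dollarWords(rest) : "");
}

// Hypothetical sketch: "$42.50" -> "forty two dollars and fifty cents".
// Amounts with more than three dollar digits are left untouched by this sketch.
function sketchFormatDollars(text: string): string {
  return text.replace(/\$(\d{1,3})(?:\.(\d{2}))?\b/g,
    (_match: string, d: string, c: string | undefined) => {
      const dollars = Number(d);
      const cents = c === undefined ? 0 : Number(c);
      let spoken = `${dollarWords(dollars)} ${dollars === 1 ? "dollar" : "dollars"}`;
      if (cents > 0) spoken += ` and ${dollarWords(cents)} ${cents === 1 ? "cent" : "cents"}`;
      return spoken;
    });
}

console.log(sketchFormatDollars("price. $42.50")); // "price. forty two dollars and fifty cents"
```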
154+
155+
### 1.9 formatEmails
156+
157+
- **What it does**: Replaces `@` with “ at ” and `.` with “ dot ” in emails.
158+
- `JOHN.DOE@example.COM` → `JOHN dot DOE at example dot COM`
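
A minimal sketch of this step, assuming a simple email pattern. The function name and the pattern are illustrative, not the real code.

```typescript
// Hypothetical sketch: spell out "@" and "." inside anything that looks like an email.
function sketchFormatEmails(text: string): string {
  const emailPattern = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g; // simplified, not RFC-complete
  return text.replace(emailPattern, (email: string) =>
    email.replace(/@/g, " at ").replace(/\./g, " dot "));
}

console.log(sketchFormatEmails("my email is JOHN.DOE@example.COM"));
// "my email is JOHN dot DOE at example dot COM"
```

Periods outside an email address are untouched, because the inner replacements run only on the matched address.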
159+
160+
### 1.10 formatDates
161+
162+
- **What it does**: `YYYY MM DD` → e.g. “Wednesday, May 10, 2023” (if valid).
163+
- `2023 05 10` becomes “Wednesday, May 10, 2023” (the day name depends on how the code calculates it).
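
A sketch of the date step, assuming the `YYYY MM DD` pattern shown above and UTC-based weekday calculation. The function name and the validity check are assumptions; the real code may compute the day name differently.

```typescript
const SKETCH_WEEKDAYS = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
const SKETCH_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
  "August", "September", "October", "November", "December"];

// Hypothetical sketch: "2023 05 10" -> "Wednesday, May 10, 2023" (only if the date is valid).
function sketchFormatDates(text: string): string {
  return text.replace(/\b(\d{4}) (\d{2}) (\d{2})\b/g,
    (match: string, y: string, m: string, d: string) => {
      const year = Number(y), month = Number(m), day = Number(d);
      const date = new Date(Date.UTC(year, month - 1, day));
      // Date silently rolls over invalid values (e.g. month 13), so verify the round trip.
      const valid = date.getUTCFullYear() === year &&
        date.getUTCMonth() === month - 1 && date.getUTCDate() === day;
      if (!valid) return match;
      return `${SKETCH_WEEKDAYS[date.getUTCDay()]}, ${SKETCH_MONTHS[month - 1]} ${day}, ${year}`;
    });
}

console.log(sketchFormatDates("and the date is 2023 05 10"));
// "and the date is Wednesday, May 10, 2023"
```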
164+
165+
### 1.11 formatTimes
166+
167+
- **What it does**: `14:00` → `14` (since the minutes are “00,” it removes them).
168+
- If it was `14:30`, it might become `14 30`.
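
The two cases above fit in a few lines. This is a sketch under the assumption that times always appear as `H:MM` or `HH:MM`; the function name is hypothetical.

```typescript
// Hypothetical sketch: drop ":00", otherwise replace the colon with a space.
function sketchFormatTimes(text: string): string {
  return text.replace(/\b(\d{1,2}):(\d{2})\b/g,
    (_match: string, hours: string, minutes: string) =>
      minutes === "00" ? hours : `${hours} ${minutes}`);
}

console.log(sketchFormatTimes("and time is 14:00")); // "and time is 14"
console.log(sketchFormatTimes("and time is 14:30")); // "and time is 14 30"
```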
169+
170+
### 1.12 formatDistances, formatUnits, formatPercentages, formatPhoneNumbers
171+
172+
- **Distances**: `5km` → “5 kilometers.”
173+
- **Units**: e.g. `43 lb` → “forty three pounds.”
174+
- **Percentages**: `50%` → “50 percent.”
175+
- **PhoneNumbers**: `123-456-7890` → `1 2 3 4 5 6 7 8 9 0`.
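
Three of these helpers are simple enough to sketch with one pattern each. The names and patterns below are assumptions and only cover the exact shapes used in this article (`NNN-NNN-NNNN`, `km`, and `%`).

```typescript
// Hypothetical sketch: "123-456-7890" -> "1 2 3 4 5 6 7 8 9 0".
function sketchFormatPhoneNumbers(text: string): string {
  return text.replace(/\b\d{3}-\d{3}-\d{4}\b/g, (phone: string) =>
    phone.replace(/-/g, "").split("").join(" "));
}

// Hypothetical sketch: "5km" -> "5 kilometers" (no singular handling).
function sketchFormatDistances(text: string): string {
  return text.replace(/\b(\d+)\s?km\b/g, "$1 kilometers");
}

// Hypothetical sketch: "50%" -> "50 percent".
function sketchFormatPercentages(text: string): string {
  return text.replace(/(\d+)%/g, "$1 percent");
}

console.log(sketchFormatPhoneNumbers("call me at 123-456-7890,")); // "call me at 1 2 3 4 5 6 7 8 9 0,"
console.log(sketchFormatDistances("Distance is 5km"));             // "Distance is 5 kilometers"
console.log(sketchFormatPercentages("50% off"));                   // "50 percent off"
```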
176+
177+
### 1.13 formatNumbers
178+
179+
- **What it does**:
180+
- Skips year-like numbers if they’re below the current year (2025).
181+
- For large numbers above a cutoff (e.g. 1000 or 5000), it reads them as digits.
182+
- Negative numbers: `-9` → “minus nine.”
183+
- Decimals: `2.5` → “two point five.”
184+
185+
In our case, `9999` might be big enough to become spelled out (nine thousand nine hundred ninety nine) or digits spaced out, depending on the cutoff.
186+
187+
`2023` used with `05 10` might get turned into a date, so it’s handled by the date logic, not the plain number logic.
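
The rules in this section can be sketched for a single numeric token. Several details are assumptions: what counts as “year-like” (here, any four-digit number up to the current year), what “reads as digits” means (here, the token is simply left unchanged), and the vocabulary limit of 999 in this sketch. All names are hypothetical.

```typescript
const NUM_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
  "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
  "sixteen", "seventeen", "eighteen", "nineteen"];
const NUM_TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty",
  "seventy", "eighty", "ninety"];

// Spell out an integer from 0 to 999.
function spellUnder1000(n: number): string {
  if (n < 20) return NUM_ONES[n];
  if (n < 100) {
    const rest = n % 10;
    return NUM_TENS[Math.floor(n / 10)] + (rest ? " " + NUM_ONES[rest] : "");
  }
  const rest = n % 100;
  return NUM_ONES[Math.floor(n / 100)] + " hundred" + (rest ? " " + spellUnder1000(rest) : "");
}

// Hypothetical sketch of formatNumbers for one token such as "-9", "2.5", or "12345".
function sketchFormatNumber(token: string, cutoff: number,
    currentYear: number = new Date().getFullYear()): string {
  const negative = token.startsWith("-");
  const [whole, fraction] = (negative ? token.slice(1) : token).split(".");
  const value = Number(whole);
  // Assumption: a bare four-digit number up to the current year is treated as a year.
  if (/^\d{4}$/.test(token) && value <= currentYear) return token;
  // Above the cutoff (or beyond this sketch's vocabulary): leave as digits.
  if (value > cutoff || value >= 1000) return token;
  let words = spellUnder1000(value);
  if (fraction !== undefined) {
    words += " point " + fraction.split("").map((digit) => NUM_ONES[Number(digit)]).join(" ");
  }
  return negative ? "minus " + words : words;
}

console.log(sketchFormatNumber("-9", 2025));    // "minus nine"
console.log(sketchFormatNumber("2.5", 2025));   // "two point five"
console.log(sketchFormatNumber("12345", 2025)); // "12345"
```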
188+
189+
### 1.14 Applying Replacements (street-suffix expansions)
190+
191+
- **Runs last**. If you have user-defined replacements like `\bST\b` → `STREET` and `\bRD\b` → `ROAD`, they are applied after all the other steps.
192+
- So `320 ST 21 RD` → `320 STREET 21 ROAD`.
193+
194+
**End Result**: A single line of text with all the helpful expansions and transformations done.
195+
196+
## 2. Formatting Plan: Customization Options
197+
198+
The **Formatting Plan** governs how Voice Input Formatted works. Here are the main settings you can customize:
199+
200+
### 2.1 Enabled
201+
202+
Determines whether the formatting is applied.
203+
204+
- **Default**: `true`
205+
- To disable: Set `voice.chunkPlan.formatPlan.enabled = false`.
206+
207+
### 2.2 Number-to-Digits Cutoff
208+
209+
This decides when numbers are read as digits instead of words.
210+
211+
- **Default**: `2025` (current year).
212+
- The code generally **doesn’t** convert numbers below the current year (like `2025`) into spelled-out words, so it stays as digits if it’s obviously a year.
213+
- If a number is bigger than the cutoff (`numberToDigitsCutoff`), it is read out as digits rather than as words.
214+
- Negative numbers become “minus,” decimals get “point,” etc.
215+
- Example: With a cutoff of `2025`, numbers like `12345` will remain digits.
216+
- To ensure larger numbers are spelled out, set the cutoff higher, like `300000`. For example:
217+
- `30003` → “thirty thousand and three” (with a cutoff of `300000`).
218+
219+
### 2.3 Replacements
220+
221+
Allows exact or regex-based substitutions in text.
222+
223+
- **Example 1**: Replace `hello` with `hi`: `{ type: 'exact', key: 'hello', value: 'hi' }`.
224+
- **Example 2**: Replace words matching a pattern: `{ type: 'regex', regex: '\\b[a-zA-Z]{5}\\b', value: 'hi' }`.
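
To make the two replacement shapes concrete, here is a sketch of how a list of them could be applied in order. The object fields (`type`, `key`, `regex`, `value`) come from the examples above; the `applyReplacementsSketch` function and the choice to replace every occurrence are assumptions.

```typescript
type ReplacementSketch =
  | { type: "exact"; key: string; value: string }
  | { type: "regex"; regex: string; value: string };

// Hypothetical sketch: apply each replacement in list order to the whole text.
function applyReplacementsSketch(text: string, replacements: ReplacementSketch[]): string {
  return replacements.reduce((current, r) => {
    if (r.type === "exact") {
      return current.split(r.key).join(r.value); // literal match, every occurrence
    }
    return current.replace(new RegExp(r.regex, "g"), r.value);
  }, text);
}

const streetResult = applyReplacementsSketch("320 ST 21 RD", [
  { type: "regex", regex: "\\bST\\b", value: "STREET" },
  { type: "regex", regex: "\\bRD\\b", value: "ROAD" },
]);
console.log(streetResult); // "320 STREET 21 ROAD"
```

Note that inside a string literal the word-boundary escape is written `\\b`, which yields the regex `\b`.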
225+
226+
### Note
227+
228+
Currently, only **replacements** and the **number-to-digits cutoff** are exposed for customization. Other options, such as toggling acronym replacement, are not configurable.
229+
230+
## 3. How to Turn It Off
231+
232+
By default, the entire pipeline is **on** because it helps TTS read better. To **turn it off**, set:
233+
234+
```
voice.chunkPlan.enabled = false;
// or
voice.chunkPlan.formatPlan.enabled = false;
```
239+
240+
If either of those flags is `false`, `Voice Input Formatted` is **skipped**.
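
The check described here can be sketched as a small predicate. The property paths `chunkPlan.enabled` and `chunkPlan.formatPlan.enabled` are taken from the snippet above; the interface and function names are hypothetical, and treating a missing flag as “on” follows from the stated default.

```typescript
// Hypothetical shape covering only the two flags discussed in this section.
interface VoiceSettingsSketch {
  chunkPlan?: {
    enabled?: boolean;
    formatPlan?: { enabled?: boolean };
  };
}

// Formatting runs unless either flag is explicitly false (both default to on).
function isFormattingEnabled(voice: VoiceSettingsSketch): boolean {
  return voice.chunkPlan?.enabled !== false &&
    voice.chunkPlan?.formatPlan?.enabled !== false;
}

console.log(isFormattingEnabled({}));                                              // true
console.log(isFormattingEnabled({ chunkPlan: { formatPlan: { enabled: false } } })); // false
```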
241+
242+
## 4. Conclusion
243+
244+
- `Voice Input Formatted` orchestrates a chain of mini-functions that together fix punctuation, expand abbreviations, and make text more readable out loud.
245+
- You can keep it **on** for better TTS results or **off** if you need the raw LLM output.
246+
- The final transformations, especially the user-supplied replacements (like street expansions), happen **last**, so keep that in mind if your replacements rely on expansions made by earlier steps.

fern/calls/call-dynamic-transfers.mdx

Lines changed: 141 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,141 @@
1+
## Introduction to Transfer Destinations
2+
3+
Transferring calls dynamically based on context is an essential feature for handling user interactions effectively. This guide walks you through creating a custom transfer tool, linking it to your assistant, and handling transfer requests with detailed examples. Whether the destination is a phone number, SIP, or another assistant, you'll learn how to configure it seamlessly.
4+
5+
## Step 1: Create a Custom Transfer Tool
6+
7+
To get started, create a transfer tool with an empty `destinations` array:
8+
9+
```bash
curl -X POST https://api.vapi.ai/tool \
  -H "Authorization: Bearer insert-private-key-here" \
  -H "Content-Type: application/json" \
  -d '{
    "type": "transferCall",
    "destinations": [],
    "function": {
      "name": "dynamicDestinationTransferCall"
    }
  }'
```
21+
22+
This tool acts as a placeholder, allowing dynamic destinations to be defined at runtime.
23+
24+
## Step 2: Link the Tool to Your Assistant
25+
26+
After creating the tool, link it to your assistant. This connection enables the assistant to trigger the tool during calls.
27+
28+
## Step 3: Configure the Server Event
29+
30+
Select the `transfer-destination-request` server event in your assistant settings. This event sends a webhook to your server whenever a transfer is requested, giving you the flexibility to dynamically determine the destination.
31+
32+
## Step 4: Set Up Your Server
33+
34+
Ensure your server is ready to handle incoming requests. Update the assistant's server URL to point to your server, which will process transfer requests and respond with the appropriate destination or error.
35+
36+
## Step 5: Trigger the Tool and Process Requests
37+
38+
Use the following prompt to trigger the transfer tool:
39+
40+
```
[TASK]
trigger the dynamicDestinationTransferCall tool
```
44+
45+
When triggered, the assistant sends a `transfer-destination-request` webhook to your server. This webhook contains all the necessary call details, such as transcripts and messages, allowing your server to process the request dynamically.
46+
47+
**Sample Request Payload:**
48+
49+
```json
{
  "type": "transfer-destination-request",
  "artifact": {
    "messages": [...],
    "transcript": "Hello, how can I help you?",
    "messagesOpenAIFormatted": [...]
  },
  "assistant": { "id": "assistant123" },
  "phoneNumber": "+14155552671",
  "customer": { "id": "customer456" },
  "call": { "id": "call789", "status": "ongoing" }
}
```
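
On the server side, it can help to check that an incoming body really is a transfer request before acting on it. The sketch below models only a few fields from the sample payload above; the interface and function names are hypothetical, and the real payload may carry more fields or be nested differently.

```typescript
// Partial, hypothetical model of the sample payload shown above.
interface TransferDestinationRequestSketch {
  type: "transfer-destination-request";
  artifact?: { transcript?: string };
  call?: { id?: string; status?: string };
}

// Type guard: true only when the body is an object with the expected "type" value.
function isTransferDestinationRequest(body: unknown): body is TransferDestinationRequestSketch {
  return typeof body === "object" && body !== null &&
    (body as { type?: unknown }).type === "transfer-destination-request";
}

const sampleBody: unknown = {
  type: "transfer-destination-request",
  artifact: { transcript: "Hello, how can I help you?" },
  call: { id: "call789", status: "ongoing" },
};

if (isTransferDestinationRequest(sampleBody)) {
  console.log(sampleBody.artifact?.transcript); // "Hello, how can I help you?"
}
```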
63+
64+
## Step 6: Respond to Transfer Requests
65+
66+
Your server should respond with either a valid `destination` or an `error` to indicate why the transfer cannot be completed.
67+
68+
### Transfer Destination Request Response Payload
69+
70+
#### Assistant Destination
71+
72+
```json
{
  "destination": {
    "type": "assistant",
    "message": "Connecting you to our support assistant.",
    "assistantName": "SupportAssistant",
    "transferMode": "rolling-history"
  }
}
```
82+
83+
Transfers the call to another assistant, specifying how the conversation history should be handled.
84+
85+
#### Number Destination
86+
87+
```json
{
  "destination": {
    "type": "number",
    "message": "Connecting you to our support line.",
    "number": "+14155552671",
    "numberE164CheckEnabled": true,
    "callerId": "+14155551234",
    "extension": "101"
  }
}
```
99+
100+
Transfers the call to a specific phone number, with options for caller ID and extensions.
101+
102+
#### SIP Destination
103+
104+
```json
{
  "destination": {
    "type": "sip",
    "message": "Connecting your call via SIP.",
    "sipUri": "sip:customer-support@domain.com",
    "sipHeaders": {
      "X-Custom-Header": "value"
    }
  }
}
```
116+
117+
Transfers the call to a SIP URI with optional custom headers.
118+
119+
### Error Response
120+
121+
If the transfer cannot be completed, respond with an error:
122+
123+
```json
{
  "error": "Invalid destination specified."
}
```
128+
129+
- **Field**: `error`
130+
- **Description**: Provides a clear reason why the transfer failed.
131+
132+
## Destination or Error in Response
133+
134+
Every response to a transfer-destination-request must include either a `destination` or an `error`. These indicate the outcome of the transfer request:
135+
136+
- **Destination**: Provides details for transferring the call.
137+
- **Error**: Explains why the transfer cannot be completed.
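
The either/or rule above maps naturally onto a union type. The following sketch shows a server-side decision function that always returns exactly one of the two shapes. The routing rule (keywords in the transcript) is invented purely for illustration, and the type and function names are hypothetical; only the field names and sample values come from the payloads in this guide.

```typescript
// Hypothetical, reduced models of the response payloads shown in this guide.
type TransferDestinationSketch =
  | { type: "number"; number: string; message?: string }
  | { type: "sip"; sipUri: string; message?: string }
  | { type: "assistant"; assistantName: string; message?: string };

type TransferResponseSketch =
  | { destination: TransferDestinationSketch }
  | { error: string };

// Illustrative routing rule only: pick a destination from keywords in the transcript.
function chooseDestination(transcript: string): TransferResponseSketch {
  const text = transcript.toLowerCase();
  if (text.includes("billing")) {
    return {
      destination: {
        type: "number",
        number: "+14155552671",
        message: "Connecting you to our support line.",
      },
    };
  }
  if (text.includes("support")) {
    return {
      destination: {
        type: "assistant",
        assistantName: "SupportAssistant",
        message: "Connecting you to our support assistant.",
      },
    };
  }
  // No rule matched: report why the transfer cannot be completed.
  return { error: "Invalid destination specified." };
}

console.log(JSON.stringify(chooseDestination("I have a billing question")));
```

Your webhook handler would serialize the returned object as the JSON response body.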
138+
139+
## Conclusion
140+
141+
Dynamic call transfers empower your assistant to route calls efficiently based on real-time data. By implementing this flow, you can ensure seamless interactions and provide a better experience for your users.
