# AI Outbound Agent State

The `AIOutboundAgentState` extends the regular AI Agent state to automate outbound interactions (e.g., phone calls, chat messages, or messaging-app conversations) directly from a workflow. In addition to the usual LLM configuration, tools, and outcomes, the state lets you specify:

- Outbound channel details (phone, Zalo, WhatsApp, Telegram, …) via `outboundConfig`
- Realtime voice features (STT/TTS/VAD) via `voiceConfig`
## AIOutboundAgentState

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| agentName | The name of the agent. | string | yes |
| aiModel | The name of the AI language model. Default value is 'gpt-4o'. | string | no |
| systemMessage | The system message used for constructing the LLM prompt. Defaults to "You are a helpful AI Assistant." | string | yes |
| userMessage | The user message. | string | yes |
| maxToolExecutions | The maximum number of tool executions. Default is 10. | integer | no |
| LLMConfig | The same as LLMConfig from the AI Agent state. | – | – |
| ChatMemory | The same as ChatMemory from the AI Agent state. | – | – |
| AgentDataOutput | The same as AgentDataOutput from the AI Agent state. | – | – |
| OnAgentOutcome | The same as OnAgentOutcome from the AI Agent state. | – | – |
| ToolForAI | The same as ToolForAI from the AI Agent state. | – | – |
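Condensed from the full workflow example at the end of this document, a minimal `aioutboundagent` state looks roughly like the following sketch. Names, phone numbers, and secret keys are placeholders, not working values:

```yaml
- type: aioutboundagent
  name: OutboundAgent
  systemMessage: You are a courteous outbound assistant for VietTrust Bank.
  maxToolExecutions: 10
  llmConfig:
    provider: openai
    apiKey: '${ $SECRETS.OPENAI_API_KEY }'
  voiceConfig:
    stt:
      provider: groq
      model: whisper-large-v3-turbo
      apiKey: '${ $SECRETS.GROQ_API_KEY }'
    tts:
      provider: google
      voice: vi-VN-Standard-A
  outboundConfig:
    outboundTarget:
      targetType: voice
      targetAddress: '+84901234567'   # placeholder phone number
      targetName: 'Nguyen Van A'      # placeholder name
```

The `outboundConfig` and `voiceConfig` blocks are what distinguish this state from a regular AI Agent state; the remaining fields behave exactly as they do there.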
## OutboundConfig

The `OutboundConfig` defines the channel-specific outbound settings.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| greeting | string | The static greeting message used by the agent when it is first initialized. | no |
| greetingInstructions | string | Instructions for the LLM to use when generating the greeting message. This configuration takes precedence over the static greeting message. | no |
| outboundTarget | object | The outbound target configuration, defining the target for the outbound agent. | yes |
## OutboundTarget

The `OutboundTarget` defines the target for the outbound agent.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| targetType | string | The type of the outbound target: voice, zalo, whatsapp, etc. Default is voice. | yes |
| targetAddress | string | The address of the outbound target. This can be an email address, phone number, Zalo ID, etc. The format depends on the target type. | yes |
| targetName | string | The name of the outbound target, used for identification purposes. | yes |
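Taken together, a complete `outboundConfig` block for a voice call looks like the following sketch (based on the full example later in this document; the target values here are placeholders):

```yaml
outboundConfig:
  greetingInstructions: >-
    Greet the customer and introduce yourself as a representative of
    VietTrust Bank. Ask if they have time to talk.
  outboundTarget:
    targetType: voice               # voice | zalo | whatsapp | ...
    targetAddress: '+84901234567'   # phone number when targetType is voice
    targetName: 'Nguyen Van A'
```

If both `greeting` and `greetingInstructions` are present, `greetingInstructions` wins and the greeting is generated by the LLM.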
## VoiceConfig

`VoiceConfig` is the single block that tells the workflow how to listen, think, and speak during a telephone or voice-chat session. Because speech is a round trip of audio → text → LLM → text → audio, `VoiceConfig` is split into four conceptual sub-modules, each matching one step in that loop:

- VAD – “Is anyone talking right now?”
- STT – “What did they just say?”
- LLM – “How should the agent respond?”
- TTS – “Say it out loud, in a human voice.”

The pipeline looks like this:

```
Caller audio → VAD → STT ─┐
                          ├─► LLM (reasoning/JSON tools)
LLM reply ◄───────────────┘
LLM reply → TTS → Agent audio
```

You can mix and match providers for every step; each has its own latency, cost, language coverage, and feature set.
### Why each component matters

#### Voice-Activity Detection (VAD)

Purpose: Detects the precise start and end of human speech in the inbound audio stream.

Why it’s critical: If VAD fires too late you lose the caller’s first syllables; if it fires too early you feed silence or background noise into STT and spend tokens on “uh …”. Good VAD also enables barge-in (interrupting TTS mid-sentence) and double-talk detection.

Typical knobs inside the `vad` block:

- provider name (`silero` is the default implementation)
- energy / probability thresholds
- timeouts for “no-speech” and “end-of-speech”
#### Speech-to-Text (STT)

Purpose: Transforms raw audio chunks into partial and final transcripts.

Why it’s critical: Whatever the LLM “hears” comes from STT; recognition accuracy drives the entire conversational quality, and latency drives perceived responsiveness.

Key configuration areas:

- Provider & model – e.g. OpenAI Whisper large-v3, Deepgram Nova-2, Google STT tel-alpha
- Language/locale – supply a BCP-47 code like `vi-VN` or `en-US` so the model loads the right phoneme set
- Streaming vs batch – most providers stream; some cheaper models require a full clip upload
- Vocabulary bias / hints – business terms, proper names, SKU codes
- Post-processing – capitalization, profanity masking, punctuation injection
- Security – API key, private endpoint, or on-prem GPU deployment
#### Large-Language Model (LLM)

Purpose: Understands user intent, decides on tool calls, chooses the next action, and produces a textual reply (or a JSON payload if your state’s `output` schema demands structured data).

Inside a voice agent the LLM sits in the tightest latency loop after STT, so choosing how the LLM delivers its tokens changes the entire user experience.

##### Realtime LLM

A realtime model ingests raw audio, reasons over it, and streams synthetic speech back without any external STT or TTS step.

What changes in the pipeline:

- No separate STT/TTS blocks. The model hears tone, hesitations, laughter: cues that are normally lost in a transcript.
- Built-in turn detection. Most providers decide when you’ve finished speaking; we recommend relying on that internal detector. If you want to fall back to the default turn detector, you must still bolt on an STT plugin so the detector can read interim transcripts.
- No hard-scripted speech. You can cue the model with instructions, but you cannot guarantee it will read a line verbatim. For legally approved disclaimers, attach a conventional TTS plugin and use `greetingInstructions` for that segment.
##### Non-realtime LLM (Classic)

The classic voice-AI stack separates concerns:

```
Caller audio → VAD → STT → text
                             ↓
              LLM ← current context & tools
                             ↓
 Reply text → TTS → audio to caller
```

Why it’s still popular:

- Deterministic text flow. Every turn yields clean, timestamped transcripts, which is great for analytics, compliance, and post-call RAG pipelines.
- Fine-grained control. You choose best-of-breed STT, specialised LLM tooling, and premium or budget TTS per use case.
- Script fidelity. A TTS engine will read a legal disclaimer exactly as written.

Trade-offs:

- Extra integration work and roughly 1–2 s of additional latency.
- Voices may sound less expressive unless you invest in neural styles.
#### Text-to-Speech (TTS)

Purpose: Transforms the LLM’s textual reply into audio the caller hears.

Why it’s critical: Humans judge “bot-ness” mainly by voice quality and timing. A 220 ms chunk-synthesis delay feels natural; 800 ms feels robotic.

TTS options worth documenting:

- Voice/character – Rachel, en-US-Wavenet-D, Alloy-en-v2
- Style & prosody controls – speaking rate, pitch, emotion, stability, pronunciation lexicons
- Streaming support – mandatory for realtime pipelines; optional for batch
- Silence trimming & filler – some providers auto-trim leading breaths; some insert breathing/fillers you may want to disable
- Bandwidth – telephony lines are 8 kHz mono; web or app can handle 22 kHz stereo
## VoiceConfig Properties

### STT

The `STT` object defines the configuration for the Speech-to-Text (STT) engine used by the AI Agent.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| provider | string | The name of the STT provider. Allowed values: 'deepgram', 'openai', 'google', 'elevenlabs', 'fal', 'groq', 'azure'. This determines which STT backend will be used. | yes |
| model | string | The model to use for speech recognition. This is provider-specific. | no |
| language | string | The language code for recognition. This is provider-specific. Example: 'en-US', 'vi-VN', etc. | no |
| apiKey | string | The API key or credentials for the STT service. Required by most providers to authenticate requests. | no |
| baseUrl | string | The base URL for the STT service, used for custom endpoints or self-hosted deployments. Optional for most cloud providers. | no |
| providerOptions | object | Provider-specific configuration options for STT. Use this to supply additional settings required by your provider. | no |
Supported options by STT provider:

deepgram:

- detect_language: Whether to enable automatic language detection. Defaults to false.
- interim_results: Whether to return interim (non-final) transcription results. Defaults to true.
- punctuate: Whether to add punctuation to the transcription. Defaults to true. The turn detector works better with punctuation.
- smart_format: Whether to apply smart formatting to numbers, dates, etc. Defaults to true.
- sample_rate: The sample rate of the audio in Hz. Defaults to 16000.
- endpointing_ms: Time in milliseconds of silence to consider end of speech. Set to 0 to disable. Defaults to 25.
- filler_words: Whether to include filler words (um, uh, etc.) in the transcription. Defaults to true.
- profanity_filter: Whether to filter profanity from the transcription. Defaults to false.
- numerals: Whether to include numerals in the transcription. Defaults to false.
- mip_opt_out: Whether to opt out of the model improvement program. Defaults to false.
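As an illustrative sketch only, Deepgram options ride along in `providerOptions`; the model name and secret reference below are placeholders chosen for the example, not recommendations:

```json
"stt": {
  "provider": "deepgram",
  "model": "nova-2",
  "language": "en-US",
  "apiKey": "${ $SECRETS.DEEPGRAM_API_KEY }",
  "providerOptions": {
    "punctuate": true,
    "endpointing_ms": 25,
    "filler_words": false
  }
}
```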
openai:

- detect_language: Whether to enable automatic language detection. Defaults to false.

google:

- languages: List of language codes to recognize; Google STT can accept multiple languages.
- detect_language: Whether to enable automatic language detection. Defaults to false.
- credentials_info: Google credentials info. This is a JSON string with the following fields: project_id, client_email, private_key_id, private_key. See https://cloud.google.com/docs/authentication/getting-started for more information.
- credentials_file: URL to a file containing Google credentials.
- credentials_file_auth_headers: HTTP headers to use when downloading the credentials file.
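A sketch of a Google STT block using a hosted credentials file; the URL is a placeholder you would replace with your own:

```json
"stt": {
  "provider": "google",
  "language": "vi-VN",
  "providerOptions": {
    "detect_language": false,
    "credentials_file": "https://example.com/google-credentials.json"
  }
}
```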
elevenlabs: None
fal: None
groq: None
Example:

```json
"stt": {
  "provider": "groq",
  "model": "whisper-large-v3-turbo",
  "language": "vi",
  "apiKey": "${ $SECRETS.GROQ_API_KEY }"
}
```
azure:

Example:

```json
"stt": {
  "provider": "azure",
  "providerOptions": {
    "speech_key": "${ $SECRETS.AZURE_SPEECH_KEY }",
    "speech_region": "${ $SECRETS.AZURE_SPEECH_REGION }"
  }
}
```
### TTS

The `TTS` object defines the configuration for the Text-to-Speech (TTS) engine used by the AI Agent.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| provider | string | The name of the TTS provider. Allowed values: 'openai', 'deepgram', 'google', 'elevenlabs', 'groq', 'azure'. This determines which TTS backend will be used. | no |
| model | string | The model to use for TTS. This is provider-specific. | no |
| voice | string | The voice to use for TTS. This is provider-specific and may refer to a named voice (e.g., 'en-US-Wavenet-D' for Google, 'Rachel' for ElevenLabs, etc.). | no |
| language | string | The language code for TTS. This is provider-specific and may affect pronunciation and available voices. | no |
| apiKey | string | The API key or credentials for the TTS service. Required by most providers to authenticate requests. | no |
| baseUrl | string | The base URL for the TTS service, used for custom endpoints or self-hosted deployments. Optional for most cloud providers. | no |
| providerOptions | object | Provider-specific configuration options for TTS. Use this to supply additional settings required by your provider. | no |
Supported options by TTS provider:

openai:

- speed: Speaking speed.
- instructions: Instructions to control tone, style, and other characteristics of the speech. Does not work with the tts-1 or tts-1-hd models.
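A sketch of an OpenAI TTS block; the model and voice names below are illustrative assumptions, so check your provider for the values available to you:

```json
"tts": {
  "provider": "openai",
  "model": "gpt-4o-mini-tts",
  "voice": "alloy",
  "apiKey": "${ $SECRETS.OPENAI_API_KEY }",
  "providerOptions": {
    "speed": 1.0,
    "instructions": "Speak in a calm, professional tone."
  }
}
```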
deepgram:

- encoding: Audio encoding, e.g. linear16.
- sample_rate: Sample rate, e.g. 24000.

google:

- gender: Voice gender. Valid values are male, female, and neutral.
- credentials_info: Google credentials info. This is a JSON string with the following fields: project_id, client_email, private_key_id, private_key. See https://cloud.google.com/docs/authentication/getting-started for more information.
- credentials_file: URL to a file containing Google credentials.
- credentials_file_auth_headers: HTTP headers to use when downloading the credentials file.
Example:

```json
"tts": {
  "provider": "google",
  "voice": "vi-VN-Standard-A",
  "apiKey": "https://xbot-uat.hcm03.vstorage.vngcloud.vn/xbot-out-livekit-uat.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20250516T102347Z&X-Amz-SignedHeaders=host&X-Amz-Expires=518699&X-Amz-Credential=e45eff37790ca1f6a4e0174698dc9991%2F20250516%2FHCM03%2Fs3%2Faws4_request&X-Amz-Signature=84b2d4a465379427eeac86c0210d954ec6a6e7ac2e869600377134311093ae89",
  "instructions": "Speak in a friendly and professional manner. Use Vietnamese language to communicate with the customer."
}
```
elevenlabs:

- voice_settings: Voice settings object: stability, similarity_boost, style, use_speaker_boost, speed. See https://elevenlabs.io/docs/api-reference/text-to-speech/convert#request.body.voice_settings
- streaming_latency: Optimize for streaming latency. Defaults to 0 (disabled); 4 applies maximum latency optimizations.
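A sketch of an ElevenLabs TTS block using the options above; the secret name and the numeric settings are placeholders, so tune them to your voice and latency needs:

```json
"tts": {
  "provider": "elevenlabs",
  "voice": "Rachel",
  "apiKey": "${ $SECRETS.ELEVENLABS_API_KEY }",
  "providerOptions": {
    "voice_settings": { "stability": 0.5, "similarity_boost": 0.75 },
    "streaming_latency": 4
  }
}
```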
groq: None
azure:

Example:

```json
"tts": {
  "provider": "azure",
  "providerOptions": {
    "speech_key": "${ $SECRETS.AZURE_SPEECH_KEY }",
    "speech_region": "${ $SECRETS.AZURE_SPEECH_REGION }"
  }
}
```
### VAD

The `VAD` object defines the configuration for the Voice Activity Detection (VAD) engine used by the AI Agent.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| provider | string | The name of the VAD provider. Default is 'silero'. | no |
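Since `silero` is the default, the `vad` block is usually optional; spelling it out explicitly looks like this:

```json
"vad": {
  "provider": "silero"
}
```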
Full workflow example:
code: BANKING_AGENT
name: Banking Agent
agents:
- type: aiagent
name: AgentSelector
start: true
transition:
end: false
targetId: RagAgent
answerAfterFinish: false
answerMessage: null
aiModel: gpt-4o-mini
llmConfig:
provider: openai
apiKey: '${ $SECRETS.OPENAI_API_KEY }'
overrideParams:
model: gpt-4o-mini
temperature: 0
topP: 0
'n': 0
logprobs: 0
echo: false
stop: []
maxTokens: 1024
presencePenalty: 0
frequencyPenalty: 0
logitBias: null
required: null
systemMessage: >-
You are the VietTrust Bank assistant, helping to address queries related
to information regarding the bank’s offerings, including branch locations,
account types, savings products, loan services, credit card options, and
other banking solutions. Select the most appropriate tool based on its
description, and return the name of the selected tool.
Bellow is tools list:
- RagAgent: This tool specializes in answering queries regarding the
bank’s offerings, including branch locations, account types, savings
products, loan services, credit card options, and other banking solutions.
Use this tool to access up-to-date data from the system's knowledge base.
- OutboundCollectorAgent: This tool is designed to assist with outbound
collection tasks, such as sending reminders or notifications to customers
regarding their account status or payment due dates. Use this tool for
tasks related to customer outreach and communication.
userMessage: '${.request.question}'
maxToolExecutions: 10
memory:
memoryId: '${ $conversationId + "-AgentSelector" }'
memoryType: message_window
maxMessages: 5
maxTokens: null
memoryOptimizer: null
output:
schema: |-
{
"type": "object",
"properties": {
"selectedAgent": {
"type": "string",
"description": "The exact name of agent to be selected."
}
},
"required": [
"selectedAgent"
]
}
tools: []
askUserForToolsInput: false
agentOutcomes:
- condition: >-
${ ( ($agentOutcome.returnValues.selectedAgent != null) and
(($agentOutcome.returnValues.selectedAgent |
contains("OutboundCollectorAgent")) == true) ) or (
($agentOutcome.returnValues.properties.selectedAgent != null) and
(($agentOutcome.returnValues.properties.selectedAgent |
contains("OutboundCollectorAgent")) == true) ) }
finish: true
transition:
end: false
targetId: OutboundCollectorAgent
answerAfterFinish: false
answerMessage: null
- condition: >-
${ ( ($agentOutcome.returnValues.selectedAgent != null) and
(($agentOutcome.returnValues.selectedAgent | contains("RagAgent")) ==
true) ) or ( ($agentOutcome.returnValues.properties.selectedAgent !=
null) and (($agentOutcome.returnValues.properties.selectedAgent |
contains("RagAgent")) == true) ) }
finish: true
transition:
end: false
targetId: RagAgent
answerAfterFinish: false
answerMessage: null
- condition: '${true}'
finish: true
transition:
end: false
targetId: RagAgent
answerAfterFinish: false
answerMessage: null
loginRequired: null
- type: kbagent
name: RagAgent
start: false
transition:
end: true
targetId: null
answerAfterFinish: true
answerMessage: '${ $RagAgent.outputMessage }'
systemMessage: >-
You are an AI assistant for **VietTrust Bank**, a modern bank that offers
a wide range of financial services. Your role is to assist customers by
providing accurate and helpful information regarding the bank’s offerings,
including branch locations, account types, savings products, loan
services, credit card options, and other banking solutions.
### Instructions:
1. **Branch Locations:**
- Provide customers with details about the nearest branch based on the list of locations in the context. Use the city, district, or address details provided by the customer to find the nearest branch.
2. **Account Types:**
- Explain the available types of bank accounts, including **savings accounts**, **current accounts**, and **foreign currency accounts**. Provide details such as minimum balance requirements, interest rates, and benefits for each type of account.
3. **Savings Products:**
- Offer information about **savings products**, including fixed-term deposits and special savings plans. Provide details such as interest rates, account terms, and any specific conditions (e.g., minimum deposit amounts).
4. **Loan Products:**
- Assist customers in understanding the bank’s **loan offerings**, including **personal loans**, **home loans**, **auto loans**, and **business loans**. Mention the interest rates, repayment periods, and any special conditions based on the loan type.
5. **Credit Cards:**
- Provide information about the bank’s **credit card options**, including **credit limits**, **annual fees**, and **reward programs**. Highlight specific benefits for different types of cards, such as Visa, MasterCard, or co-branded cards.
6. **Banking Services:**
- Explain additional banking services, such as **online banking**, **mobile banking**, **money transfers**, and **bill payment** services.
- Guide customers through the process of setting up or managing these services if the context provides instructions.
7. **Priority Banking:**
- For **priority customers**, provide details about exclusive services such as higher interest rates, lower service fees, and access to dedicated account managers. If the customer qualifies, inform them about the advantages of priority banking.
8. **Cross-References:**
- If a question requires information from multiple sections, provide cross-references to relevant sections (e.g., linking loan products with interest rates or providing details about applicable savings account terms).
9. **Missing Context:**
- If a customer asks for information that is not available in the provided context, respond politely with: **"Xin lỗi, tôi không có thông tin để trả lời câu hỏi này."** ("Sorry, I don’t have the information to answer this question.").
10. **Use Only Provided Context:**
- You must **only use the information provided in the context** to answer the questions. Do not use any external knowledge or assumptions. If the required information is not available in the context, respond politely with, **"Xin lỗi, tôi không có thông tin để trả lời câu hỏi này."** ("Sorry, I don’t have the information to answer this question.").
### Example Queries:
- "Where is the nearest VietTrust Bank branch in Hanoi?"
- "What is the interest rate for a 12-month savings account?"
- "Can I apply for a home loan online?"
- "What credit card options do you offer with no annual fee?"
- "How do I transfer money to another bank account?"
### Language and Tone:
- Provide all responses in **Vietnamese**, ensuring that they are clear,
polite, and strictly based on the context provided by VietTrust Bank. If
the required information is missing, respond politely and do not make any
assumptions.
userMessage: '${.request.question}'
llmConfig:
provider: groq
apiKey: '${ $SECRETS.GROQ_API_KEY }'
overrideParams:
model: meta-llama/llama-4-scout-17b-16e-instruct
temperature: 0
topP: 0
'n': 0
logprobs: 0
echo: false
stop: []
maxTokens: 512
presencePenalty: 0
frequencyPenalty: 0
logitBias: null
required: null
memory:
memoryId: '${ $conversationId }'
memoryType: message_window
maxMessages: 5
maxTokens: null
knowledgeBase:
queryStrategy: null
knowledgeBaseCodes:
- XFILE_BANKING
ragConfig:
history:
messageLimit: 5
retriever:
maxResults: 5
minScore: 0.6
kbLocalRerank: true
includeDocReference: true
ragMode: NAIVE
loginRequired: null
streaming: false
- type: aiagent
name: OutboundCollectorAgent
start: false
transition:
end: false
targetId: OutboundAgent
answerAfterFinish: false
aiModel: gpt-4o-mini
llmConfig:
provider: openai
apiKey: '${ $SECRETS.OPENAI_API_KEY }'
overrideParams:
model: gpt-4o-mini
temperature: 0
topP: 0
'n': 0
logprobs: 0
echo: false
stop: []
maxTokens: 1268
presencePenalty: 0
frequencyPenalty: 0
logitBias: null
required: null
systemMessage: >-
You are AI Assistant, dedicated to collect the information. Your task is
to collect the outbound call information based on the question from the
user. You will be provided with the outbound call information, and you
need to ask the user for any missing information.
userMessage: '${.request.question}'
maxToolExecutions: 40
memory:
memoryId: '${ $conversationId }'
memoryType: message_window
maxMessages: 19
maxTokens: null
output:
schema: >-
{"type":"object","properties":{"customerName":{"type":"string"},"phoneNumber":{"type":"string"},"message":{"type":"string"}},"required":["customerName","phoneNumber","message"]}
tools: []
askUserForToolsInput: true
agentOutcomes: []
loginRequired: null
- type: aioutboundagent
name: OutboundAgent
start: false
transition:
end: true
targetId: null
systemMessage: >-
You are a conversational AI agent for **VietTrust Bank**, a modern
financial institution offering a full suite of banking services. Your job
is to place outbound calls to customers in order to deliver polite,
professional reminders or notifications. For each call you will be given:
- Customer’s profile (name, account/loan details, etc.)
- The scripted message you need to convey
- The customer’s phone number
Always speak in Vietnamese, using a courteous and friendly tone
appropriate for a bank representative.
Before ending each call, always thank the customer for their time.
Your task is informing the customer the message from following instructions:
{{.request.question}}
== Message: ==
{{ $OutboundCollectorAgent.returnValues.message }}.
You can also use provided tools to answer the customer's queries (if any) based on known information and don't ask it again:
- Customer phone: ${ $OutboundCollectorAgent.returnValues.phoneNumber }.
maxToolExecutions: 10
llmConfig:
provider: google
apiKey: '${ $SECRETS.GEMINI_API_KEY }'
overrideParams:
model: gemini-2.5-flash-preview-04-17
temperature: 0
topP: 0
'n': 0
logprobs: 0
echo: false
stop: []
maxTokens: 512
presencePenalty: 0
frequencyPenalty: 0
logitBias: null
required: null
memory:
memoryId: '${ $conversationId }'
memoryType: message_window
maxMessages: 5
maxTokens: null
tools:
- name: getCustomerInfoByPhone
description: Get customer information based on phone number.
parameters:
schema: |-
{
"type": "object",
"properties": {
"phone": {"type": "string", "description": "Customer phone number."}
},
"required": ["phone"]
}
output:
schema: |-
{
"type": "object",
"properties": {
"customerCode": {"type": "string"},
"fullName": {"type": "string"},
"phoneNumber": {"type": "string"},
"address": {"type": "string"}
},
"required": ["customerCode","fullName","phoneNumber","address"]
}
execution:
actionMode: sequential
actions:
- id: null
name: getCustomerInfoByPhone
condition: '${true}'
functionRef:
code: getCustomerInfo_bankingApi
name: getCustomerInfo
description: null
asyncInvoke: false
type: rest
definition:
type: simple-rest
url: >-
https://xplatform-api-uat.a4b.vn/partner/xfai/api/mock/banking/getCustomerInfo
method: GET
headers:
xp-api-key: >-
0123456789
queryParams:
phone: '${ .phone }'
body: null
auth: null
inputs:
- name: phone
value: null
arguments:
phone: '${ $toolArguments.phone }'
metadata: null
- name: getAccountsByPhone
description: Get all accounts by customer phone number.
parameters:
schema: |-
{
"type": "object",
"properties": {
"phone": {"type": "string"}
},
"required": ["phone"]
}
output:
schema: |-
{
"type": "array",
"items": {
"type": "object",
"properties": {
"accountId": {"type": "string"},
"customerCode": {"type": "string"},
"balance": {"type": "number"}
},
"required": ["accountId", "customerCode", "balance"]
}
}
execution:
actionMode: sequential
actions:
- id: null
name: getAccountsByPhone
condition: '${true}'
functionRef:
code: getAccountsByPhone_bankingApi
name: getAccountsByPhone
description: null
asyncInvoke: false
type: rest
definition:
type: simple-rest
url: >-
https://xplatform-api-uat.a4b.vn/partner/xfai/api/mock/banking/getAccountsByPhone
method: GET
headers:
xp-api-key: >-
0123456789
queryParams:
phone: '${ .phone }'
body: null
auth: null
inputs:
- name: phone
value: null
arguments:
phone: '${ $toolArguments.phone }'
metadata: null
- name: getLoansByCustomer
description: Get all loans by customer code.
parameters:
schema: |-
{
"type": "object",
"properties": {
"customerCode": {"type": "string"}
},
"required": ["customerCode"]
}
output:
schema: |-
{
"type": "array",
"items": {
"type": "object",
"properties": {
"loanId": {"type": "string"},
"customerCode": {"type": "string"},
"principal": {"type": "number"},
"interestRate": {"type": "number"},
"termMonths": {"type": "integer"},
"outstanding": {"type": "number"},
"createdDate": {"type": "string"}
},
"required": ["loanId", "customerCode", "principal", "interestRate", "termMonths", "outstanding", "createdDate"]
}
}
execution:
actionMode: sequential
actions:
- id: null
name: getLoansByCustomer
condition: '${true}'
functionRef:
code: getLoansByCustomer_bankingApi
name: getLoansByCustomer
description: null
asyncInvoke: false
type: rest
definition:
type: simple-rest
url: >-
https://xplatform-api-uat.a4b.vn/partner/xfai/api/mock/banking/getLoansByCustomer
method: GET
headers:
xp-api-key: >-
0123456789
queryParams:
customerCode: '${ .customerCode }'
body: null
auth: null
inputs:
- name: customerCode
value: null
arguments:
customerCode: '${ $toolArguments.customerCode }'
metadata: null
- name: getLoanPayments
description: Get all payments for a loan.
parameters:
schema: |-
{
"type": "object",
"properties": {
"loanId": {"type": "string"}
},
"required": ["loanId"]
}
output:
schema: |-
{
"type": "array",
"items": {
"type": "object",
"properties": {
"paymentId": {"type": "string"},
"date": {"type": "string"},
"amount": {"type": "number"}
},
"required": ["paymentId", "date", "amount"]
}
}
execution:
actionMode: sequential
actions:
- id: null
name: getLoanPayments
condition: '${true}'
functionRef:
code: getLoanPayments_bankingApi
name: getLoanPayments
description: null
asyncInvoke: false
type: rest
definition:
type: simple-rest
url: >-
https://xplatform-api-uat.a4b.vn/partner/xfai/api/mock/banking/getLoanPayments
method: GET
headers:
xp-api-key: >-
0123456789
queryParams:
loanId: '${ .loanId }'
body: null
auth: null
inputs:
- name: loanId
value: null
arguments:
loanId: '${ $toolArguments.loanId }'
metadata: null
output:
schema: >-
{"type":"object","properties":{"answer":{"type":"string"}},"required":["answer"]}
streaming: false
voiceConfig:
allowInterruptions: false
tts:
provider: google
voice: vi-VN-Standard-A
apiKey: >-
https://xbot-uat.hcm03.vstorage.vngcloud.vn/xbot-out-livekit-uat.json?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20250516T102347Z&X-Amz-SignedHeaders=host&X-Amz-Expires=518699&X-Amz-Credential=e45eff37790ca1f6a4e0174698dc9991%2F20250516%2FHCM03%2Fs3%2Faws4_request&X-Amz-Signature=84b2d4a465379427eeac86c0210d954ec6a6e7ac2e869600377134311093ae89
instructions: >-
Speak in a friendly and professional manner. Use Vietnamese language
to communicate with the customer.
stt:
provider: groq
model: whisper-large-v3-turbo
language: vi
apiKey: '${ $SECRETS.GROQ_API_KEY }'
outboundConfig:
greetingInstructions: >-
Greet the customer then introduce yourself as a representative of
VietTrust Bank. Ask if they have time to talk. Use Vietnamese language
to communicate with the customer.
outboundTarget:
targetType: voice
targetAddress: '${ $OutboundCollectorAgent.returnValues.phoneNumber }'
targetName: '${ $OutboundCollectorAgent.returnValues.customerName }'
appCode: XCHATBOT
tenantCode: DEMO
This document provides a detailed view of the `AIOutboundAgentState` state and its related objects, including comprehensive schema definitions, required fields, and descriptions for each attribute within the `AIOutboundAgentState` and associated schemas. This specification ensures clarity and completeness for integrating outbound AI agents within serverless workflows.