# AI Outbound Agent State

The `AIOutboundAgentState` extends the regular AI Agent state to automate outbound interactions (e.g., phone calls, chat messages, or messaging-app conversations) directly from a workflow. In addition to the usual LLM configuration, tools, and outcomes, the state lets you specify:

- Outbound channel details (phone, Zalo, WhatsApp, Telegram, …) via `outboundConfig`
- Realtime voice features (STT/TTS/VAD) via `voiceConfig`
## AIOutboundAgentState

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| agentName | The name of the agent. | string | yes |
| aiModel | The name of the AI language model. Default is 'gpt-4o'. | string | no |
| llmConfig | The configuration for the language model. See LLMConfig. | object | no |
| systemMessage | The system message used for constructing the LLM prompt. Defaults to "You are a helpful AI Assistant." | string | yes |
| userMessage | The user message. | string | yes |
| maxToolExecutions | The maximum number of tool executions. Default is 10. | integer | no |
| memory | The memory of the agent. If not specified, the workflow process instance scope is used. See ChatMemory. | object | no |
| output | JSON schema for agent data output. See AgentDataOutput. | object | yes |
| tools | Defines the list of tools. Each tool is described by the ToolForAI schema. | array | no |
| onAgentOutcomes | The list of agent outcomes. Each outcome is described by the OnAgentOutcome schema. | array | yes |
| stateDataFilter | Filter to apply to the state data. | string | no |
| outboundConfig | Channel-specific outbound settings. See OutboundConfig. | object | yes |
| voiceConfig | Voice features (STT, TTS, VAD) for realtime calls. See VoiceConfig. | object | no |
See also: LLMConfig, ChatMemory, AgentDataOutput, OnAgentOutcome, ToolForAI.
## OutboundConfig

The `OutboundConfig` defines the channel-specific outbound settings.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| greeting | The static greeting message used by the agent when it is first initialized. | string | no |
| greetingInstructions | Instructions for the LLM to use when generating the greeting message. Takes precedence over the static greeting message. | string | no |
| outboundTarget | The outbound target configuration, defining the target for the outbound agent. See OutboundTarget. | object | yes |
## OutboundTarget

The `OutboundTarget` defines the target for the outbound agent.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| targetType | The type of the outbound target: voice, zalo, whatsapp, etc. Default is voice. | string | yes |
| targetAddress | The address of the outbound target: an email address, phone number, Zalo ID, etc. The format depends on the target type. | string | yes |
| targetName | The name of the outbound target, used for identification purposes. | string | yes |
## VoiceConfig

`VoiceConfig` is the single block that tells the workflow how to listen, think, and speak during a telephone or voice-chat session. Because speech is a round-trip of audio → text → LLM → text → audio, `VoiceConfig` is split into four conceptual sub-modules, each matching one step in that loop:

- VAD – "Is anyone talking right now?"
- STT – "What did they just say?"
- LLM – "How should the agent respond?"
- TTS – "Say it out loud, in a human voice."

The pipeline looks like this:

audio in → VAD → STT → LLM → TTS → audio out

You can mix and match providers for every step; each has its own latency, cost, language coverage, and feature set.
### Why each component matters

#### Voice-Activity Detection (VAD)

**Purpose:** Detects the precise start and end of human speech in the inbound audio stream.

**Why it's critical:** If VAD fires too late you waste the caller's first syllables; if it fires too early you feed silence or background noise into STT and spend tokens on "uh …". Good VAD also enables barge-in (interrupting TTS mid-sentence) and double-talk detection.

Typical knobs inside the `vad` block (see the sketch after this list):

- provider name (`silero` is the default implementation)
- energy / probability thresholds
- timeouts for "no-speech" and "end-of-speech"
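A hedged sketch of such a `vad` block follows. Only `provider` is part of the documented VAD schema below; the threshold and timeout fields are hypothetical illustrations of the knobs just listed, and the actual names depend on the VAD implementation:

```yaml
vad:
  provider: silero          # documented field; 'silero' is the default
  # The fields below are hypothetical illustrations of typical knobs;
  # check your VAD implementation for the exact option names.
  activationThreshold: 0.5  # speech-probability threshold
  minSilenceDuration: 0.8   # seconds of silence treated as end-of-speech
```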
#### Speech-to-Text (STT)

**Purpose:** Transforms raw audio chunks into partial and final transcripts.

**Why it's critical:** Whatever the LLM "hears" comes from STT; recognition accuracy drives the entire conversational quality, and latency drives perceived responsiveness.

Key configuration areas:

- Provider & model – e.g. OpenAI Whisper large-v3, Deepgram Nova-2, Google STT tel-alpha
- Language/locale – supply a BCP-47 code like `vi-VN` or `en-US` so the model loads the right phoneme set
- Streaming vs batch – most providers stream; some cheaper models require a full clip upload
- Vocabulary bias / hints – business terms, proper names, SKU codes
- Post-processing – capitalization, profanity masking, punctuation injection
- Security – API key, private endpoint, or on-prem GPU deployment
#### Large-Language Model (LLM)

**Purpose:** Understands user intent, decides on tool calls, chooses the next action, and produces a textual reply (or a JSON payload if your state's `output` schema demands structured data).

Inside a voice agent the LLM sits in the tightest latency loop after STT, so choosing how the LLM delivers its tokens changes the entire user experience.

**Realtime LLM**

A realtime model ingests raw audio, reasons over it, and streams synthetic speech back without any external STT or TTS step.

What changes in the pipeline:

- No separate STT/TTS blocks. The model hears tone, hesitations, laughter: cues that are normally lost in a transcript.
- Built-in turn detection. Most providers decide when you've finished speaking; we recommend relying on that internal detector. If you want to fall back to the default turn detector, you must still bolt on an STT plugin so the detector can read interim transcripts.
**Non-realtime LLM (Classic)**

The classic voice-AI stack separates concerns: dedicated STT, LLM, and TTS components, each chosen and tuned independently.

Why it's still popular:

- Deterministic text flow. Every turn yields clean, timestamped transcripts, which is great for analytics, compliance, and post-call RAG pipelines.
- Fine-grained control. You choose best-of-breed STT, specialised LLM tooling, and premium or budget TTS per use-case.
- Script fidelity. A TTS engine will read a legal disclaimer exactly as written.

Trade-offs:

- Extra integration work and roughly 1-2 s of additional latency.
- Voices may sound less expressive unless you invest in neural styles.
#### Text-to-Speech (TTS)

**Purpose:** Transforms the LLM's textual reply into audio the caller hears.

**Why it's critical:** Humans judge "bot-ness" mainly by voice quality and timing. A 220 ms chunk-synthesis delay feels natural; 800 ms feels robotic.

TTS options worth documenting:

- Voice/character – Rachel, en-US-Wavenet-D, Alloy-en-v2
- Style & prosody controls – speaking rate, pitch, emotion, stability, pronunciation lexicons
- Streaming support – mandatory for realtime pipelines; optional for batch
- Silence trimming & filler – some providers auto-trim leading breaths; some insert breathing/fillers you may want to disable
- Bandwidth – telephony lines are 8 kHz mono; web or app can handle 22 kHz stereo
## VoiceConfig Properties

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| stt | The speech-to-text configuration. See STT. | object | no |
| tts | The text-to-speech configuration. See TTS. | object | no |
| vad | The voice activity detection configuration. See VAD. | object | no |
| allowInterruptions | Whether to allow interruptions (barge-in) during the voice interaction. Default is false. | boolean | no |
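Putting the pieces together, a sketch of a `voiceConfig` block using the fields above; the provider, language, and voice values are illustrative, and the sub-blocks are detailed in the following sections:

```yaml
voiceConfig:
  allowInterruptions: true    # enable barge-in; default is false
  stt:
    provider: deepgram        # deepgram | openai | google | elevenlabs | fal | groq
    language: vi-VN
  tts:
    provider: google          # openai | deepgram | google | elevenlabs | groq
    voice: en-US-Wavenet-D
  vad:
    provider: silero          # the default provider
```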
## STT

The `STT` object defines the configuration for the Speech To Text (STT) engine used by the AI Agent.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| provider | The name of the STT provider. Allowed values: 'deepgram', 'openai', 'google', 'elevenlabs', 'fal', 'groq'. This determines which STT backend will be used. | string | yes |
| model | The model to use for speech recognition. Provider-specific. | string | no |
| language | The language code for recognition. Provider-specific, e.g. 'en-US', 'vi-VN'. | string | no |
| apiKey | The API key or credentials for the STT service. Required by most providers to authenticate requests. | string | no |
| baseUrl | The base URL for the STT service, used for custom endpoints or self-hosted deployments. Optional for most cloud providers. | string | no |
| providerOptions | Provider-specific configuration options for STT. Use this to supply additional settings required by your provider. | object | no |

Supported options by STT provider:

- deepgram:
- openai:
- google:
- elevenlabs: None
- fal: None
- groq: None
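For instance, a hedged `stt` sketch; the model name and API-key reference are illustrative, and `providerOptions` is left empty because the supported keys are provider-specific (see the list above):

```yaml
stt:
  provider: deepgram
  model: nova-2                    # provider-specific model name (illustrative)
  language: vi-VN                  # BCP-47 language code
  apiKey: "${DEEPGRAM_API_KEY}"    # illustrative secret reference
  providerOptions: {}              # Deepgram-specific settings go here
```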
## TTS

The `TTS` object defines the configuration for Text To Speech (TTS) used by the AI Agent.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| provider | The name of the TTS provider. Allowed values: 'openai', 'deepgram', 'google', 'elevenlabs', 'groq'. This determines which TTS backend will be used. | string | no |
| model | The model to use for TTS. Provider-specific. | string | no |
| voice | The voice to use for TTS. Provider-specific; may refer to a named voice (e.g. 'en-US-Wavenet-D' for Google, 'Rachel' for ElevenLabs). | string | no |
| language | The language code for TTS. Provider-specific; may affect pronunciation and available voices. | string | no |
| apiKey | The API key or credentials for the TTS service. Required by most providers to authenticate requests. | string | no |
| baseUrl | The base URL for the TTS service, used for custom endpoints or self-hosted deployments. Optional for most cloud providers. | string | no |
| providerOptions | Provider-specific configuration options for TTS. Use this to supply additional settings required by your provider. | object | no |

Supported options by TTS provider:

- openai:
- deepgram:
- google:
- elevenlabs:
- groq: None
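Similarly, a hedged `tts` sketch; the voice name and API-key reference are illustrative:

```yaml
tts:
  provider: elevenlabs
  voice: Rachel                      # provider-specific named voice
  language: en-US
  apiKey: "${ELEVENLABS_API_KEY}"    # illustrative secret reference
  providerOptions: {}                # ElevenLabs-specific settings go here
```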
## VAD

The `VAD` object defines the configuration for the Voice Activity Detection (VAD) used by the AI Agent.

| Parameter | Description | Type | Required |
| --- | --- | --- | --- |
| provider | The name of the provider. Default is 'silero'. | string | no |
Example:
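A minimal end-to-end sketch of an `AIOutboundAgentState`, assuming YAML workflow syntax; the state wrapper (`name`/`type`), agent name, target details, and provider choices are illustrative, and the required `output` and `onAgentOutcomes` blocks are omitted for brevity:

```yaml
- name: callCustomer
  type: AIOutboundAgentState         # assumed state wrapper; follow your workflow DSL
  agentName: order-reminder-agent
  aiModel: gpt-4o                    # the default model
  systemMessage: "You are a helpful AI Assistant."
  userMessage: "Remind the customer about their pending order."
  maxToolExecutions: 10              # the default limit
  outboundConfig:
    greetingInstructions: "Greet the customer by name in a friendly tone."
    outboundTarget:
      targetType: voice
      targetAddress: "+84901234567"  # illustrative phone number
      targetName: "Nguyen Van A"
  voiceConfig:
    allowInterruptions: true
    stt:
      provider: deepgram
      language: vi-VN
    tts:
      provider: google
      voice: en-US-Wavenet-D
    vad:
      provider: silero
```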
This document provides a detailed view of the `AIOutboundAgentState` state and its related objects, including comprehensive schema definitions, required fields, and descriptions for each attribute within the `AIOutboundAgentState` and associated schemas. This specification ensures clarity and completeness for integrating outbound AI agents within serverless workflows.