Multi LLM Configuration

Developers can define configurations for multiple AI language models and specify strategies to optimize their usage. This documentation explains the properties and strategies available for configuring multiple LLMs in the AI Agent State.

Multi LLM Configuration Properties

The multi LLM configuration allows developers to specify multiple AI language models and define strategies for using them. Below are the key properties that can be defined, along with examples for each:

Strategy

The strategy property defines how multiple AI language models should be used. It includes the mode of operation and the status codes that trigger the strategy.

  • Name: strategy

  • Type: object

  • Description: The strategy to use when multiple AI language models are employed. Supported approaches include fallback mechanisms and load balancing, which help ensure robust and reliable AI performance.

  • Properties:

    • Mode: Defines how requests are handled. One of single, fallback, or loadbalance.

    • OnStatusCodes: An array of HTTP status codes that trigger the strategy.

  • Example:

    strategy:
      mode: "fallback"
      onStatusCodes: [429, 500]

Targets

The targets property specifies the list of AI language models to use. Each target includes details about the provider, API key, and optional configuration parameters.

  • Name: targets

  • Type: array

  • Description: The list of AI language models to use.

  • Properties:

    • Provider: The name of the provider offering the AI language model services.

    • API Key: The API key used to authenticate with the chosen provider's AI language model services.

    • Weight: (Optional) The weight of the provider, used for load balancing.

    • OverrideParams: (Optional) The parameters to override for the provider. See LLM Configuration OverrideParams.

    • Strategy: (Optional) A nested strategy to apply to this target, allowing configurations such as load balancing within a fallback target.

    • Targets: (Optional) The list of LLM provider configurations that the nested strategy applies to.

  • Example:

    targets:
      - provider: "openai"
        apiKey: "your-openai-api-key"
        weight: 1.0
        overrideParams:
          model: "gpt-4o"
          temperature: 0.7
          max_tokens: 1000
      - provider: "anthropic"
        apiKey: "your-anthropic-api-key"
        weight: 0.5
        overrideParams:
          model: "claude-2"
          temperature: 0.8
          max_tokens: 800
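The spec does not pin down exactly how weights are interpreted, but a common convention is to normalize them into selection probabilities across all targets. A minimal sketch of that assumption, using the weights from the example above:

```python
# Hypothetical sketch: interpreting relative weights as selection probabilities
# by normalizing them over all targets (an assumption, not spec-mandated).

targets = [
    {"provider": "openai", "weight": 1.0},
    {"provider": "anthropic", "weight": 0.5},
]

total = sum(t["weight"] for t in targets)
probabilities = {t["provider"]: t["weight"] / total for t in targets}

print(probabilities)  # openai: ~0.667, anthropic: ~0.333
```

Under this reading, a weight of 1.0 against a weight of 0.5 means roughly two thirds of requests go to the first target.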

Supported Strategies

Single Mode

In single mode, the AI agent uses a single AI language model for handling requests. This is the simplest strategy and does not involve any fallback or load balancing.

  • Example:

    strategy:
      mode: "single"
      onStatusCodes: [200]

Fallback Mode

In fallback mode, the AI agent attempts to use the primary AI language model. If it encounters specified status codes (e.g., errors or rate limits), it automatically switches to the next model in the list.

  • Example:

    strategy:
      mode: "fallback"
      onStatusCodes: [429, 500]
    targets:
      - provider: "openai"
        apiKey: "primary-api-key"
        overrideParams:
          model: "gpt-4o"
      - provider: "anthropic"
        apiKey: "secondary-api-key"
        overrideParams:
          model: "claude-2"
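The fallback behavior above can be sketched in Python. Here `call_llm` is a hypothetical stub standing in for a real provider call; only the control flow is the point:

```python
# Sketch of fallback semantics: try each target in order, and advance to the
# next one only when the response status is listed in onStatusCodes.

ON_STATUS_CODES = {429, 500}  # mirrors onStatusCodes in the example config

def call_llm(target, prompt):
    # Hypothetical stub: a real runtime would call the provider's API here.
    # For illustration, the primary provider is rate-limited (429).
    if target["provider"] == "openai":
        return {"status": 429, "text": ""}
    return {"status": 200, "text": f"answer from {target['provider']}"}

def fallback(targets, prompt):
    last = None
    for target in targets:
        last = call_llm(target, prompt)
        if last["status"] not in ON_STATUS_CODES:
            return last  # success: stop falling back
    return last  # every target failed; surface the final response

targets = [{"provider": "openai"}, {"provider": "anthropic"}]
result = fallback(targets, "hello")
print(result["text"])  # answer from anthropic
```

The primary target returns 429, which is in `onStatusCodes`, so the request falls through to the secondary target.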

Loadbalance Mode

In loadbalance mode, the AI agent distributes requests across multiple AI language models based on their defined weights. This ensures efficient utilization of resources and prevents any single model from becoming a bottleneck.

  • Example:

    strategy:
      mode: "loadbalance"
      onStatusCodes: [200]
    targets:
      - provider: "openai"
        apiKey: "api-key-1"
        weight: 0.7
        overrideParams:
          model: "gpt-4o"
      - provider: "groq"
        apiKey: "api-key-2"
        weight: 0.3
        overrideParams:
          model: "llama3-70b"
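One plausible reading of loadbalance mode is weighted random selection, where each target's share of traffic is proportional to its weight. A minimal sketch of that interpretation (the actual distribution algorithm is runtime-specific):

```python
import random

# Sketch of loadbalance semantics: pick a target with probability
# proportional to its weight. random.choices accepts relative weights.

targets = [
    {"provider": "openai", "weight": 0.7},
    {"provider": "groq", "weight": 0.3},
]

def pick_target(targets, rng=random):
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]

# Over many draws, roughly 70% of requests should land on openai.
counts = {"openai": 0, "groq": 0}
rng = random.Random(0)  # seeded for reproducibility
for _ in range(10_000):
    counts[pick_target(targets, rng)["provider"]] += 1
```

With weights 0.7 and 0.3, about 7,000 of the 10,000 simulated requests go to the first target.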

Usage Example

To use a multi LLM configuration in your AI Agent State, you can define the configuration in the state definition. Here is an example of how to configure an AI Agent State using multiple LLMs with a fallback strategy:

- name: MultiLLMAgentState
  type: aiagent
  agentName: MultiLLMAgent
  aiModel: custom
  systemMessage: "You are an assistant designed to provide accurate answers."
  userMessage: '${ "User: " + .request.question }'
  output: 
    {
      "type": "object",
      "properties": {
          "response": {
              "type": "string",
              "description": "The AI's response to the user question"
          }
      },
      "required": ["response"]
    }
  maxToolExecutions: 5
  memory: 
    memoryId: "session123"
    memoryType: "message_window"
    maxMessages: 10
  tools:
    - name: SEARCH_DOCUMENTS
      description: "Search for relevant documents based on the user's query."
      parameters: 
        {
          "type": "object",
          "properties": {
              "query": {
                  "type": "string",
                  "description": "The search query"
              }
          },
          "required": ["query"]
        }
      output: 
        {
            "type": "object",
            "properties": {
                "documents": {
                    "type": "array",
                    "items": {
                        "type": "string",
                        "format": "uri"
                    }
                }
            },
            "required": ["documents"]
        }
  llmConfig: 
    strategy:
      mode: "fallback"
      onStatusCodes: [429, 500]
    targets:
      - provider: "openai"
        apiKey: "primary-api-key"
        overrideParams:
          model: "gpt-4o"
      - provider: "anthropic"
        apiKey: "secondary-api-key"
        overrideParams:
          model: "claude-2"
  agentOutcomes:
    - condition: '${ $agentOutcome.returnValues.response != null }'
      transition: SuccessState
    - condition: '${ $agentOutcome.returnValues.response == null }'
      transition: ErrorState

For more detailed information and advanced configurations, refer to the AI Agent State spec.
