Introduction
This guide explains how to configure the different LLM providers supported by the Toolkit.
See also the default configuration, which can be overridden with environment variables.
If you followed the getting started guide, make sure you have the latest images by running docker compose pull.
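For example, assuming the Docker Compose setup from the getting started guide, you can refresh the images and restart the stack with:

docker compose pull
docker compose up -d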
Providers and models
The Toolkit supports different LLM providers for inference and embeddings.
The list of supported providers follows.
These can be configured in the API .env file
and then used in an application configuration.
A model can be selected following the pattern provider/model, for example openai/gpt-4o or gemini/gemini-1.5-pro.
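As a minimal sketch of the API .env, assuming each provider exposes its API key through an environment variable (the variable names below are illustrative; check the default configuration for the exact ones):

# Hypothetical variable names, adapt to your setup
OPENAI_API_KEY=sk-...
GEMINI_API_KEY=...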
The model selection happens at inference time by tagging a request with a special label.
Current labels are:
- chat: chat with the user
- tools: identify a tool (or function call) in a list from the context
- sentiment: provide sentiment analysis over text
- tasks: identify and contextualize structured tasks
- intent: identify the user intent from a list of options
- translation: translate text between languages
Configuring the application settings
In settings.yaml or app.yaml, under the settings section, add or update the following lines, adapting the models to your needs.
llm:
  chat: openai/gpt-4o
  tools: huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct
  sentiment: openai/gpt-4o-mini
  tasks: gemini/gemini-1.5-pro
  # The following are not specified. The Toolkit will use a model from the provider specified in .env as LLM_SERVICE
  # intent: ...
  # translation: ...
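For reference, when using app.yaml the llm block sits under the settings section; a minimal sketch could look like the following (the models are examples, adapt them to your needs):

settings:
  llm:
    chat: openai/gpt-4o
    sentiment: openai/gpt-4o-mini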
The pattern to follow is [provider]/[model]. The list of available models is visible in the kiosk UI, by opening the left menu under LLM settings.
In the configuration above, providers can be mixed to obtain the best experience or precision needed for specific activities.
Note: if a label is not specified, the Toolkit will use a model from the provider set as LLM_SERVICE. If LLM_SERVICE is not set, the default is openai.
Ensure the default LLM_SERVICE provider is configured, or calls to that service will fail!
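For example, the default provider can be set explicitly in the API .env (openai is used when nothing is set):

LLM_SERVICE=openai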
Updating the application
Re-import the app from the CLI:
sermas-cli app save /apps/myapp
Reload the page at http://localhost:8080 to start using the configured models.