Hugging Face
Hugging Face offers a Pro subscription for LLM inference, as well as custom hardware options for enterprise use.
See the Hugging Face docs for how to create an access token.
Configure the Toolkit API
Locate the file ./config/api/.env and add the following configuration:
LLM_SERVICE=huggingface
# HUGGINGFACE_API_KEY Provide a Hugging Face token.
# Note: HF_TOKEN will be used as fallback, if set
HUGGINGFACE_API_KEY='HF token'
# HUGGINGFACE_CHAT_MODELS Supported chat models from Hugging Face. Leave empty to allow any.
# HUGGINGFACE_CHAT_MODELS=''
# HUGGINGFACE_BASEURL Base URL for Hugging Face inference service
# Set this for a custom inference endpoint, or to point to a completion-compatible backend such as vLLM
# HUGGINGFACE_BASEURL=''
Note that Hugging Face model names contain an additional / separator: a format like huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct is correct.
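The extra / comes from the provider prefix sitting in front of a Hugging Face model id that itself contains a slash (organization/model). A small sketch of splitting such a name, assuming split_provider_model is a hypothetical helper for illustration only:

```python
def split_provider_model(name: str) -> tuple[str, str]:
    """Split a provider-prefixed model name into (provider, model_id).

    Only the first '/' separates the provider; the remainder is the
    Hugging Face model id, which keeps its own org/model slash.
    """
    provider, model_id = name.split("/", 1)
    return provider, model_id

provider, model_id = split_provider_model(
    "huggingface/meta-llama/Meta-Llama-3.1-8B-Instruct"
)
print(provider)   # huggingface
print(model_id)   # meta-llama/Meta-Llama-3.1-8B-Instruct
```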