Getting Started Tutorial
End-to-End tutorial for LiteLLM Proxy to:
- Add an Azure OpenAI model
- Make a successful /chat/completion call
- Generate a virtual key
- Set RPM limit on virtual key
Pre-Requisites
Install the LiteLLM Docker image OR the LiteLLM CLI (pip package).
- Docker:
docker pull ghcr.io/berriai/litellm:main-latest
- LiteLLM CLI (pip package):
$ pip install 'litellm[proxy]'
Use this docker compose to spin up the proxy with a postgres database running locally.
# Get the docker compose file
curl -O https://raw.githubusercontent.com/BerriAI/litellm/main/docker-compose.yml
# Add the master key - you can change this after setup
echo 'LITELLM_MASTER_KEY="sk-1234"' > .env
# Add the litellm salt key - you cannot change this after adding a model
# It is used to encrypt / decrypt your LLM API Key credentials
# We recommend using https://1password.com/password-generator/
# to generate a random hash for the litellm salt key
echo 'LITELLM_SALT_KEY="sk-1234"' >> .env
source .env
# Start
docker compose up
1. Add a model
Control LiteLLM Proxy with a config.yaml file.
Set up your config.yaml with your Azure model.
Note: When using the proxy with a database, you can also add models via the UI (available on the /ui route).
model_list:
- model_name: gpt-4o
litellm_params:
model: azure/my_azure_deployment
api_base: os.environ/AZURE_API_BASE
api_key: "os.environ/AZURE_API_KEY"
api_version: "2025-01-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
Model List Specification
You can read more about how model resolution works in the Model Configuration section.
- model_name (str) - This field should contain the name of the model as received.
- litellm_params (dict) - See All LiteLLM Params.
  - model (str) - Specifies the model name to be sent to litellm.acompletion / litellm.aembedding, etc. This is the identifier used by LiteLLM to route to the correct model + provider logic on the backend.
  - api_key (str) - The API key required for authentication. It can be retrieved from an environment variable using os.environ/ (see the sketch just after this list).
  - api_base (str) - The API base for your Azure deployment.
  - api_version (str) - The API version to use when calling Azure's OpenAI API. Get the latest Inference API version here.
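For instance, the os.environ/ prefix tells the proxy to read a value from an environment variable instead of hard-coding it in config.yaml. Conceptually, the resolution works like this (a minimal sketch of the rule, not LiteLLM's internal code):
import os

# A config value like "os.environ/AZURE_API_KEY" is a reference,
# not a literal key: the proxy reads the named environment variable.
value = "os.environ/AZURE_API_KEY"
if value.startswith("os.environ/"):
    api_key = os.environ[value.removeprefix("os.environ/")]  # -> your real Azure key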
Useful Links
- All Supported LLM API Providers (OpenAI/Bedrock/Vertex/etc.)
- Full Config.Yaml Spec
- Pass provider-specific params
2. Make a successful /chat/completion call
LiteLLM Proxy is 100% OpenAI-compatible. Test your Azure model via the /chat/completions route.
2.1 Start Proxy
Save your config.yaml from step 1 as litellm_config.yaml.
- Docker:
docker run \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-e AZURE_API_KEY=d6*********** \
-e AZURE_API_BASE=https://openai-***********/ \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-latest \
--config /app/config.yaml --detailed_debug
# RUNNING on http://0.0.0.0:4000
- LiteLLM CLI (pip package):
$ litellm --config /app/config.yaml --detailed_debug
Confirm your config.yaml was mounted correctly:
Loaded config YAML (api_key and environment_variables are not shown):
{
"model_list": [
{
"model_name ...
2.2 Make Call
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-1234' \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are an LLM named gpt-4o"
},
{
"role": "user",
"content": "what is your name?"
}
]
}'
Expected Response
{
"id": "chatcmpl-BcO8tRQmQV6Dfw6onqMufxPkLLkA8",
"created": 1748488967,
"model": "gpt-4o-2024-11-20",
"object": "chat.completion",
"system_fingerprint": "fp_ee1d74bde0",
"choices": [
{
"finish_reason": "stop",
"index": 0,
"message": {
"content": "My name is **gpt-4o**! How can I assist you today?",
"role": "assistant",
"tool_calls": null,
"function_call": null,
"annotations": []
}
}
],
"usage": {
"completion_tokens": 19,
"prompt_tokens": 28,
"total_tokens": 47,
"completion_tokens_details": {
"accepted_prediction_tokens": 0,
"audio_tokens": 0,
"reasoning_tokens": 0,
"rejected_prediction_tokens": 0
},
"prompt_tokens_details": {
"audio_tokens": 0,
"cached_tokens": 0
}
},
"service_tier": null,
"prompt_filter_results": [
{
"prompt_index": 0,
"content_filter_results": {
"hate": {
"filtered": false,
"severity": "safe"
},
"self_harm": {
"filtered": false,
"severity": "safe"
},
"sexual": {
"filtered": false,
"severity": "safe"
},
"violence": {
"filtered": false,
"severity": "safe"
}
}
}
]
}
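Since the proxy is OpenAI-compatible, the same request also works through the OpenAI Python SDK. A minimal sketch, assuming the proxy from step 2.1 is running locally with the example master key sk-1234:
from openai import OpenAI

# Point the SDK at the proxy instead of api.openai.com.
client = OpenAI(
    base_url="http://0.0.0.0:4000",
    api_key="sk-1234",  # proxy key (master key here; a virtual key also works)
)

response = client.chat.completions.create(
    model="gpt-4o",  # the model_name alias from config.yaml
    messages=[
        {"role": "system", "content": "You are an LLM named gpt-4o"},
        {"role": "user", "content": "what is your name?"},
    ],
)
print(response.choices[0].message.content)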
Useful Links
- All Supported LLM API Providers (OpenAI/Bedrock/Vertex/etc.)
- Call LiteLLM Proxy via OpenAI SDK, Langchain, etc.
- All API Endpoints Swagger
- Other/Non-Chat Completion Endpoints
- Pass-through for VertexAI, Bedrock, etc.
3. Generate a virtual key
Track spend and control model access via virtual keys for the proxy.
3.1 Set up a Database
Requirements: a Postgres database (used by LiteLLM for keys, users, and teams).
model_list:
- model_name: gpt-4o
litellm_params:
model: azure/my_azure_deployment
api_base: os.environ/AZURE_API_BASE
api_key: "os.environ/AZURE_API_KEY"
api_version: "2025-01-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
general_settings:
master_key: sk-1234
database_url: "postgresql://<user>:<password>@<host>:<port>/<dbname>" # 👈 KEY CHANGE
Save config.yaml as litellm_config.yaml (used in 3.2).
What is general_settings?
These are settings for the LiteLLM Proxy Server.
See All General Settings here.
- master_key (str)
  - Description:
    - Set a master key; this is your Proxy Admin key - you can use it to create other keys (🚨 must start with sk-).
  - Usage:
    - Set on config.yaml: set your master key under general_settings:master_key, example - master_key: sk-1234
    - Set env variable: set LITELLM_MASTER_KEY
- database_url (str)
  - Description:
    - Set a database_url; this is the connection to your Postgres DB, which is used by litellm for generating keys, users, and teams.
  - Usage:
    - Set on config.yaml: set your database_url under general_settings:database_url, example - database_url: "postgresql://..."
    - Set env variable: DATABASE_URL=postgresql://<user>:<password>@<host>:<port>/<dbname>
3.2 Start Proxy
docker run \
-v $(pwd)/litellm_config.yaml:/app/config.yaml \
-e AZURE_API_KEY=d6*********** \
-e AZURE_API_BASE=https://openai-***********/ \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-latest \
--config /app/config.yaml --detailed_debug
3.3 Create Key w/ RPM Limit
Create a key with rpm_limit: 1. This will allow only 1 request per minute for calls to the proxy with this key.
curl -L -X POST 'http://0.0.0.0:4000/key/generate' \
-H 'Authorization: Bearer sk-1234' \
-H 'Content-Type: application/json' \
-d '{
"rpm_limit": 1
}'
Expected Response
{
"key": "sk-12..."
}
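The same key generation from Python, as a minimal sketch using the requests library (endpoint and master key as above):
import requests

# Generate a virtual key limited to 1 request per minute.
resp = requests.post(
    "http://0.0.0.0:4000/key/generate",
    headers={"Authorization": "Bearer sk-1234"},  # master key
    json={"rpm_limit": 1},
)
virtual_key = resp.json()["key"]  # e.g. "sk-12..."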
3.4 Test it!
Use your virtual key from step 3.3.
1st call - Expect to work!
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-12...' \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor. Guide the user through the solution step by step."
},
{
"role": "user",
"content": "how can I solve 8x + 7 = -23"
}
]
}'
Expected Response
{
"id": "chatcmpl-2076f062-3095-4052-a520-7c321c115c68",
"choices": [
...
}
2nd call - Expect to fail!
Why did this call fail?
We set the virtual key's requests per minute (RPM) limit to 1, and this second request crosses it.
curl -X POST 'http://0.0.0.0:4000/chat/completions' \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer sk-12...' \
-d '{
"model": "gpt-4o",
"messages": [
{
"role": "system",
"content": "You are a helpful math tutor. Guide the user through the solution step by step."
},
{
"role": "user",
"content": "how can I solve 8x + 7 = -23"
}
]
}'
Expected Response
{
"error": {
"message": "LiteLLM Rate Limit Handler for rate limit type = key. Crossed TPM / RPM / Max Parallel Request Limit. current rpm: 1, rpm limit: 1, current tpm: 348, tpm limit: 9223372036854775807, current max_parallel_requests: 0, max_parallel_requests: 9223372036854775807",
"type": "None",
"param": "None",
"code": "429"
}
}
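If you call the proxy through the OpenAI Python SDK, this 429 surfaces as a RateLimitError. A minimal sketch of catching it, assuming the virtual key from step 3.3:
import openai

client = openai.OpenAI(
    base_url="http://0.0.0.0:4000",
    api_key="sk-12...",  # virtual key from step 3.3
)

try:
    client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "how can I solve 8x + 7 = -23"}],
    )
except openai.RateLimitError as err:
    # Raised once the key's rpm_limit of 1 is crossed within the minute.
    print("Rate limited:", err)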
Useful Links
- Creating Virtual Keys
- Key Management API Endpoints Swagger
- Set Budgets / Rate Limits per key/user/teams
- Dynamic TPM/RPM Limits for keys
Key Concepts
This section explains key concepts of the LiteLLM AI Gateway.
Understanding Model Configuration
For this config.yaml example:
model_list:
- model_name: gpt-4o
litellm_params:
model: azure/my_azure_deployment
api_base: os.environ/AZURE_API_BASE
api_key: "os.environ/AZURE_API_KEY"
api_version: "2025-01-01-preview" # [OPTIONAL] litellm uses the latest azure api_version by default
How Model Resolution Works:
Client Request               LiteLLM Proxy                        Provider API
──────────────               ─────────────                        ────────────
POST /chat/completions
{                            1. Looks up model_name
  "model": "gpt-4o"  ──────▶    in config.yaml
  ...
}                            2. Finds matching entry:
                                model_name: gpt-4o
                             3. Extracts litellm_params:
                                model: azure/my_azure_deployment
                                api_base: https://...
                                api_key: sk-...
                             4. Routes to provider ──▶            Azure OpenAI API
                                                                  POST /deployments/my_azure_deployment/...
Breaking Down the model Parameter under litellm_params:
model_list:
  - model_name: gpt-4o                    # What the client calls
    litellm_params:
      model: azure/my_azure_deployment    # <provider>/<model-name>
             └─┬─┘ └────────┬──────────┘
               │            │
               │            └──────▶ Model name sent to the provider API
               │
               └───────────────────▶ Provider that LiteLLM routes to
Visual Breakdown:
model: azure/my_azure_deployment
       └─┬─┘ └────────┬──────────┘
         │            │
         │            └──────▶ The actual model identifier that gets sent to Azure
         │                     (e.g., your deployment name, or the model name)
         │
         └───────────────────▶ Tells LiteLLM which provider to use
                               (azure, openai, anthropic, bedrock, etc.)
Key Concepts:
- model_name: The alias your client uses to call the model. This is what you send in your API requests (e.g., gpt-4o).
- model (in litellm_params): Format is <provider>/<model-identifier>
  - Provider (before /): Routes to the correct LLM provider (e.g., azure, openai, anthropic, bedrock)
  - Model identifier (after /): The actual model/deployment name sent to that provider's API
Advanced Configuration Examples:
For custom OpenAI-compatible endpoints (e.g., vLLM, Ollama, custom deployments):
model_list:
- model_name: my-custom-model
litellm_params:
model: openai/nvidia/llama-3.2-nv-embedqa-1b-v2
api_base: http://my-service.svc.cluster.local:8000/v1
api_key: "sk-1234"
Breaking down complex model paths:
model: openai/nvidia/llama-3.2-nv-embedqa-1b-v2
       └──┬──┘└───────────────┬───────────────┘
          │                   │
          │                   └──────▶ Full model string sent to the provider API
          │                            (in this case: "nvidia/llama-3.2-nv-embedqa-1b-v2")
          │
          └──────────────────────────▶ Provider (openai = OpenAI-compatible API)
The key point: Everything after the first / is passed as-is to the provider's API.
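As a tiny illustration of that rule (not LiteLLM's actual parsing code, just the split it implies):
# Only the first "/" separates the provider from the model identifier.
model = "openai/nvidia/llama-3.2-nv-embedqa-1b-v2"
provider, model_id = model.split("/", 1)
# provider -> "openai" (picks the OpenAI-compatible route)
# model_id -> "nvidia/llama-3.2-nv-embedqa-1b-v2" (sent as-is to the API)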
Common Patterns:
model_list:
# Azure deployment
- model_name: gpt-4
litellm_params:
model: azure/gpt-4-deployment
api_base: https://my-azure.openai.azure.com
# OpenAI
- model_name: gpt-4
litellm_params:
model: openai/gpt-4
api_key: os.environ/OPENAI_API_KEY
# Custom OpenAI-compatible endpoint
- model_name: my-llama-model
litellm_params:
model: openai/meta/llama-3-8b
api_base: http://my-vllm-server:8000/v1
api_key: "optional-key"
# Bedrock
- model_name: claude-3
litellm_params:
model: bedrock/anthropic.claude-3-sonnet-20240229-v1:0
aws_region_name: us-east-1
Troubleshooting
Non-root docker image?
If you need to run the docker image as a non-root user, use this.
SSL Verification Issue / Connection Error
If you see
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1006)
OR
Connection Error.
You can disable ssl verification with:
model_list:
- model_name: gpt-4o
litellm_params:
model: azure/my_azure_deployment
api_base: os.environ/AZURE_API_BASE
api_key: "os.environ/AZURE_API_KEY"
api_version: "2025-01-01-preview"
litellm_settings:
ssl_verify: false # 👈 KEY CHANGE
(DB) All connection attempts failed
If you see:
httpx.ConnectError: All connection attempts failed
ERROR: Application startup failed. Exiting.
3:21:43 - LiteLLM Proxy:ERROR: utils.py:2207 - Error getting LiteLLM_SpendLogs row count: All connection attempts failed
This might be a DB permission issue.
- Validate your DB user's permissions
Try creating a new database.
STATEMENT: CREATE DATABASE "litellm"
If you get:
ERROR: permission denied to create
This indicates you have a permission issue.
- Grant permissions to your DB user
It should look something like this:
psql -U postgres
CREATE DATABASE litellm;
On CloudSQL, this is:
GRANT ALL PRIVILEGES ON DATABASE litellm TO your_username;
What is litellm_settings?
LiteLLM Proxy uses the LiteLLM Python SDK for handling LLM API calls.
litellm_settings are module-level params for the LiteLLM Python SDK (equivalent to setting litellm.<some_param> on the SDK). You can see all params here.
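For example, the ssl_verify setting used in the troubleshooting section above corresponds to the module-level SDK param; a sketch of the equivalence:
import litellm

# config.yaml:
#   litellm_settings:
#     ssl_verify: false
# is equivalent to setting the module-level param on the SDK:
litellm.ssl_verify = False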
Support & Talk with founders
- Our emails ✉️ ishaan@berri.ai / krrish@berri.ai